mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-11-16 08:44:21 +08:00
5dcc0df493
17264 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Luiz Augusto von Dentz
|
18a4c59af6 |
Bluetooth: hci_sync: Fix overwriting request callback
[ Upstream commit |
||
Luiz Augusto von Dentz
|
b53e5ef62f |
Bluetooth: hci_core: Cancel request on command timeout
[ Upstream commit |
||
Luiz Augusto von Dentz
|
54db3630de |
Bluetooth: Remove BT_HS
[ Upstream commit |
||
Jonas Dreßler
|
96958d8164 |
Bluetooth: Remove HCI_POWER_OFF_TIMEOUT
[ Upstream commit |
||
Arnd Bergmann
|
eb2c11b27c |
net: bql: fix building with BQL disabled
It is now possible to disable BQL, but that causes the cpsw driver to break:
drivers/net/ethernet/ti/am65-cpsw-nuss.c:297:28: error: no member named 'dql' in 'struct netdev_queue'
297 | dql_avail(&netif_txq->dql),
There is already a helper function in net/sch_generic.h that could
be used to help here. Move its implementation into the common
linux/netdevice.h along with the other bql interfaces and change
both users over to the new interface.
Fixes:
|
||
Jeremy Kerr
|
3773d65ae5 |
net: mctp: take ownership of skb in mctp_local_output
Currently, mctp_local_output only takes ownership of skb on success, and
we may leak an skb if mctp_local_output fails in specific states; the
skb ownership isn't transferred until the actual output routing occurs.
Instead, make mctp_local_output free the skb on all error paths up to
the route action, so it always consumes the passed skb.
Fixes:
|
||
Pablo Neira Ayuso
|
9e0f043038 |
netfilter: nft_flow_offload: reset dst in route object after setting up flow
dst is transferred to the flow object, route object does not own it
anymore. Reset dst in route object, otherwise if flow_offload_add()
fails, error path releases dst twice, leading to a refcount underflow.
Fixes:
|
||
Paolo Abeni
|
b8adb69a7d |
mptcp: fix lockless access in subflow ULP diag
Since the introduction of the subflow ULP diag interface, the
dump callback accessed all the subflow data with lockless.
We need either to annotate all the read and write operation accordingly,
or acquire the subflow socket lock. Let's do latter, even if slower, to
avoid a diffstat havoc.
Fixes:
|
||
Tobias Waldekranz
|
dc489f8625 |
net: bridge: switchdev: Skip MDB replays of deferred events on offload
Before this change, generation of the list of MDB events to replay
would race against the creation of new group memberships, either from
the IGMP/MLD snooping logic or from user configuration.
While new memberships are immediately visible to walkers of
br->mdb_list, the notification of their existence to switchdev event
subscribers is deferred until a later point in time. So if a replay
list was generated during a time that overlapped with such a window,
it would also contain a replay of the not-yet-delivered event.
The driver would thus receive two copies of what the bridge internally
considered to be one single event. On destruction of the bridge, only
a single membership deletion event was therefore sent. As a
consequence of this, drivers which reference count memberships (at
least DSA), would be left with orphan groups in their hardware
database when the bridge was destroyed.
This is only an issue when replaying additions. While deletion events
may still be pending on the deferred queue, they will already have
been removed from br->mdb_list, so no duplicates can be generated in
that scenario.
To a user this meant that old group memberships, from a bridge in
which a port was previously attached, could be reanimated (in
hardware) when the port joined a new bridge, without the new bridge's
knowledge.
For example, on an mv88e6xxx system, create a snooping bridge and
immediately add a port to it:
root@infix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \
> ip link set dev x3 up master br0
And then destroy the bridge:
root@infix-06-0b-00:~$ ip link del dev br0
root@infix-06-0b-00:~$ mvls atu
ADDRESS FID STATE Q F 0 1 2 3 4 5 6 7 8 9 a
DEV:0 Marvell 88E6393X
33:33:00:00:00:6a 1 static - - 0 . . . . . . . . . .
33:33:ff:87:e4:3f 1 static - - 0 . . . . . . . . . .
ff:ff:ff:ff:ff:ff 1 static - - 0 1 2 3 4 5 6 7 8 9 a
root@infix-06-0b-00:~$
The two IPv6 groups remain in the hardware database because the
port (x3) is notified of the host's membership twice: once via the
original event and once via a replay. Since only a single delete
notification is sent, the count remains at 1 when the bridge is
destroyed.
Then add the same port (or another port belonging to the same hardware
domain) to a new bridge, this time with snooping disabled:
root@infix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \
> ip link set dev x3 up master br1
All multicast, including the two IPv6 groups from br0, should now be
flooded, according to the policy of br1. But instead the old
memberships are still active in the hardware database, causing the
switch to only forward traffic to those groups towards the CPU (port
0).
Eliminate the race in two steps:
1. Grab the write-side lock of the MDB while generating the replay
list.
This prevents new memberships from showing up while we are generating
the replay list. But it leaves the scenario in which a deferred event
was already generated, but not delivered, before we grabbed the
lock. Therefore:
2. Make sure that no deferred version of a replay event is already
enqueued to the switchdev deferred queue, before adding it to the
replay list, when replaying additions.
Fixes:
|
||
Jakub Kicinski
|
aec7961916 |
tls: fix race between async notify and socket close
The submitting thread (one which called recvmsg/sendmsg)
may exit as soon as the async crypto handler calls complete()
so any code past that point risks touching already freed data.
Try to avoid the locking and extra flags altogether.
Have the main thread hold an extra reference, this way
we can depend solely on the atomic ref counter for
synchronization.
Don't futz with reiniting the completion, either, we are now
tightly controlling when completion fires.
Reported-by: valis <sec@valis.email>
Fixes:
|
||
Paolo Abeni
|
63e4b9d693 |
netfilter pull request 24-02-08
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmXEuicACgkQ1V2XiooU IOSbvA/9F2BC9TYKAh23/0EFbD4jOl4e26YE4E+Eu8AteoQ/nD+oI+mtWgw2hVXg zXvm1vfIc02jGuGfcPZ+EIv/dkznnDqqUpUGa4ixtgvRw2bKkb2kKMlrFsjzsihj yabXydwhxYE9b4Ch2AmRyApTLRMocte1IJ3ci4YUXwf68wZlOe2bIG5wyzGkFpjF QZN/Rr14UKjC57EYNdUG9UdybWSqSKD23LPZSaLvi6wxoZd8cIcIkng5K4N0WVKF lNskuNFY+j+bJz2Yn3mWIlCoM3R1N2B04t7wRkYnKWkSuwymG3O7JC3RUQaZDBZw 8AogEbvXaIY3nxyN4lHZ/jzM/QzNB1WHlPx6RjWKHoNhnas+xuBYrjCdJZwtEu8g xs27Tjk3QtCIuaMuhN0RFqiq93MqZD/qx++kwMwJA0Wrg76MLPpf8yEWwVGYcAEG 0EWa61UfPezbcVkW8XveW6lgDfcOIOpBevxDQ3Nf7JB0AcbVBks7oDpGwDc5Pdz5 6y7WQIilxUtu9bHODUxrshxgTBwsocVkXUTIogCihUC+SgSZF+/G796c9Iy5/kPq BtmSNJOJyCbnivkqKTLF0Pv0BplOv7W1sx2/fo+IfRXYTHoXVjHe1BYP0Ck3WEtS 9EPsFlI5f4AOtnPF3JrTPec9PvuHyVN+8aOPi82wlKiayJcXy1I= =Rh2n -----END PGP SIGNATURE----- Merge tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Narrow down target/match revision to u8 in nft_compat. 2) Bail out with unused flags in nft_compat. 3) Restrict layer 4 protocol to u16 in nft_compat. 4) Remove static in pipapo get command that slipped through when reducing set memory footprint. 5) Follow up incremental fix for the ipset performance regression, this includes the missing gc cancellation, from Jozsef Kadlecsik. 6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0 as no filtering, from Felix Huettner. 7) Reject direction for NFT_CT_ID. 8) Use timestamp to check for set element expiration while transaction is handled to prevent garbage collection from removing set elements that were just added by this transaction. Packet path and netlink dump/get path still use current time to check for expiration. 9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal. 10) map_index needs to be percpu and per-set, not just percpu. At this time its possible for a pipapo set to fill the all-zero part with ones and take the 'might have bits set' as 'start-from-zero' area. From Florian Westphal. This includes three patches: - Change scratchpad area to a structure that provides space for a per-set-and-cpu toggle and uses it of the percpu one. - Add a new free helper to prepare for the next patch. - Remove the scratch_aligned pointer and makes AVX2 implementation use the exact same memory addresses for read/store of the matching state. netfilter pull request 24-02-08 * tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_set_pipapo: remove scratch_aligned pointer netfilter: nft_set_pipapo: add helper to release pcpu scratch area netfilter: nft_set_pipapo: store index in scratch maps netfilter: nft_set_rbtree: skip end interval element from gc netfilter: nfnetlink_queue: un-break NF_REPEAT netfilter: nf_tables: use timestamp to check for set element timeout netfilter: nft_ct: reject direction for ct id netfilter: ctnetlink: fix filtering for zone 0 netfilter: ipset: Missing gc cancellations fixed netfilter: nft_set_pipapo: remove static in nft_pipapo_get() netfilter: nft_compat: restrict match/target protocol to u16 netfilter: nft_compat: reject unused compat flag netfilter: nft_compat: narrow down revision to unsigned 8-bits ==================== Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Pablo Neira Ayuso
|
7395dfacff |
netfilter: nf_tables: use timestamp to check for set element timeout
Add a timestamp field at the beginning of the transaction, store it
in the nftables per-netns area.
Update set backend .insert, .deactivate and sync gc path to use the
timestamp, this avoids that an element expires while control plane
transaction is still unfinished.
.lookup and .update, which are used from packet path, still use the
current time to check if the element has expired. And .get path and dump
also since this runs lockless under rcu read size lock. Then, there is
async gc which also needs to check the current time since it runs
asynchronously from a workqueue.
Fixes:
|
||
Jakub Kicinski
|
335bac1daa |
wireless fixes for v6.8-rc4
This time we have unusually large wireless pull request. Several functionality fixes to both stack and iwlwifi. Lots of fixes to warnings, especially to MODULE_DESCRIPTION(). -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmXCAewRHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZtbOAf+PFH5X248YVlVgeamCDVYNsFO0bgsyBN1 xHDWxxxLFv5uKuRvjx5oa4C4tzRs7kH8h3u5SUqTkIvMcHbzpVe6oBNMDknOtS96 lQ+lmfuIlKMvfmdG3QK3GyAcn57detHZ0InLsM5OmOdcUexmyJa7qc8qbqKZvX3d jux3ISQbyHyisnWHGHwWgKIY0UtZiesAuf/Ug09Ah0WW6KMLgNSoDc9/mQlKoGu7 gZpmB1xsJaPBPB0Tb1kD5XrGouGnDqN9obS4HH9fCDZwPDBTLDjuYtrtQU9cqtk5 4EUhuoeefTuep8ZJQgxn05j4D5NWQj9VCEvhaRwoF+Iz+t2/HLHuKQ== =4UCw -----END PGP SIGNATURE----- Merge tag 'wireless-2024-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.8-rc4 This time we have unusually large wireless pull request. Several functionality fixes to both stack and iwlwifi. Lots of fixes to warnings, especially to MODULE_DESCRIPTION(). * tag 'wireless-2024-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (31 commits) wifi: mt76: mt7996: fix fortify warning wifi: brcmfmac: Adjust n_channels usage for __counted_by wifi: iwlwifi: do not announce EPCS support wifi: iwlwifi: exit eSR only after the FW does wifi: iwlwifi: mvm: fix a battery life regression wifi: mac80211: accept broadcast probe responses on 6 GHz wifi: mac80211: adding missing drv_mgd_complete_tx() call wifi: mac80211: fix waiting for beacons logic wifi: mac80211: fix unsolicited broadcast probe config wifi: mac80211: initialize SMPS mode correctly wifi: mac80211: fix driver debugfs for vif type change wifi: mac80211: set station RX-NSS on reconfig wifi: mac80211: fix RCU use in TDLS fast-xmit wifi: mac80211: improve CSA/ECSA connection refusal wifi: cfg80211: detect stuck ECSA element in probe resp wifi: iwlwifi: remove extra kernel-doc wifi: fill in MODULE_DESCRIPTION()s for mt76 drivers wifi: fill in MODULE_DESCRIPTION()s for wilc1000 wifi: fill in MODULE_DESCRIPTION()s for wl18xx wifi: fill in MODULE_DESCRIPTION()s for p54spi ... ==================== Link: https://lore.kernel.org/r/20240206095722.CD9D2C433F1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Johannes Berg
|
177fbbcb4e |
wifi: cfg80211: detect stuck ECSA element in probe resp
We recently added some validation that we don't try to connect to an AP that is currently in a channel switch process, since that might want the channel to be quiet or we might not be able to connect in time to hear the switching in a beacon. This was in commit |
||
Jakub Kicinski
|
93561eefbf |
netfilter pull request 24-01-31
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmW6zq0ACgkQ1V2XiooU IOTTbQ//SnsPif56pzc+CzOhb6v3Gih1o+4dyH1GX5JHozwmuvFx4Iw3jz5DgMsL vzQBb89htlGqdj3tXnuX0mVZEGDBr/KO8jNT7Qnxrms3oRfhu+UKBl8y51f73LRe 79vqKaMmmaUen/fzyhTqxyqQFYzguEAzXq1Rs/arTuNjhnKLf4/R255sQRTRUe0v 08KEk/7D3FnmJ0pUtoh3E7mukodl2e5w/UH9CrerTtJ04jaRi17I+QmW3TmVFjx4 8k9+echgiF7iKPLlPJRWhHgzXPKEwUupu8rDemIg+QXNsicp+2T8HNCmJjE3Y8Qx jcEpqZOMxmlnvBeoelrGqnJ1GVAWTgOhi2R2hLLYEyRJ8LpIxhcZLI6ovQTO2u3v HIuPdAMmcsLI1wX9Lq/ybF9TqrCH1l2M9LqVerz139tXgDgOerMZvjsHp8Xv7nQz L2HDA39EI+ZPEEh3SCdomrydM2gp3r3d1hp4SUZLMFUZYR/CEUYt8DlX+iHWGgn0 GaVKgmgf/PqokyhYxpihB4m/ctmghPcxkscPaEDJas4uKYqDRRB1c1btA2f7IVnr 8sMt6gSG0HQgK6OoD6Sl+nACt3EoPdg/NaivrJ1rmWmQpb6EhIqJdILMZ0+s+lyx WBJnI16uJQUPVifGUb30qJdm/Go0cix5fOVJRMAaIUU0IFgezkE= =OdAk -----END PGP SIGNATURE----- Merge tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) TCP conntrack now only evaluates window negotiation for packets in the REPLY direction, from Ryan Schaefer. Otherwise SYN retransmissions trigger incorrect window scale negotiation. From Ryan Schaefer. 2) Restrict tunnel objects to NFPROTO_NETDEV which is where it makes sense to use this object type. 3) Fix conntrack pick up from the middle of SCTP_CID_SHUTDOWN_ACK packets. From Xin Long. 4) Another attempt from Jozsef Kadlecsik to address the slow down of the swap command in ipset. 5) Replace a BUG_ON by WARN_ON_ONCE in nf_log, and consolidate check for the case that the logger is NULL from the read side lock section. 6) Address lack of sanitization for custom expectations. Restrict layer 3 and 4 families to what it is supported by userspace. * tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger netfilter: ipset: fix performance regression in swap operation netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV netfilter: conntrack: correct window scaling with retransmitted SYN ==================== Link: https://lore.kernel.org/r/20240131225943.7536-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Eric Dumazet
|
4d322dce82 |
af_unix: fix lockdep positive in sk_diag_dump_icons()
syzbot reported a lockdep splat [1].
Blamed commit hinted about the possible lockdep
violation, and code used unix_state_lock_nested()
in an attempt to silence lockdep.
It is not sufficient, because unix_state_lock_nested()
is already used from unix_state_double_lock().
We need to use a separate subclass.
This patch adds a distinct enumeration to make things
more explicit.
Also use swap() in unix_state_double_lock() as a clean up.
v2: add a missing inline keyword to unix_state_lock_nested()
[1]
WARNING: possible circular locking dependency detected
6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Not tainted
syz-executor.1/2542 is trying to acquire lock:
ffff88808b5df9e8 (rlock-AF_UNIX){+.+.}-{2:2}, at: skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
but task is already holding lock:
ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&u->lock/1){+.+.}-{2:2}:
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
sk_diag_dump_icons net/unix/diag.c:87 [inline]
sk_diag_fill+0x6ea/0xfe0 net/unix/diag.c:157
sk_diag_dump net/unix/diag.c:196 [inline]
unix_diag_dump+0x3e9/0x630 net/unix/diag.c:220
netlink_dump+0x5c1/0xcd0 net/netlink/af_netlink.c:2264
__netlink_dump_start+0x5d7/0x780 net/netlink/af_netlink.c:2370
netlink_dump_start include/linux/netlink.h:338 [inline]
unix_diag_handler_dump+0x1c3/0x8f0 net/unix/diag.c:319
sock_diag_rcv_msg+0xe3/0x400
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2543
sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
sock_write_iter+0x39a/0x520 net/socket.c:1160
call_write_iter include/linux/fs.h:2085 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xa74/0xca0 fs/read_write.c:590
ksys_write+0x1a0/0x2c0 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
-> #0 (rlock-AF_UNIX){+.+.}-{2:2}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
__do_sys_sendmmsg net/socket.c:2753 [inline]
__se_sys_sendmmsg net/socket.c:2750 [inline]
__x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&u->lock/1);
lock(rlock-AF_UNIX);
lock(&u->lock/1);
lock(rlock-AF_UNIX);
*** DEADLOCK ***
1 lock held by syz-executor.1/2542:
#0: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089
stack backtrace:
CPU: 1 PID: 2542 Comm: syz-executor.1 Not tainted 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
check_noncircular+0x366/0x490 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
__do_sys_sendmmsg net/socket.c:2753 [inline]
__se_sys_sendmmsg net/socket.c:2750 [inline]
__x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f26d887cda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f26d95a60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00007f26d89abf80 RCX: 00007f26d887cda9
RDX: 000000000000003e RSI: 00000000200bd000 RDI: 0000000000000004
RBP: 00007f26d88c947a R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000008c0 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f26d89abf80 R15: 00007ffcfe081a68
Fixes:
|
||
Pablo Neira Ayuso
|
776d451648 |
netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV
Bail out on using the tunnel dst template from other than netdev family.
Add the infrastructure to check for the family in objects.
Fixes:
|
||
Nicolas Dichtel
|
e622502c31 |
ipmr: fix kernel panic when forwarding mcast packets
The stacktrace was:
[ 86.305548] BUG: kernel NULL pointer dereference, address: 0000000000000092
[ 86.306815] #PF: supervisor read access in kernel mode
[ 86.307717] #PF: error_code(0x0000) - not-present page
[ 86.308624] PGD 0 P4D 0
[ 86.309091] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 86.309883] CPU: 2 PID: 3139 Comm: pimd Tainted: G U 6.8.0-6wind-knet #1
[ 86.311027] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
[ 86.312728] RIP: 0010:ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985)
[ 86.313399] Code: f9 1f 0f 87 85 03 00 00 48 8d 04 5b 48 8d 04 83 49 8d 44 c5 00 48 8b 40 70 48 39 c2 0f 84 d9 00 00 00 49 8b 46 58 48 83 e0 fe <80> b8 92 00 00 00 00 0f 84 55 ff ff ff 49 83 47 38 01 45 85 e4 0f
[ 86.316565] RSP: 0018:ffffad21c0583ae0 EFLAGS: 00010246
[ 86.317497] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 86.318596] RDX: ffff9559cb46c000 RSI: 0000000000000000 RDI: 0000000000000000
[ 86.319627] RBP: ffffad21c0583b30 R08: 0000000000000000 R09: 0000000000000000
[ 86.320650] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 86.321672] R13: ffff9559c093a000 R14: ffff9559cc00b800 R15: ffff9559c09c1d80
[ 86.322873] FS: 00007f85db661980(0000) GS:ffff955a79d00000(0000) knlGS:0000000000000000
[ 86.324291] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 86.325314] CR2: 0000000000000092 CR3: 000000002f13a000 CR4: 0000000000350ef0
[ 86.326589] Call Trace:
[ 86.327036] <TASK>
[ 86.327434] ? show_regs (/build/work/knet/arch/x86/kernel/dumpstack.c:479)
[ 86.328049] ? __die (/build/work/knet/arch/x86/kernel/dumpstack.c:421 /build/work/knet/arch/x86/kernel/dumpstack.c:434)
[ 86.328508] ? page_fault_oops (/build/work/knet/arch/x86/mm/fault.c:707)
[ 86.329107] ? do_user_addr_fault (/build/work/knet/arch/x86/mm/fault.c:1264)
[ 86.329756] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.330350] ? __irq_work_queue_local (/build/work/knet/kernel/irq_work.c:111 (discriminator 1))
[ 86.331013] ? exc_page_fault (/build/work/knet/./arch/x86/include/asm/paravirt.h:693 /build/work/knet/arch/x86/mm/fault.c:1515 /build/work/knet/arch/x86/mm/fault.c:1563)
[ 86.331702] ? asm_exc_page_fault (/build/work/knet/./arch/x86/include/asm/idtentry.h:570)
[ 86.332468] ? ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985)
[ 86.333183] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.333920] ipmr_mfc_add (/build/work/knet/./include/linux/rcupdate.h:782 /build/work/knet/net/ipv4/ipmr.c:1009 /build/work/knet/net/ipv4/ipmr.c:1273)
[ 86.334583] ? __pfx_ipmr_hash_cmp (/build/work/knet/net/ipv4/ipmr.c:363)
[ 86.335357] ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470)
[ 86.336135] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.336854] ? ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470)
[ 86.337679] do_ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:944)
[ 86.338408] ? __pfx_unix_stream_read_actor (/build/work/knet/net/unix/af_unix.c:2862)
[ 86.339232] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.339809] ? aa_sk_perm (/build/work/knet/security/apparmor/include/cred.h:153 /build/work/knet/security/apparmor/net.c:181)
[ 86.340342] ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:1415)
[ 86.340859] raw_setsockopt (/build/work/knet/net/ipv4/raw.c:836)
[ 86.341408] ? security_socket_setsockopt (/build/work/knet/security/security.c:4561 (discriminator 13))
[ 86.342116] sock_common_setsockopt (/build/work/knet/net/core/sock.c:3716)
[ 86.342747] do_sock_setsockopt (/build/work/knet/net/socket.c:2313)
[ 86.343363] __sys_setsockopt (/build/work/knet/./include/linux/file.h:32 /build/work/knet/net/socket.c:2336)
[ 86.344020] __x64_sys_setsockopt (/build/work/knet/net/socket.c:2340)
[ 86.344766] do_syscall_64 (/build/work/knet/arch/x86/entry/common.c:52 /build/work/knet/arch/x86/entry/common.c:83)
[ 86.345433] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.346161] ? syscall_exit_work (/build/work/knet/./include/linux/audit.h:357 /build/work/knet/kernel/entry/common.c:160)
[ 86.346938] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.347657] ? syscall_exit_to_user_mode (/build/work/knet/kernel/entry/common.c:215)
[ 86.348538] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223)
[ 86.349262] ? do_syscall_64 (/build/work/knet/./arch/x86/include/asm/cpufeature.h:171 /build/work/knet/arch/x86/entry/common.c:98)
[ 86.349971] entry_SYSCALL_64_after_hwframe (/build/work/knet/arch/x86/entry/entry_64.S:129)
The original packet in ipmr_cache_report() may be queued and then forwarded
with ip_mr_forward(). This last function has the assumption that the skb
dst is set.
After the below commit, the skb dst is dropped by ipv4_pktinfo_prepare(),
which causes the oops.
Fixes:
|
||
Paolo Abeni
|
fdf8e6d18c |
bpf-for-netdev
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZbIcmAAKCRDbK58LschI gw7OAP4xKTkfcHSZUWcIOWgmdCX41B1zUPdyH2w3woHVH8kKRgEApWisrUKZq30Q 7seUBxVNK1HXBS+egqqRT7Rgs4iFhQs= =iBo/ -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-01-25 The following pull-request contains BPF updates for your *net* tree. We've added 12 non-merge commits during the last 2 day(s) which contain a total of 13 files changed, 190 insertions(+), 91 deletions(-). The main changes are: 1) Fix bpf_xdp_adjust_tail() in context of XSK zero-copy drivers which support XDP multi-buffer. The former triggered a NULL pointer dereference upon shrinking, from Maciej Fijalkowski & Tirthendu Sarkar. 2) Fix a bug in riscv64 BPF JIT which emitted a wrong prologue and epilogue for struct_ops programs, from Pu Lehui. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue i40e: set xdp_rxq_info::frag_size xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers ice: remove redundant xdp_rxq_info registration i40e: handle multi-buffer packets that are shrunk by xdp prog ice: work on pre-XDP prog frag count xsk: fix usage of multi-buffer BPF helpers for ZC XDP xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags xsk: recycle buffer in case Rx queue was full riscv, bpf: Fix unpredictable kernel crash about RV64 struct_ops ==================== Link: https://lore.kernel.org/r/20240125084416.10876-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Maciej Fijalkowski
|
c5114710c8 |
xsk: fix usage of multi-buffer BPF helpers for ZC XDP
Currently when packet is shrunk via bpf_xdp_adjust_tail() and memory
type is set to MEM_TYPE_XSK_BUFF_POOL, null ptr dereference happens:
[1136314.192256] BUG: kernel NULL pointer dereference, address:
0000000000000034
[1136314.203943] #PF: supervisor read access in kernel mode
[1136314.213768] #PF: error_code(0x0000) - not-present page
[1136314.223550] PGD 0 P4D 0
[1136314.230684] Oops: 0000 [#1] PREEMPT SMP NOPTI
[1136314.239621] CPU: 8 PID: 54203 Comm: xdpsock Not tainted 6.6.0+ #257
[1136314.250469] Hardware name: Intel Corporation S2600WFT/S2600WFT,
BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[1136314.265615] RIP: 0010:__xdp_return+0x6c/0x210
[1136314.274653] Code: ad 00 48 8b 47 08 49 89 f8 a8 01 0f 85 9b 01 00 00 0f 1f 44 00 00 f0 41 ff 48 34 75 32 4c 89 c7 e9 79 cd 80 ff 83 fe 03 75 17 <f6> 41 34 01 0f 85 02 01 00 00 48 89 cf e9 22 cc 1e 00 e9 3d d2 86
[1136314.302907] RSP: 0018:ffffc900089f8db0 EFLAGS: 00010246
[1136314.312967] RAX: ffffc9003168aed0 RBX: ffff8881c3300000 RCX:
0000000000000000
[1136314.324953] RDX: 0000000000000000 RSI: 0000000000000003 RDI:
ffffc9003168c000
[1136314.336929] RBP: 0000000000000ae0 R08: 0000000000000002 R09:
0000000000010000
[1136314.348844] R10: ffffc9000e495000 R11: 0000000000000040 R12:
0000000000000001
[1136314.360706] R13: 0000000000000524 R14: ffffc9003168aec0 R15:
0000000000000001
[1136314.373298] FS: 00007f8df8bbcb80(0000) GS:ffff8897e0e00000(0000)
knlGS:0000000000000000
[1136314.386105] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1136314.396532] CR2: 0000000000000034 CR3: 00000001aa912002 CR4:
00000000007706f0
[1136314.408377] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[1136314.420173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[1136314.431890] PKRU: 55555554
[1136314.439143] Call Trace:
[1136314.446058] <IRQ>
[1136314.452465] ? __die+0x20/0x70
[1136314.459881] ? page_fault_oops+0x15b/0x440
[1136314.468305] ? exc_page_fault+0x6a/0x150
[1136314.476491] ? asm_exc_page_fault+0x22/0x30
[1136314.484927] ? __xdp_return+0x6c/0x210
[1136314.492863] bpf_xdp_adjust_tail+0x155/0x1d0
[1136314.501269] bpf_prog_ccc47ae29d3b6570_xdp_sock_prog+0x15/0x60
[1136314.511263] ice_clean_rx_irq_zc+0x206/0xc60 [ice]
[1136314.520222] ? ice_xmit_zc+0x6e/0x150 [ice]
[1136314.528506] ice_napi_poll+0x467/0x670 [ice]
[1136314.536858] ? ttwu_do_activate.constprop.0+0x8f/0x1a0
[1136314.546010] __napi_poll+0x29/0x1b0
[1136314.553462] net_rx_action+0x133/0x270
[1136314.561619] __do_softirq+0xbe/0x28e
[1136314.569303] do_softirq+0x3f/0x60
This comes from __xdp_return() call with xdp_buff argument passed as
NULL which is supposed to be consumed by xsk_buff_free() call.
To address this properly, in ZC case, a node that represents the frag
being removed has to be pulled out of xskb_list. Introduce
appropriate xsk helpers to do such node operation and use them
accordingly within bpf_xdp_adjust_tail().
Fixes:
|
||
Maciej Fijalkowski
|
f7f6aa8e24 |
xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is used by drivers to notify data path whether xdp_buff contains fragments or not. Data path looks up mentioned flag on first buffer that occupies the linear part of xdp_buff, so drivers only modify it there. This is sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on stack or it resides within struct representing driver's queue and fragments are carried via skb_frag_t structs. IOW, we are dealing with only one xdp_buff. ZC mode though relies on list of xdp_buff structs that is carried via xsk_buff_pool::xskb_list, so ZC data path has to make sure that fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise, xsk_buff_free() could misbehave if it would be executed against xdp_buff that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can take place when within supplied XDP program bpf_xdp_adjust_tail() is used with negative offset that would in turn release the tail fragment from multi-buffer frame. Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would result in releasing all the nodes from xskb_list that were produced by driver before XDP program execution, which is not what is intended - only tail fragment should be deleted from xskb_list and then it should be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never make it up to user space, so from AF_XDP application POV there would be no traffic running, however due to free_list getting constantly new nodes, driver will be able to feed HW Rx queue with recycled buffers. Bottom line is that instead of traffic being redirected to user space, it would be continuously dropped. To fix this, let us clear the mentioned flag on xsk_buff_pool side during xdp_buff initialization, which is what should have been done right from the start of XSK multi-buffer support. Fixes: |
||
George Guo
|
b253d87fd7 |
netfilter: nf_tables: cleanup documentation
- Correct comments for nlpid, family, udlen and udata in struct nft_table, and afinfo is no longer a member of enum nft_set_class. - Add comment for data in struct nft_set_elem. - Add comment for flags in struct nft_ctx. - Add comments for timeout in struct nft_set_iter, and flags is not a member of struct nft_set_iter, remove the comment for it. - Add comments for commit, abort, estimate and gc_init in struct nft_set_ops. - Add comments for pending_update, num_exprs, exprs and catchall_list in struct nft_set. - Add comment for ext_len in struct nft_set_ext_tmpl. - Add comment for inner_ops in struct nft_expr_type. - Add comments for clone, destroy_clone, reduce, gc, offload, offload_action, offload_stats in struct nft_expr_ops. - Add comments for blob_gen_0, blob_gen_1, bound, genmask, udlen, udata, blob_next in struct nft_chain. - Add comment for flags in struct nft_base_chain. - Add comments for udlen, udata in struct nft_object. - Add comment for type in struct nft_object_ops. - Add comment for hook_list in struct nft_flowtable, and remove comments for dev_name and ops which are not members of struct nft_flowtable. Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> |
||
Ido Schimmel
|
32f2a0afa9 |
net/sched: flower: Fix chain template offload
When a qdisc is deleted from a net device the stack instructs the
underlying driver to remove its flow offload callback from the
associated filter block using the 'FLOW_BLOCK_UNBIND' command. The stack
then continues to replay the removal of the filters in the block for
this driver by iterating over the chains in the block and invoking the
'reoffload' operation of the classifier being used. In turn, the
classifier in its 'reoffload' operation prepares and emits a
'FLOW_CLS_DESTROY' command for each filter.
However, the stack does not do the same for chain templates and the
underlying driver never receives a 'FLOW_CLS_TMPLT_DESTROY' command when
a qdisc is deleted. This results in a memory leak [1] which can be
reproduced using [2].
Fix by introducing a 'tmplt_reoffload' operation and have the stack
invoke it with the appropriate arguments as part of the replay.
Implement the operation in the sole classifier that supports chain
templates (flower) by emitting the 'FLOW_CLS_TMPLT_{CREATE,DESTROY}'
command based on whether a flow offload callback is being bound to a
filter block or being unbound from one.
As far as I can tell, the issue happens since cited commit which
reordered tcf_block_offload_unbind() before tcf_block_flush_all_chains()
in __tcf_block_put(). The order cannot be reversed as the filter block
is expected to be freed after flushing all the chains.
[1]
unreferenced object 0xffff888107e28800 (size 2048):
comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
hex dump (first 32 bytes):
b1 a6 7c 11 81 88 ff ff e0 5b b3 10 81 88 ff ff ..|......[......
01 00 00 00 00 00 00 00 e0 aa b0 84 ff ff ff ff ................
backtrace:
[<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
[<ffffffff81ab374e>] __kmalloc+0x4e/0x90
[<ffffffff832aec6d>] mlxsw_sp_acl_ruleset_get+0x34d/0x7a0
[<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
[<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
[<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
[<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
[<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
[<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
[<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
[<ffffffff83ac6270>] netlink_unicast+0x540/0x820
[<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
[<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80
[<ffffffff8379d29a>] ___sys_sendmsg+0x13a/0x1e0
[<ffffffff8379d50c>] __sys_sendmsg+0x11c/0x1f0
[<ffffffff843b9ce0>] do_syscall_64+0x40/0xe0
unreferenced object 0xffff88816d2c0400 (size 1024):
comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
hex dump (first 32 bytes):
40 00 00 00 00 00 00 00 57 f6 38 be 00 00 00 00 @.......W.8.....
10 04 2c 6d 81 88 ff ff 10 04 2c 6d 81 88 ff ff ..,m......,m....
backtrace:
[<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
[<ffffffff81ab36c1>] __kmalloc_node+0x51/0x90
[<ffffffff81a8ed96>] kvmalloc_node+0xa6/0x1f0
[<ffffffff82827d03>] bucket_table_alloc.isra.0+0x83/0x460
[<ffffffff82828d2b>] rhashtable_init+0x43b/0x7c0
[<ffffffff832aed48>] mlxsw_sp_acl_ruleset_get+0x428/0x7a0
[<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
[<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
[<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
[<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
[<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
[<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
[<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
[<ffffffff83ac6270>] netlink_unicast+0x540/0x820
[<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
[<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80
[2]
# tc qdisc add dev swp1 clsact
# tc chain add dev swp1 ingress proto ip chain 1 flower dst_ip 0.0.0.0/32
# tc qdisc del dev swp1 clsact
# devlink dev reload pci/0000:06:00.0
Fixes:
|
||
Eric Dumazet
|
a54d51fb2d |
udp: fix busy polling
Generic sk_busy_loop_end() only looks at sk->sk_receive_queue
for presence of packets.
Problem is that for UDP sockets after blamed commit, some packets
could be present in another queue: udp_sk(sk)->reader_queue
In some cases, a busy poller could spin until timeout expiration,
even if some packets are available in udp_sk(sk)->reader_queue.
v3: - make sk_busy_loop_end() nicer (Willem)
v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats.
- add a sk_is_inet() check in sk_is_udp() (Willem feedback)
- add a sk_is_inet() check in sk_is_tcp().
Fixes:
|
||
Kuniyuki Iwashima
|
e3f9bed9be |
llc: Drop support for ETH_P_TR_802_2.
syzbot reported an uninit-value bug below. [0] llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2 (0x0011), and syzbot abused the latter to trigger the bug. write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16) llc_conn_handler() initialises local variables {saddr,daddr}.mac based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes them to __llc_lookup(). However, the initialisation is done only when skb->protocol is htons(ETH_P_802_2), otherwise, __llc_lookup_established() and __llc_lookup_listener() will read garbage. The missing initialisation existed prior to commit |
||
Zhengchao Shao
|
198bc90e0e |
tcp: make sure init the accept_queue's spinlocks once
When I run syz's reproduction C program locally, it causes the following issue: pvqspinlock: lock 0xffff9d181cd5c660 has corrupted value 0x0! WARNING: CPU: 19 PID: 21160 at __pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508) Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508) Code: 73 56 3a ff 90 c3 cc cc cc cc 8b 05 bb 1f 48 01 85 c0 74 05 c3 cc cc cc cc 8b 17 48 89 fe 48 c7 c7 30 20 ce 8f e8 ad 56 42 ff <0f> 0b c3 cc cc cc cc 0f 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffa8d200604cb8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9d1ef60e0908 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9d1ef60e0900 RBP: ffff9d181cd5c280 R08: 0000000000000000 R09: 00000000ffff7fff R10: ffffa8d200604b68 R11: ffffffff907dcdc8 R12: 0000000000000000 R13: ffff9d181cd5c660 R14: ffff9d1813a3f330 R15: 0000000000001000 FS: 00007fa110184640(0000) GS:ffff9d1ef60c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000011f65e000 CR4: 00000000000006f0 Call Trace: <IRQ> _raw_spin_unlock (kernel/locking/spinlock.c:186) inet_csk_reqsk_queue_add (net/ipv4/inet_connection_sock.c:1321) inet_csk_complete_hashdance (net/ipv4/inet_connection_sock.c:1358) tcp_check_req (net/ipv4/tcp_minisocks.c:868) tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2260) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) ip_local_deliver_finish (net/ipv4/ip_input.c:234) __netif_receive_skb_one_core (net/core/dev.c:5529) process_backlog (./include/linux/rcupdate.h:779) __napi_poll (net/core/dev.c:6533) net_rx_action (net/core/dev.c:6604) __do_softirq (./arch/x86/include/asm/jump_label.h:27) do_softirq (kernel/softirq.c:454 kernel/softirq.c:441) </IRQ> <TASK> __local_bh_enable_ip (kernel/softirq.c:381) __dev_queue_xmit (net/core/dev.c:4374) ip_finish_output2 (./include/net/neighbour.h:540 net/ipv4/ip_output.c:235) __ip_queue_xmit (net/ipv4/ip_output.c:535) __tcp_transmit_skb (net/ipv4/tcp_output.c:1462) tcp_rcv_synsent_state_process (net/ipv4/tcp_input.c:6469) tcp_rcv_state_process (net/ipv4/tcp_input.c:6657) tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1929) __release_sock (./include/net/sock.h:1121 net/core/sock.c:2968) release_sock (net/core/sock.c:3536) inet_wait_for_connect (net/ipv4/af_inet.c:609) __inet_stream_connect (net/ipv4/af_inet.c:702) inet_stream_connect (net/ipv4/af_inet.c:748) __sys_connect (./include/linux/file.h:45 net/socket.c:2064) __x64_sys_connect (net/socket.c:2073 net/socket.c:2070 net/socket.c:2070) do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) RIP: 0033:0x7fa10ff05a3d Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48 RSP: 002b:00007fa110183de8 EFLAGS: 00000202 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020000054 RCX: 00007fa10ff05a3d RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003 RBP: 00007fa110183e20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fa110184640 R13: 0000000000000000 R14: 00007fa10fe8b060 R15: 00007fff73e23b20 </TASK> The issue triggering process is analyzed as follows: Thread A Thread B tcp_v4_rcv //receive ack TCP packet inet_shutdown tcp_check_req tcp_disconnect //disconnect sock ... tcp_set_state(sk, TCP_CLOSE) inet_csk_complete_hashdance ... inet_csk_reqsk_queue_add inet_listen //start listen spin_lock(&queue->rskq_lock) inet_csk_listen_start ... reqsk_queue_alloc ... spin_lock_init spin_unlock(&queue->rskq_lock) //warning When the socket receives the ACK packet during the three-way handshake, it will hold spinlock. And then the user actively shutdowns the socket and listens to the socket immediately, the spinlock will be initialized. When the socket is going to release the spinlock, a warning is generated. Also the same issue to fastopenq.lock. Move init spinlock to inet_create and inet_accept to make sure init the accept_queue's spinlocks once. Fixes: |
||
Linus Torvalds
|
736b5545d3 |
Including fixes from bpf and netfilter.
Previous releases - regressions: - Revert "net: rtnetlink: Enslave device before bringing it up", breaks the case inverse to the one it was trying to fix - net: dsa: fix oob access in DSA's netdevice event handler dereference netdev_priv() before check its a DSA port - sched: track device in tcf_block_get/put_ext() only for clsact binder types - net: tls, fix WARNING in __sk_msg_free when record becomes full during splice and MORE hint set - sfp-bus: fix SFP mode detect from bitrate - drv: stmmac: prevent DSA tags from breaking COE Previous releases - always broken: - bpf: fix no forward progress in in bpf_iter_udp if output buffer is too small - bpf: reject variable offset alu on registers with a type of PTR_TO_FLOW_KEYS to prevent oob access - netfilter: tighten input validation - net: add more sanity check in virtio_net_hdr_to_skb() - rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid infinite loop - amt: do not use the portion of skb->cb area which may get clobbered - mptcp: improve validation of the MPTCPOPT_MP_JOIN MCTCP option Misc: - spring cleanup of inactive maintainers Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmWpnvoACgkQMUZtbf5S Irvskg/+Or5tETxOmpQXxnj6ECZyrSp0Jcyd7+TIcos/7JfPdn3Kebl004SG4h/s bwKDOIIP1iSjQ+0NFsPjyYIVd6wFuCElSB7npV5uQAT6ptXx7A4Ym68/rVxodI8T 6hiYV/mlPuZF8JjRhtp/VJL8sY1qnG7RIUB4oH3y9HQNfwZX0lIWChuUilHuWfbq zQ2Iu97tMkoIBjXrkIT3Qaj0aFxYbjCOrg9zy+FZ69a7Rmrswr//7amlCH6saNTx Ku7Wl8FXhe7O23OiM6GSl7AechSM1aJ5kOS3orseej0+aSp9eH3ekYGmbsQr6sjz ix/eZ7V7SUkJK3bEH5haeymk4TDV3lHE8SziMbosK4wVbHOyPwEmqCxppADYJLZs WycHZKcTBluFBOxknAofH7m5Hh0ToXkeTfpptSSGtRB4WncAOMsMapr3yS4WXg/q AnOo/tzCBgMrnSJtD/kjqgUiCk8vYoLc8lBR9K74l0zqI1sf13OfuTHvEgqIS6z1 Ir/ewlAV6fCH8gQbyzjKUVlyjZS4+vFv19xg/2GgLf+LdyzcCOxUZkND3/DE6+OA Dgf9gtABYU4hGXMUfTfml3KCBTF65QmY8dIh17zraNylYUHEJ2lI4D+sdiqWUrXb mXPBJh4nOPwIV5t2gT80skNwF3aWPr6l4ieY2codSbP04rO74S8= =YhQQ -----END PGP SIGNATURE----- Merge tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Previous releases - regressions: - Revert "net: rtnetlink: Enslave device before bringing it up", breaks the case inverse to the one it was trying to fix - net: dsa: fix oob access in DSA's netdevice event handler dereference netdev_priv() before check its a DSA port - sched: track device in tcf_block_get/put_ext() only for clsact binder types - net: tls, fix WARNING in __sk_msg_free when record becomes full during splice and MORE hint set - sfp-bus: fix SFP mode detect from bitrate - drv: stmmac: prevent DSA tags from breaking COE Previous releases - always broken: - bpf: fix no forward progress in in bpf_iter_udp if output buffer is too small - bpf: reject variable offset alu on registers with a type of PTR_TO_FLOW_KEYS to prevent oob access - netfilter: tighten input validation - net: add more sanity check in virtio_net_hdr_to_skb() - rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid infinite loop - amt: do not use the portion of skb->cb area which may get clobbered - mptcp: improve validation of the MPTCPOPT_MP_JOIN MCTCP option Misc: - spring cleanup of inactive maintainers" * tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits) i40e: Include types.h to some headers ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes selftests: mlxsw: qos_pfc: Remove wrong description mlxsw: spectrum_router: Register netdevice notifier before nexthop mlxsw: spectrum_acl_tcam: Fix stack corruption mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure ethtool: netlink: Add missing ethnl_ops_begin/complete selftests: bonding: Add more missing config options selftests: netdevsim: add a config file libbpf: warn on unexpected __arg_ctx type when rewriting BTF selftests/bpf: add tests confirming type logic in kernel for __arg_ctx bpf: enforce types for __arg_ctx-tagged arguments in global subprogs bpf: extract bpf_ctx_convert_map logic and make it more reusable libbpf: feature-detect arg:ctx tag support in kernel ipvs: avoid stat macros calls from preemptible context netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description netfilter: nf_tables: skip dead set elements in netlink dump netfilter: nf_tables: do not allow mismatch field size and set key length ... |
||
Marc Kleine-Budde
|
894d750831 |
net: netdev_queue: netdev_txq_completed_mb(): fix wake condition
netif_txq_try_stop() uses "get_desc >= start_thrs" as the check for
the call to netif_tx_start_queue().
Use ">=" i netdev_txq_completed_mb(), too.
Fixes:
|
||
Linus Torvalds
|
bf9ca811bb |
RDMA v6.8 merge window
Small cycle, with some typical driver updates - General code tidying in siw, hfi1, idrdma, usnic, hns rtrs and bnxt_re - Many small siw cleanups without an overeaching theme - Debugfs stats for hns - Fix a TX queue timeout in IPoIB and missed locking of the mcast list - Support more features of P7 devices in bnxt_re including a new work submission protocol - CQ interrupts for MANA - netlink stats for erdma - EFA multipath PCI support - Fix Incorrect MR invalidation in iser -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZaCQDQAKCRCFwuHvBreF YemEAQCTGebv0k2hbocDOmKml5awt8j9aDJX3aO7Zpfi0AYUtwEAzk+kgN4yAo+B Vinvpu171zry+QvmGJsXv2mtZkXH6QY= =HT3p -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Small cycle, with some typical driver updates: - General code tidying in siw, hfi1, idrdma, usnic, hns rtrs and bnxt_re - Many small siw cleanups without an overeaching theme - Debugfs stats for hns - Fix a TX queue timeout in IPoIB and missed locking of the mcast list - Support more features of P7 devices in bnxt_re including a new work submission protocol - CQ interrupts for MANA - netlink stats for erdma - EFA multipath PCI support - Fix Incorrect MR invalidation in iser" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits) RDMA/bnxt_re: Fix error code in bnxt_re_create_cq() RDMA/efa: Add EFA query MR support IB/iser: Prevent invalidating wrong MR RDMA/erdma: Add hardware statistics support RDMA/erdma: Introduce dma pool for hardware responses of CMDQ requests IB/iser: iscsi_iser.h: fix kernel-doc warning and spellos RDMA/mana_ib: Add CQ interrupt support for RAW QP RDMA/mana_ib: query device capabilities RDMA/mana_ib: register RDMA device with GDMA RDMA/bnxt_re: Fix the sparse warnings RDMA/bnxt_re: Fix the offset for GenP7 adapters for user applications RDMA/bnxt_re: Share a page to expose per CQ info with userspace RDMA/bnxt_re: Add UAPI to share a page with user space IB/ipoib: Fix mcast list locking RDMA/mlx5: Expose register c0 for RDMA device net/mlx5: E-Switch, expose eswitch manager vport net/mlx5: Manage ICM type of SW encap RDMA/mlx5: Support handling of SW encap ICM area net/mlx5: Introduce indirect-sw-encap ICM properties RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters ... |
||
Linus Torvalds
|
3e7aeb78ab |
Networking changes for 6.8.
Core & protocols ---------------- - Analyze and reorganize core networking structs (socks, netdev, netns, mibs) to optimize cacheline consumption and set up build time warnings to safeguard against future header changes. This improves TCP performances with many concurrent connections up to 40%. - Add page-pool netlink-based introspection, exposing the memory usage and recycling stats. This helps indentify bad PP users and possible leaks. - Refine TCP/DCCP source port selection to no longer favor even source port at connect() time when IP_LOCAL_PORT_RANGE is set. This lowers the time taken by connect() for hosts having many active connections to the same destination. - Refactor the TCP bind conflict code, shrinking related socket structs. - Refactor TCP SYN-Cookie handling, as a preparation step to allow arbitrary SYN-Cookie processing via eBPF. - Tune optmem_max for 0-copy usage, increasing the default value to 128KB and namespecifying it. - Allow coalescing for cloned skbs coming from page pools, improving RX performances with some common configurations. - Reduce extension header parsing overhead at GRO time. - Add bridge MDB bulk deletion support, allowing user-space to request the deletion of matching entries. - Reorder nftables struct members, to keep data accessed by the datapath first. - Introduce TC block ports tracking and use. This allows supporting multicast-like behavior at the TC layer. - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and classifiers (RSVP and tcindex). - More data-race annotations. - Extend the diag interface to dump TCP bound-only sockets. - Conditional notification of events for TC qdisc class and actions. - Support for WPAN dynamic associations with nearby devices, to form a sub-network using a specific PAN ID. - Implement SMCv2.1 virtual ISM device support. - Add support for Batman-avd mulicast packet type. BPF --- - Tons of verifier improvements: - BPF register bounds logic and range support along with a large test suite - log improvements - complete precision tracking support for register spills - track aligned STACK_ZERO cases as imprecise spilled registers. It improves the verifier "instructions processed" metric from single digit to 50-60% for some programs - support for user's global BPF subprogram arguments with few commonly requested annotations for a better developer experience - support tracking of BPF_JNE which helps cases when the compiler transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like - several fixes - Add initial TX metadata implementation for AF_XDP with support in mlx5 and stmmac drivers. Two types of offloads are supported right now, that is, TX timestamp and TX checksum offload. - Fix kCFI bugs in BPF all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y. - Change BPF verifier logic to validate global subprograms lazily instead of unconditionally before the main program, so they can be guarded using BPF CO-RE techniques. - Support uid/gid options when mounting bpffs. - Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id. - Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext. - Add BPF link_info support for uprobe multi link along with bpftool integration for the latter. - Support for VLAN tag in XDP hints. - Remove deprecated bpfilter kernel leftovers given the project is developed in user-space (https://github.com/facebook/bpfilter). Misc ---- - Support for parellel TC self-tests execution. - Increase MPTCP self-tests coverage. - Updated the bridge documentation, including several so-far undocumented features. - Convert all the net self-tests to run in unique netns, to avoid random failures due to conflict and allow concurrent runs. - Add TCP-AO self-tests. - Add kunit tests for both cfg80211 and mac80211. - Autogenerate Netlink families documentation from YAML spec. - Add yml-gen support for fixed headers and recursive nests, the tool can now generate user-space code for all genetlink families for which we have specs. - A bunch of additional module descriptions fixes. - Catch incorrect freeing of pages belonging to a page pool. Driver API ---------- - Rust abstractions for network PHY drivers; do not cover yet the full C API, but already allow implementing functional PHY drivers in rust. - Introduce queue and NAPI support in the netdev Netlink interface, allowing complete access to the device <> NAPIs <> queues relationship. - Introduce notifications filtering for devlink to allow control application scale to thousands of instances. - Improve PHY validation, requesting rate matching information for each ethtool link mode supported by both the PHY and host. - Add support for ethtool symmetric-xor RSS hash. - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD platform. - Expose pin fractional frequency offset value over new DPLL generic netlink attribute. - Convert older drivers to platform remove callback returning void. - Add support for PHY package MMD read/write. New hardware / drivers ---------------------- - Ethernet: - Octeon CN10K devices - Broadcom 5760X P7 - Qualcomm SM8550 SoC - Texas Instrument DP83TG720S PHY - Bluetooth: - IMC Networks Bluetooth radio Removed ------- - WiFi: - libertas 16-bit PCMCIA support - Atmel at76c50x drivers - HostAP ISA/PCMCIA style 802.11b driver - zd1201 802.11b USB dongles - Orinoco ISA/PCMCIA 802.11b driver - Aviator/Raytheon driver - Planet WL3501 driver - RNDIS USB 802.11b driver Drivers ------- - Ethernet high-speed NICs: - Intel (100G, ice, idpf): - allow one by one port representors creation and removal - add temperature and clock information reporting - add get/set for ethtool's header split ringparam - add again FW logging - adds support switchdev hardware packet mirroring - iavf: implement symmetric-xor RSS hash - igc: add support for concurrent physical and free-running timers - i40e: increase the allowable descriptors - nVidia/Mellanox: - Preparation for Socket-Direct multi-dev netdev. That will allow in future releases combining multiple PFs devices attached to different NUMA nodes under the same netdev - Broadcom (bnxt): - TX completion handling improvements - add basic ntuple filter support - reduce MSIX vectors usage for MQPRIO offload - add VXLAN support, USO offload and TX coalesce completion for P7 - Marvell Octeon EP: - xmit-more support - add PF-VF mailbox support and use it for FW notifications for VFs - Wangxun (ngbe/txgbe): - implement ethtool functions to operate pause param, ring param, coalesce channel number and msglevel - Netronome/Corigine (nfp): - add flow-steering support - support UDP segmentation offload - Ethernet NICs embedded, slower, virtual: - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver - stmmac: add support for HW-accelerated VLAN stripping - TI AM654x sw: add mqprio, frame preemption & coalescing - gve: add support for non-4k page sizes. - virtio-net: support dynamic coalescing moderation - nVidia/Mellanox Ethernet datacenter switches: - allow firmware upgrade without a reboot - more flexible support for bridge flooding via the compressed FID flooding mode - Ethernet embedded switches: - Microchip: - fine-tune flow control and speed configurations in KSZ8xxx - KSZ88X3: enable setting rmii reference - Renesas: - add jumbo frames support - Marvell: - 88E6xxx: add "eth-mac" and "rmon" stats support - Ethernet PHYs: - aquantia: add firmware load support - at803x: refactor the driver to simplify adding support for more chip variants - NXP C45 TJA11xx: Add MACsec offload support - Wifi: - MediaTek (mt76): - NVMEM EEPROM improvements - mt7996 Extremely High Throughput (EHT) improvements - mt7996 Wireless Ethernet Dispatcher (WED) support - mt7996 36-bit DMA support - Qualcomm (ath12k): - support for a single MSI vector - WCN7850: support AP mode - Intel (iwlwifi): - new debugfs file fw_dbg_clear - allow concurrent P2P operation on DFS channels - Bluetooth: - QCA2066: support HFP offload - ISO: more broadcast-related improvements - NXP: better recovery in case receiver/transmitter get out of sync Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWdamsSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkGC4P/2xjLzdw22ckSssuE9ORbGko9SNjnqHk PQh1E+26BHiCg5KB8VvzMsL78E79MRNXEattSW+1g7dhCvln3oi+Vd0WkdRkgt35 98Iv18zLbbwFAJeyKvmLAPAkQkMLtVj19QILBBRrugF+egEZgVSE3JBcTAiKv2ZQ HzkabA171Ri6LpCcEEtY5XuaKvimGnGzF8YMFf8rX0wtqd2p5kbY9aMe47WAGxvU Vf9548XvH+A5yVH2/4/gujtUOpA/RHuhuCMb+oo0cZ+VCC1x9MGzoXzj6r87OTkf k2W1whNzcGoin92f+9Lk1JYMuiGKBH4QVaDdNXJnYFSJWPTE7RvRsPzYTSD4/GzK yEZbzSJXpy/2vDQm16NoAxl7evRs8Sorzkw4LQRviZHI/5SAkK2ZQiCK5CO8QSYy C1LELcV5kn6Foe24xWnrWLjAGug9oJnYoGPMU5gvPmFJMvUMXqm5rmbBgUWL5Rxw q1M6gVzabCyWUy6z2G2vaqW2ZntNVvCkdsLtIX0XZkcTzNoP0MA+TuhyGz4wbiuo PeyQp/mbGnDgCYggqKIA0YWrTVxkhFrKN520cbO8qXBQytV9oFbM/0/+C0/r/5WX pL1JVzLrh6l5ME7EIQfha8UOF9j8q4ueSwb40P3AR2NaZiDABM0zfUZ6+sx+91WF ucqPEcZB5cRE =1bW6 -----END PGP SIGNATURE----- Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "The most interesting thing is probably the networking structs reorganization and a significant amount of changes is around self-tests. Core & protocols: - Analyze and reorganize core networking structs (socks, netdev, netns, mibs) to optimize cacheline consumption and set up build time warnings to safeguard against future header changes This improves TCP performances with many concurrent connections up to 40% - Add page-pool netlink-based introspection, exposing the memory usage and recycling stats. This helps indentify bad PP users and possible leaks - Refine TCP/DCCP source port selection to no longer favor even source port at connect() time when IP_LOCAL_PORT_RANGE is set. This lowers the time taken by connect() for hosts having many active connections to the same destination - Refactor the TCP bind conflict code, shrinking related socket structs - Refactor TCP SYN-Cookie handling, as a preparation step to allow arbitrary SYN-Cookie processing via eBPF - Tune optmem_max for 0-copy usage, increasing the default value to 128KB and namespecifying it - Allow coalescing for cloned skbs coming from page pools, improving RX performances with some common configurations - Reduce extension header parsing overhead at GRO time - Add bridge MDB bulk deletion support, allowing user-space to request the deletion of matching entries - Reorder nftables struct members, to keep data accessed by the datapath first - Introduce TC block ports tracking and use. This allows supporting multicast-like behavior at the TC layer - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and classifiers (RSVP and tcindex) - More data-race annotations - Extend the diag interface to dump TCP bound-only sockets - Conditional notification of events for TC qdisc class and actions - Support for WPAN dynamic associations with nearby devices, to form a sub-network using a specific PAN ID - Implement SMCv2.1 virtual ISM device support - Add support for Batman-avd mulicast packet type BPF: - Tons of verifier improvements: - BPF register bounds logic and range support along with a large test suite - log improvements - complete precision tracking support for register spills - track aligned STACK_ZERO cases as imprecise spilled registers. This improves the verifier "instructions processed" metric from single digit to 50-60% for some programs - support for user's global BPF subprogram arguments with few commonly requested annotations for a better developer experience - support tracking of BPF_JNE which helps cases when the compiler transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like - several fixes - Add initial TX metadata implementation for AF_XDP with support in mlx5 and stmmac drivers. Two types of offloads are supported right now, that is, TX timestamp and TX checksum offload - Fix kCFI bugs in BPF all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y - Change BPF verifier logic to validate global subprograms lazily instead of unconditionally before the main program, so they can be guarded using BPF CO-RE techniques - Support uid/gid options when mounting bpffs - Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id - Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext - Add BPF link_info support for uprobe multi link along with bpftool integration for the latter - Support for VLAN tag in XDP hints - Remove deprecated bpfilter kernel leftovers given the project is developed in user-space (https://github.com/facebook/bpfilter) Misc: - Support for parellel TC self-tests execution - Increase MPTCP self-tests coverage - Updated the bridge documentation, including several so-far undocumented features - Convert all the net self-tests to run in unique netns, to avoid random failures due to conflict and allow concurrent runs - Add TCP-AO self-tests - Add kunit tests for both cfg80211 and mac80211 - Autogenerate Netlink families documentation from YAML spec - Add yml-gen support for fixed headers and recursive nests, the tool can now generate user-space code for all genetlink families for which we have specs - A bunch of additional module descriptions fixes - Catch incorrect freeing of pages belonging to a page pool Driver API: - Rust abstractions for network PHY drivers; do not cover yet the full C API, but already allow implementing functional PHY drivers in rust - Introduce queue and NAPI support in the netdev Netlink interface, allowing complete access to the device <> NAPIs <> queues relationship - Introduce notifications filtering for devlink to allow control application scale to thousands of instances - Improve PHY validation, requesting rate matching information for each ethtool link mode supported by both the PHY and host - Add support for ethtool symmetric-xor RSS hash - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD platform - Expose pin fractional frequency offset value over new DPLL generic netlink attribute - Convert older drivers to platform remove callback returning void - Add support for PHY package MMD read/write New hardware / drivers: - Ethernet: - Octeon CN10K devices - Broadcom 5760X P7 - Qualcomm SM8550 SoC - Texas Instrument DP83TG720S PHY - Bluetooth: - IMC Networks Bluetooth radio Removed: - WiFi: - libertas 16-bit PCMCIA support - Atmel at76c50x drivers - HostAP ISA/PCMCIA style 802.11b driver - zd1201 802.11b USB dongles - Orinoco ISA/PCMCIA 802.11b driver - Aviator/Raytheon driver - Planet WL3501 driver - RNDIS USB 802.11b driver Driver updates: - Ethernet high-speed NICs: - Intel (100G, ice, idpf): - allow one by one port representors creation and removal - add temperature and clock information reporting - add get/set for ethtool's header split ringparam - add again FW logging - adds support switchdev hardware packet mirroring - iavf: implement symmetric-xor RSS hash - igc: add support for concurrent physical and free-running timers - i40e: increase the allowable descriptors - nVidia/Mellanox: - Preparation for Socket-Direct multi-dev netdev. That will allow in future releases combining multiple PFs devices attached to different NUMA nodes under the same netdev - Broadcom (bnxt): - TX completion handling improvements - add basic ntuple filter support - reduce MSIX vectors usage for MQPRIO offload - add VXLAN support, USO offload and TX coalesce completion for P7 - Marvell Octeon EP: - xmit-more support - add PF-VF mailbox support and use it for FW notifications for VFs - Wangxun (ngbe/txgbe): - implement ethtool functions to operate pause param, ring param, coalesce channel number and msglevel - Netronome/Corigine (nfp): - add flow-steering support - support UDP segmentation offload - Ethernet NICs embedded, slower, virtual: - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver - stmmac: add support for HW-accelerated VLAN stripping - TI AM654x sw: add mqprio, frame preemption & coalescing - gve: add support for non-4k page sizes. - virtio-net: support dynamic coalescing moderation - nVidia/Mellanox Ethernet datacenter switches: - allow firmware upgrade without a reboot - more flexible support for bridge flooding via the compressed FID flooding mode - Ethernet embedded switches: - Microchip: - fine-tune flow control and speed configurations in KSZ8xxx - KSZ88X3: enable setting rmii reference - Renesas: - add jumbo frames support - Marvell: - 88E6xxx: add "eth-mac" and "rmon" stats support - Ethernet PHYs: - aquantia: add firmware load support - at803x: refactor the driver to simplify adding support for more chip variants - NXP C45 TJA11xx: Add MACsec offload support - Wifi: - MediaTek (mt76): - NVMEM EEPROM improvements - mt7996 Extremely High Throughput (EHT) improvements - mt7996 Wireless Ethernet Dispatcher (WED) support - mt7996 36-bit DMA support - Qualcomm (ath12k): - support for a single MSI vector - WCN7850: support AP mode - Intel (iwlwifi): - new debugfs file fw_dbg_clear - allow concurrent P2P operation on DFS channels - Bluetooth: - QCA2066: support HFP offload - ISO: more broadcast-related improvements - NXP: better recovery in case receiver/transmitter get out of sync" * tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits) lan78xx: remove redundant statement in lan78xx_get_eee lan743x: remove redundant statement in lan743x_ethtool_get_eee bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer() bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel() bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter() tcp: Revert no longer abort SYN_SENT when receiving some ICMP Revert "mlx5 updates 2023-12-20" Revert "net: stmmac: Enable Per DMA Channel interrupt" ipvlan: Remove usage of the deprecated ida_simple_xx() API ipvlan: Fix a typo in a comment net/sched: Remove ipt action tests net: stmmac: Use interrupt mode INTM=1 for per channel irq net: stmmac: Add support for TX/RX channel interrupt net: stmmac: Make MSI interrupt routine generic dt-bindings: net: snps,dwmac: per channel irq net: phy: at803x: make read_status more generic net: phy: at803x: add support for cdt cross short test for qca808x net: phy: at803x: refactor qca808x cable test get status function net: phy: at803x: generalize cdt fault length function net: ethernet: cortina: Drop TSO support ... |
||
Linus Torvalds
|
0c59ae1290 |
AFS fileserver rotation fix
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmWYJ6kACgkQ+7dXa6fL C2v6YBAAkDdqgWN96h2KOcd+El13Uxa3WNDjTHtzc0ZhjEDkzkU42sSF2yE0nerS 6kX18vibXC+TPnbBn1gOSGrVoFIC1kh/vUjrz/UQYfxXN19P8LE2wSdl+bC4nPT1 Qkrxkr+q4GSSJoYg9QUUAu0Hh2PvXMeDE/XyED6XiAkuDUbISO9yDeu+wo3wZM5L 1e8vRlg/2EQl2v1Crh5nC0tgJZbGULc2mCqi/rU5A9wdlKHFzwjU+2PTsbQNKE0m 0ueLblFeFRwBZpOfAUNNVAt3bwaSfhYpqUiiSldrU/JXhnx5CgY1kHzI3OPVedQt WMfp/epwO848i3qVM8dHJXc93NJeC3gTBK7gYRrH07MuK3Of1KRH3D8YBsE0/r0q NVcDQ6/eoni06CA8VMfSIEQ2+Q0m4xxUzAQURsOxRPY/FktzCKXMfpYTDZqbQfow SXrKmsPnMZe4DUnvdcTSU8B3+vybJH/JgEnZXRtCPOYNDSyMcPhKPG2ioOz4UV+M amQmpYfG4hzi1VmRrH57dwlXejBX16+zc9pLdZC5c0/phk3caYrJVMA8pwCOP4HM AvB5Yl6gH2aGj1kKjffL7nWnQ2QbD7VWUn98TqLPezOX7DwQHMMKvlfPnv6R87sy 0HMmj9VxCgOvGLOf1JdQoTxtb49ndM4Y5fPvKYK2awW5FkAacLM= =bHoG -----END PGP SIGNATURE----- Merge tag 'afs-fix-rotation-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs updates from David Howells: "The majority of the patches are aimed at fixing and improving the AFS filesystem's rotation over server IP addresses, but there are also some fixes from Oleg Nesterov for the use of read_seqbegin_or_lock(). - Fix fileserver probe handling so that the next round of probes doesn't break ongoing server/address rotation by clearing all the probe result tracking. This could occasionally cause the rotation algorithm to drop straight through, give a 'successful' result without actually emitting any RPC calls, leaving the reply buffer in an undefined state. Instead, detach the probe results into a separate struct and allocate a new one each time we start probing and update the pointer to it. Probes are also sent in order of address preference to try and improve the chance that the preferred one will complete first. - Fix server rotation so that it uses configurable address preferences across on the probes that have completed so far than ranking them by RTT as the latter doesn't necessarily give the best route. The preference list can be altered by writing into /proc/net/afs/addr_prefs. - Fix the handling of Read-Only (and Backup) volume callbacks as there is one per volume, not one per file, so if someone performs a command that, say, offlines the volume but doesn't change it, when it comes back online we don't spam the server with a status fetch for every vnode we're using. Instead, check the Creation timestamp in the VolSync record when prompted by a callback break. - Handle volume regression (ie. a RW volume being restored from a backup) by scrubbing all cache data for that volume. This is detected from the VolSync creation timestamp. - Adjust abort handling and abort -> error mapping to match better with what other AFS clients do. - Fix offline and busy volume state handling as they only apply to individual server instances and not entire volumes and the rotation algorithm should go and look at other servers if available. Also make it sleep briefly before each retry if all the volume instances are unavailable" * tag 'afs-fix-rotation-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (40 commits) afs: trace: Log afs_make_call(), including server address afs: Fix offline and busy message emission afs: Fix fileserver rotation afs: Overhaul invalidation handling to better support RO volumes afs: Parse the VolSync record in the reply of a number of RPC ops afs: Don't leave DONTUSE/NEWREPSITE servers out of server list afs: Fix comment in afs_do_lookup() afs: Apply server breaks to mmap'd files in the call processor afs: Move the vnode/volume validity checking code into its own file afs: Defer volume record destruction to a workqueue afs: Make it possible to find the volumes that are using a server afs: Combine the endpoint state bools into a bitmask afs: Keep a record of the current fileserver endpoint state afs: Dispatch vlserver probes in priority order afs: Dispatch fileserver probes in priority order afs: Mark address lists with configured priorities afs: Provide a way to configure address priorities afs: Remove the unimplemented afs_cmp_addr_list() afs: Add some more info to /proc/net/afs/servers rxrpc: Create a procfile to display outstanding client conn bundles ... |
||
Linus Torvalds
|
c604110e66 |
vfs-6.8.misc
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZZUxRQAKCRCRxhvAZXjc ov/QAQDzvge3oQ9MEymmOiyzzcF+HhAXBr+9oEsYJjFc1p0TsgEA61gXjZo7F1jY KBqd6znOZCR+Waj0kIVJRAo/ISRBqQc= =0bRl -----END PGP SIGNATURE----- Merge tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Add Jan Kara as VFS reviewer - Show correct device and inode numbers in proc/<pid>/maps for vma files on stacked filesystems. This is now easily doable thanks to the backing file work from the last cycles. This comes with selftests Cleanups: - Remove a redundant might_sleep() from wait_on_inode() - Initialize pointer with NULL, not 0 - Clarify comment on access_override_creds() - Rework and simplify eventfd_signal() and eventfd_signal_mask() helpers - Process aio completions in batches to avoid needless wakeups - Completely decouple struct mnt_idmap from namespaces. We now only keep the actual idmapping around and don't stash references to namespaces - Reformat maintainer entries to indicate that a given subsystem belongs to fs/ - Simplify fput() for files that were never opened - Get rid of various pointless file helpers - Rename various file helpers - Rename struct file members after SLAB_TYPESAFE_BY_RCU switch from last cycle - Make relatime_need_update() return bool - Use GFP_KERNEL instead of GFP_USER when allocating superblocks - Replace deprecated ida_simple_*() calls with their current ida_*() counterparts Fixes: - Fix comments on user namespace id mapping helpers. They aren't kernel doc comments so they shouldn't be using /** - s/Retuns/Returns/g in various places - Add missing parameter documentation on can_move_mount_beneath() - Rename i_mapping->private_data to i_mapping->i_private_data - Fix a false-positive lockdep warning in pipe_write() for watch queues - Improve __fget_files_rcu() code generation to improve performance - Only notify writer that pipe resizing has finished after setting pipe->max_usage otherwise writers are never notified that the pipe has been resized and hang - Fix some kernel docs in hfsplus - s/passs/pass/g in various places - Fix kernel docs in ntfs - Fix kcalloc() arguments order reported by gcc 14 - Fix uninitialized value in reiserfs" * tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits) reiserfs: fix uninit-value in comp_keys watch_queue: fix kcalloc() arguments order ntfs: dir.c: fix kernel-doc function parameter warnings fs: fix doc comment typo fs tree wide selftests/overlayfs: verify device and inode numbers in /proc/pid/maps fs/proc: show correct device and inode numbers in /proc/pid/maps eventfd: Remove usage of the deprecated ida_simple_xx() API fs: super: use GFP_KERNEL instead of GFP_USER for super block allocation fs/hfsplus: wrapper.c: fix kernel-doc warnings fs: add Jan Kara as reviewer fs/inode: Make relatime_need_update return bool pipe: wakeup wr_wait after setting max_usage file: remove __receive_fd() file: stop exposing receive_fd_user() fs: replace f_rcuhead with f_task_work file: remove pointless wrapper file: s/close_fd_get_file()/file_close_fd()/g Improve __fget_files_rcu() code generation (and thus __fget_light()) file: massage cleanup of files that failed to open fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() ... |
||
Pedro Tammela
|
405cd9fc6f |
net/sched: simplify tc_action_load_ops parameters
Instead of using two bools derived from a flags passed as arguments to the parent function of tc_action_load_ops, just pass the flags itself to tc_action_load_ops to simplify its parameters. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Jakub Kicinski
|
e63c1822ac |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c |
||
Jakub Kicinski
|
172b3fccf5 |
This pull request mainly brings support for dynamic associations in the
WPAN world. Thanks to the recent improvements it was possible to discover nearby devices, it is now also possible to associate with them to form a sub-network using a specific PAN ID. The support includes several functions, such as: * Requesting an association to a coordinator, waiting for the response * Sending a disassociation notification to a coordinator * Receiving an association request when we are coordinator, answering the request (for now all devices are accepted up to a limit, to be refined) * Sending a disassociation notification to a child * Users may request the list of associated devices (the parent and the children). Here are a few example of userspace calls that can be made: iwpan dev <dev> associate pan_id 2 coord $COORD iwpan dev <dev> list_associations iwpan dev <dev> disassociate ext_addr $COORD There are as well two patches from Uwe turning remove callbacks into void functions. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAmWCqiYACgkQJWrqGEe9 VoQxEwf+KzU7esLoTgtIRG+wIQyMaBm8qZ/YNdWwgEiD+kwbybdy258etavumIMN HSKxRypXqJwR+R4b9l6yhrLmwn83HNR0L1oq6sHooBD+Zr90Ac8NNgONcF30iRQS jSiQ3A7H4kTqEaSsZe3olLotVnhTrLvL/TCXt83FPst551Fj0VSDyRnrbETEmq5F s+WJcB8rGYWY6avv5CbWcUDOPPq7yiZlpgcA+qTIDFMUfD90p+xqUZQ2hZKlscMB Ia/FPW3gUufWFGd16d1WBaHD38DoLobfo/6Ou/Sufp+ErW17ZOGl4wNnViXlOpwD 0BUEO6vhtAx+83VYAC522PABWHkxyw== =THkg -----END PGP SIGNATURE----- Merge tag 'ieee802154-for-net-next-2023-12-20' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next Miquel Raynal says: ==================== This pull request mainly brings support for dynamic associations in the WPAN world. Thanks to the recent improvements it was possible to discover nearby devices, it is now also possible to associate with them to form a sub-network using a specific PAN ID. The support includes several functions, such as: * Requesting an association to a coordinator, waiting for the response * Sending a disassociation notification to a coordinator * Receiving an association request when we are coordinator, answering the request (for now all devices are accepted up to a limit, to be refined) * Sending a disassociation notification to a child * Users may request the list of associated devices (the parent and the children). Here are a few example of userspace calls that can be made: # iwpan dev <dev> associate pan_id 2 coord $COORD # iwpan dev <dev> list_associations # iwpan dev <dev> disassociate ext_addr $COORD There are as well two patches from Uwe turning remove callbacks into void functions. * tag 'ieee802154-for-net-next-2023-12-20' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next: mac802154: Avoid new associations while disassociating ieee802154: Avoid confusing changes after associating mac802154: Only allow PAN controllers to process association requests mac802154: Use the PAN coordinator parameter when stamping packets mac80254: Provide real PAN coordinator info in beacons ieee802154: Give the user the association list mac802154: Handle disassociation notifications from peers mac802154: Follow the number of associated devices ieee802154: Add support for limiting the number of associated devices mac802154: Handle association requests from peers mac802154: Handle disassociations ieee802154: Add support for user disassociation requests mac802154: Handle associating ieee802154: Add support for user association requests ieee802154: Internal PAN management ieee802154: Let PAN IDs be reset ieee802154: hwsim: Convert to platform remove callback returning void ieee802154: fakelb: Convert to platform remove callback returning void ==================== Link: https://lore.kernel.org/r/20231220095556.4d9cef91@xps-13 Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Dmitry Safonov
|
4c8530dc7d |
net/tcp: Only produce AO/MD5 logs if there are any keys
User won't care about inproper hash options in the TCP header if they
don't use neither TCP-AO nor TCP-MD5. Yet, those logs can add up in
syslog, while not being a real concern to the host admin:
> kernel: TCP: TCP segment has incorrect auth options set for XX.20.239.12.54681->XX.XX.90.103.80 [S]
Keep silent and avoid logging when there aren't any keys in the system.
Side-note: I also defined static_branch_tcp_*() helpers to avoid more
ifdeffery, going to remove more ifdeffery further with their help.
Reported-by: Christian Kujau <lists@nerdbynature.de>
Closes: https://lore.kernel.org/all/f6b59324-1417-566f-a976-ff2402718a8d@nerdbynature.de/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Fixes:
|
||
Pedro Tammela
|
c2a67de9bb |
net/sched: introduce ACT_P_BOUND return code
Bound actions always return '0' and as of today we rely on '0' being returned in order to properly skip bound actions in tcf_idr_insert_many. In order to further improve maintainability, introduce the ACT_P_BOUND return code. Actions are updated to return 'ACT_P_BOUND' instead of plain '0'. tcf_idr_insert_many is then updated to check for 'ACT_P_BOUND'. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20231229132642.1489088-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Zhengchao Shao
|
b4c1d4d973 |
fib: remove unnecessary input parameters in fib_default_rule_add
When fib_default_rule_add is invoked, the value of the input parameter 'flags' is always 0. Rules uses kzalloc to allocate memory, so 'flags' has been initialized to 0. Therefore, remove the input parameter 'flags' in fib_default_rule_add. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240102071519.3781384-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Vladimir Oltean
|
8dc4c41000 |
xsk: make struct xsk_cb_desc available outside CONFIG_XDP_SOCKETS
The ice driver fails to build when CONFIG_XDP_SOCKETS is disabled.
drivers/net/ethernet/intel/ice/ice_base.c:533:21: error:
variable has incomplete type 'struct xsk_cb_desc'
struct xsk_cb_desc desc = {};
^
include/net/xsk_buff_pool.h:15:8: note:
forward declaration of 'struct xsk_cb_desc'
struct xsk_cb_desc;
^
Fixes:
|
||
David S. Miller
|
8a48a2dc24 |
bluetooth-next pull request for net-next:
- btnxpuart: Fix recv_buf return value - L2CAP: Fix responding with multiple rejects - Fix atomicity violation in {min,max}_key_size_set - ISO: Allow binding a PA sync socket - ISO: Reassociate a socket with an active BIS - ISO: Avoid creating child socket if PA sync is terminating - Add device 13d3:3572 IMC Networks Bluetooth Radio - Don't suspend when there are connections - Remove le_restart_scan work - Fix bogus check for re-auth not supported with non-ssp - lib: Add documentation to exported functions - Support HFP offload for QCA2066 -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmWF2P4ZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKZ3eEACgIk7eorrf+isIBZjq+qGL 0ESqGe2ZnRupnJs6fRFBY2L8jWkfZbURzUKXVpt0fVrQAb8qzrEzwD/ucApT38SY LO1lzLhPLfUpwXS4jW5pFvA+tLDS6sb9fBd/tp1ikaNavU3FTfWFeGJmJH8nubHc qZqOkBblm34yrdlEZwxiJxviVmbAsHHSu1AWXZ9yTO5Z5HRNUN0jxBYjOPVHkrdx SdP8J+icTMQ7ywoFEOLAnDaJi+bKMPwyGlz6opx75L0LkdbqjFwdu+1X6ZtjLsP3 c51RJtdJgBJY8TX5MmwMEakHQSbcGBUPeEMhZqrrzY1A/2dsb85T4acXgU1iV6Ws SkYFFeEWiBhV/WJn4a/mZzMASRzpyIfTEeVfbl2Qn23wHoBwacYhFGm/be60L0ar jwO/Fk5qaUTWBV5DQAF8gBK+pNxe4phGFNdWwu6v43IG9DfVdvLVjiO9dBqsvTnS 1StqENzmnhqCYJ5+3hyU2uqAYW4DopvuYD5EgMGEfaJ3jzzlWF34Gs4OVCWUBL02 qh1Zy/4VrduuY0FRRt9foj8Ug4IvM2+a84bx4ZhW6N6BhkHOsrKaXPGf0Ng0SRUj vH7xHelsruiE3HGc9pYZyDavXitRbvJxdF2tHD8cxpiGkUch9hTNqSaV+xkHxck6 4bAeyoVlAGKa/wyWvrdcTg== =wVdT -----END PGP SIGNATURE----- Merge tag 'for-net-next-2023-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - btnxpuart: Fix recv_buf return value - L2CAP: Fix responding with multiple rejects - Fix atomicity violation in {min,max}_key_size_set - ISO: Allow binding a PA sync socket - ISO: Reassociate a socket with an active BIS - ISO: Avoid creating child socket if PA sync is terminating - Add device 13d3:3572 IMC Networks Bluetooth Radio - Don't suspend when there are connections - Remove le_restart_scan work - Fix bogus check for re-auth not supported with non-ssp - lib: Add documentation to exported functions - Support HFP offload for QCA2066 ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
David S. Miller
|
a27359abc8 |
wireless-next patches for v6.8
The third "new features" pull request for v6.8. This is a smaller one to clear up our tree before the break and nothing really noteworthy this time. Major changes: stack * cfg80211: introduce cfg80211_ssid_eq() for SSID matching * cfg80211: support P2P operation on DFS channels * mac80211: allow 64-bit radiotap timestamps iwlwifi * AX210: allow concurrent P2P operation on DFS channels -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmWFbnIRHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZs5hQf/aCvvTjqeRoMkmO+ZPFMSO+YquZNCJi1M TP8Fce2ALKj7woPad8vdJNNStMa9k4bu2NvShMXhoYM3xOA/4o0P9yeb5OfyYkTk Y6JF+SoBGzABtB3m/a3i5J19F+oC+6yKN6/OY8byfK4jqZdrAprc3qXwodC5zb9n blC16KKlldjoj5AWe/b6Vn/LI9P7mVhZIaWxI9IaktK0eIgfsfcgIZLuuMJPq5DJ NjvhmK++qCcTQrJo/4TMVoWmcZZKR3XzcSs++HYRELNCwcM2q9s07R4KkV81aB0t RpCaCWa2KVUCrKdk3FlnG5pS7A6US5KGP4g6sSQRnq8t3IYDUbDo5Q== =Bte8 -----END PGP SIGNATURE----- Merge tag 'wireless-next-2023-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.8 The third "new features" pull request for v6.8. This is a smaller one to clear up our tree before the break and nothing really noteworthy this time. Major changes: stack * cfg80211: introduce cfg80211_ssid_eq() for SSID matching * cfg80211: support P2P operation on DFS channels * mac80211: allow 64-bit radiotap timestamps iwlwifi * AX210: allow concurrent P2P operation on DFS channels ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Jamal Hadi Salim
|
ba24ea1291 |
net/sched: Retire ipt action
The tc ipt action was intended to run all netfilter/iptables target. Unfortunately it has not benefitted over the years from proper updates when netfilter changes, and for that reason it has remained rudimentary. Pinging a bunch of people that i was aware were using this indicates that removing it wont affect them. Retire it to reduce maintenance efforts. Buh-bye. Reviewed-by: Victor Noguiera <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
David S. Miller
|
109bf4cfe1 |
netfilter pull request 23-12-22
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmWFd9oACgkQ1V2XiooU IOTBog//auEj3LR5v2b5et6CLsuMESsCFkVko1AFdzzpu94sJ8TTkqlrW7SPSfPW CMNYz1EYFaRGK1GjfH0a49EUofeU/kPD/TmffmiYwuJlGyEIXp17ZIUQnGDrM6Mz UBHX8VrW73McJYuZJbxM4HX6Q3aFyshKy8iOPpffdFAQcCD5XtkYbW24Mzw6V4sI 8fBz6C/iT+5Jq1wGtvM2xmGkNYuMLJxbf9qOpinRZCXQV/z5IWda/bFnv3kpJ9mR x2ZYXHnzFEr2gVFIhZ7G6FvMgQAkkKGVQE+n9zVZ9V78czqBk9depdeLaq/RYGyx 8VYqJyaXHKQm7FqsBwSSxU964aQhi/zNrjF9SeJSfcRNhTmXgzdERndVWVjUPe9O rFLyQVoQ6JyKy5++1lL3+63cd6ZeGw8gUw8MCw9DX8rwamnNFunUsixKv4SLzQHA jfFgmAUq+HVWHXI4xmj+SDO8g+s7O3BpbZM09E3si6lwy4/LiQqHhV4sNgEVN/It hf76t/U6GRYZ0Hwy2dwQl8SZg1BxMSGFmzB7vyEZyGXaqxRZb1Jym4h9slOMKXWl gS5y4y0ZVTmzRCqJR+jUrMAKPCw/R964LrslDYwQpsQeP7nuUPeU5jH1VWD6kAbS Vc36uDRnfojoNiRcmAfuhvZIseM1+hI6UXCIsqYR6cekCFlBad8= =+Uvv -----END PGP SIGNATURE----- Merge tag 'nf-next-23-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== netfilter pull request 23-12-22 The following patchset contains Netfilter updates for net-next: 1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a race scenario with two concurrent processes running a dump-and-reset which exposes negative counters to userspace, from Phil Sutter. 2) Use GFP_KERNEL in pipapo GC, from Florian Westphal. 3) Reorder nf_flowtable struct members, place the read-mostly parts accessed by the datapath first. From Florian Westphal. 4) Set on dead flag for NFT_MSG_NEWSET in abort path, from Florian Westphal. 5) Support filtering zone in ctnetlink, from Felix Huettner. 6) Bail out if user tries to redefine an existing chain with different type in nf_tables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Ido Schimmel
|
cd4d7263d5 |
genetlink: Use internal flags for multicast groups
As explained in commit |
||
David S. Miller
|
a4255b2e5c |
netfilter pull request 23-12-20
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmWC/3kACgkQ1V2XiooU IOTDcg/+OJ4UwLrNF7GPjx3Bf76ntgkzqL8GCPYP2IMNgE7F9JBhv5t648tg0XOJ Pf3NAHtS0Trb8bCCwN9SWMKP2Zx/ntPlebrzp+SSlnkqzBRAl2550s+e8tcYKc9y S2XaQiAMOvcamMOxDbKQD2GcWqi05gEpE8w+ov5L1iXMhgFcHtPAm79H8XvqyDaj HKQ2B9b/4XxIexqiCnTWH4RLFq4+w3q1axUcv5GRkEFO/w3fouQ8f5FynjQOcSgp qD3KBVh6tJVTYj5OwhcvIi3BV/n+suiK9tcd0IarDlmUXY2MI0748W+9FLmHbMU6 cl2IhIrVEoyOrBoThlmV6Fq2qVRZlYq/mHfTqEfWLYaqJ2iZ1f+I5nG2Gx+oq45p 7cxSuvHN72QBhzLh1ry0tJItWGNfejnWzf4/71/eSL21wCxijoI2v2TOc8myONLZ qdiSyaU3Kz4blGqnRIMhMNArAkXohqEdfXrFfDSLi6lXBABgh/JmE0eJWXUgV/xU /PBrt+SM07NqUP02J63rvgehlfn5DEYsPt+b15Lnqu0BQNuTJYDRbu2TFNLx9TrR yASWXuqOB/f5mAos0xQT9wG6BTQvBTxgzvuAd9fC0oaAvAEa5JbojPLcNSXoecJO K5priJ0coMme/HAZNgfiw8d+hPWFGScIYFyz89meokIlCpEKdT0= =GlJr -----END PGP SIGNATURE----- Merge tag 'nf-23-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablu Neira Syuso says: ==================== netfilter pull request 23-12-20 The following patchset contains Netfilter fixes for net: 1) Skip set commit for deleted/destroyed sets, this might trigger double deactivation of expired elements. 2) Fix packet mangling from egress, set transport offset from mac header for netdev/egress. Both fixes address bugs already present in several releases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Greg Kroah-Hartman
|
f732ba4ac9 |
iucv: make iucv_bus const
Now that the driver core can properly handle constant struct bus_type, move the iucv_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Wenjia Zhang <wenjia@linux.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: linux-s390@vger.kernel.org Cc: netdev@vger.kernel.org Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Jonathan Corbet
|
144377c340 |
net: sock: remove excess structure-member documentation
Remove a couple of kerneldoc entries for struct members that do not exist, addressing these warnings: ./include/net/sock.h:548: warning: Excess struct member '__sk_flags_offset' description in 'sock' ./include/net/sock.h:548: warning: Excess struct member 'sk_padding' description in 'sock' Signed-off-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Radu Pirea (NXP OSS)
|
a73d8779d6 |
net: macsec: introduce mdo_insert_tx_tag
Offloading MACsec in PHYs requires inserting the SecTAG and the ICV in the ethernet frame. This operation will increase the frame size with up to 32 bytes. If the frames are sent at line rate, the PHY will not have enough room to insert the SecTAG and the ICV. Some PHYs use a hardware buffer to store a number of ethernet frames and, if it fills up, a pause frame is sent to the MAC to control the flow. This HW implementation does not need any modification in the stack. Other PHYs might offer to use a specific ethertype with some padding bytes present in the ethernet frame. This ethertype and its associated bytes will be replaced by the SecTAG and ICV. mdo_insert_tx_tag allows the PHY drivers to add any specific tag in the skb. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Radu Pirea (NXP OSS)
|
eb97b9bd38 |
net: macsec: documentation for macsec_context and macsec_ops
Add description for fields of struct macsec_context and struct macsec_ops. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Radu Pirea (NXP OSS)
|
b1c036e835 |
net: macsec: move sci_to_cpu to macsec header
Move sci_to_cpu to the MACsec header to use it in drivers. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> |