For multi-link operation(MLO) TDLS management
frames need to be transmitted on a specific link.
The TDLS setup request will add BSSID along with
peer address and userspace will pass the link-id
based on BSSID value to the driver(or mac80211).
Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.cb3d87c22812.Ia3d15ac4a9a182145bf2d418bcb3ddf4539cd0a7@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Support new firmware that can validate the validate bits in
sniffer mode, and advertise that fact and the result of the
checks in the U-SIG radiotap field.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230613155501.c20480aa1171.Icc0d077dae01d662ccb948823e196aa9c5c87976@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
An AP part of an AP MLD might be temporarily disabled, and might be
enabled later. Such a link should be included in the association
exchange, but should not be used until enabled.
Extend the NL80211_CMD_ASSOCIATE to also indicate disabled links.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.c4c61ee4c4a5.I784ef4a0d619fc9120514b5615458fbef3b3684a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As a preparation to support disabled/dormant links, add the
following function:
- ieee80211_vif_usable_links(): returns the bitmap of the links
that can be activated. Use this function in all the places that
the bitmap of the usable links is needed.
- ieee80211_vif_is_mld(): returns true iff the vif is an MLD.
Use this function in all the places where an indication that the
connection is a MLD is needed.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.86e3351da1fc.If6fe3a339fda2019f13f57ff768ecffb711b710a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There are cases in which we don't want the user to override the
smps mode, e.g. when SMPS should be disabled due to EMLSR. Add
a driver flag to disable SMPS overriding and don't override if
it is set.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.ef129e80556c.I74a298fdc86b87074c95228d3916739de1400597@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's quite a bit of code accessing sband iftype data
(HE, HE 6 GHz, EHT) and we always need to remember to use
the ieee80211_vif_type_p2p() helper. Add new helpers to
directly get it from the sband/vif rather than having to
call ieee80211_vif_type_p2p().
Convert most code with the following spatch:
@@
expression vif, sband;
@@
-ieee80211_get_he_iftype_cap(sband, ieee80211_vif_type_p2p(vif))
+ieee80211_get_he_iftype_cap_vif(sband, vif)
@@
expression vif, sband;
@@
-ieee80211_get_eht_iftype_cap(sband, ieee80211_vif_type_p2p(vif))
+ieee80211_get_eht_iftype_cap_vif(sband, vif)
@@
expression vif, sband;
@@
-ieee80211_get_he_6ghz_capa(sband, ieee80211_vif_type_p2p(vif))
+ieee80211_get_he_6ghz_capa_vif(sband, vif)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230604120651.db099f49e764.Ie892966c49e22c7b7ee1073bc684f142debfdc84@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Increase the size of S1G rate_info flags to support S1G and add
flags for new S1G MCS and the supported bandwidths. Also, include
S1G rate information to netlink STA rate message. Lastly, add
rate calculation function for S1G MCS.
Signed-off-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com>
Link: https://lore.kernel.org/r/20230518000723.991912-1-gilad.itzkovitch@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The second pull request for v6.5. We have support for three new
Realtek chipsets, all from different generations. Shows how active
Realtek development is right now, even older generations are being
worked on.
Note: We merged wireless into wireless-next to avoid complex conflicts
between the trees.
Major changes:
rtl8xxxu
* RTL8192FU support
rtw89
* RTL8851BE support
rtw88
* RTL8723DS support
ath11k
* Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
Advertisement (EMA) support in AP mode
iwlwifi
* support for segmented PNVM images and power tables
* new vendor entries for PPAG (platform antenna gain) feature
cfg80211/mac80211
* more Multi-Link Operation (MLO) support such as hardware restart
* fixes for a potential work/mutex deadlock and with it beginnings of
the previously discussed locking simplifications
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmSDHDERHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZsMDgf/YWmT/b6W4pYRltHRPYksxKfxASGf6mvZ
UXo5vEehCF4eY2vo4MXTknVJOjrvfbqK5uvWnUfqIYBMgCThBAlG1dz+4Ziiwtw5
QrSKG2rTQxRlI8ecbfe1Kd1TXr9bSvBAtoNAmElgxcRqG1GbiSqUBvcUox63AYuB
yozONCvdAOsrPJTcCX+ThP+ABayAonEZf7YeISIRU6q3QMkbVksHAGAe3TnIAPJq
8l5fQQmHwC0CxActOT2fH8WcJ3dyl5OTqjziRfjrh9T8QUJRp9Yl7zJhpYiBT116
aCVg/NRxAFT+RkKWdQ6jRHMVOAtzL2fe6fuVW0sk7e1/s5uzyyBxJw==
=hq6p
-----END PGP SIGNATURE-----
Merge tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.5
The second pull request for v6.5. We have support for three new
Realtek chipsets, all from different generations. Shows how active
Realtek development is right now, even older generations are being
worked on.
Note: We merged wireless into wireless-next to avoid complex conflicts
between the trees.
Major changes:
rtl8xxxu
- RTL8192FU support
rtw89
- RTL8851BE support
rtw88
- RTL8723DS support
ath11k
- Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
Advertisement (EMA) support in AP mode
iwlwifi
- support for segmented PNVM images and power tables
- new vendor entries for PPAG (platform antenna gain) feature
cfg80211/mac80211
- more Multi-Link Operation (MLO) support such as hardware restart
- fixes for a potential work/mutex deadlock and with it beginnings of
the previously discussed locking simplifications
* tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (162 commits)
wifi: rtlwifi: remove misused flag from HAL data
wifi: rtlwifi: remove unused dualmac control leftovers
wifi: rtlwifi: remove unused timer and related code
wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown
wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled
wifi: brcmfmac: Detect corner error case earlier with log
wifi: rtw89: 8852c: update RF radio A/B parameters to R63
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (3 of 3)
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (2 of 3)
wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (1 of 3)
wifi: rtw89: process regulatory for 6 GHz power type
wifi: rtw89: regd: update regulatory map to R64-R40
wifi: rtw89: regd: judge 6 GHz according to chip and BIOS
wifi: rtw89: refine clearing supported bands to check 2/5 GHz first
wifi: rtw89: 8851b: configure CRASH_TRIGGER feature for 8851B
wifi: rtw89: set TX power without precondition during setting channel
wifi: rtw89: debug: txpwr table access only valid page according to chip
wifi: rtw89: 8851b: enable hw_scan support
wifi: cfg80211: move scan done work to wiphy work
wifi: cfg80211: move sched scan stop to wiphy work
...
====================
Link: https://lore.kernel.org/r/87bkhohkbg.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.
For UDP, a pending packet will not be emitted if the socket is closed
before it is flushed; with this change, it be flushed by ->splice_eof().
For TCP, it's not clear that MSG_MORE is actually effective.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kuniyuki Iwashima <kuniyu@amazon.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add an optional method, ->splice_eof(), to allow splice to indicate the
premature termination of a splice to struct file_operations and struct
proto_ops.
This is called if sendfile() or splice() encounters all of the following
conditions inside splice_direct_to_actor():
(1) the user did not set SPLICE_F_MORE (splice only), and
(2) an EOF condition occurred (->splice_read() returned 0), and
(3) we haven't read enough to fulfill the request (ie. len > 0 still), and
(4) we have already spliced at least one byte.
A further patch will modify the behaviour of SPLICE_F_MORE to always be
passed to the actor if either the user set it or we haven't yet read
sufficient data to fulfill the request.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: David Hildenbrand <david@redhat.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: linux-mm@kvack.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR.
Conflicts:
net/sched/sch_taprio.c
d636fc5dd6 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
dced11ef84 ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
net/ipv4/sysctl_net_ipv4.c
e209fee411 ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
ccce324dab ("tcp: make the first N SYN RTO backoffs linear")
https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a work abstraction at the cfg80211 level that will always
hold the wiphy_lock() for any work executed and therefore also
can be canceled safely (without waiting) while holding that.
This improves on what we do now as with the new wiphy works we
don't have to worry about locking while cancelling them safely.
Also, don't let such works run while the device is suspended,
since they'll likely need to interact with the device. Flush
them before suspend though.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
rtm_tca_policy is used from net/sched/sch_api.c and net/sched/cls_api.c,
thus should be declared in an include file.
This fixes the following sparse warning:
net/sched/sch_api.c:1434:25: warning: symbol 'rtm_tca_policy' was not declared. Should it be static?
Fixes: e331473fee ("net/sched: cls_api: add missing validation of netlink attributes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add READ_ONCE()/WRITE_ONCE() on accesses to sk->sk_rxhash.
This also prevents a (smart ?) compiler to remove the condition in:
if (sk->sk_rxhash != newval)
sk->sk_rxhash = newval;
We need the condition to avoid dirtying a shared cache line.
Fixes: fec5e652e5 ("rfs: Receive Flow Steering")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fixes to debugfs registration
- Fix use-after-free in hci_remove_ltk/hci_remove_irk
- Fixes to ISO channel support
- Fix missing checks for invalid L2CAP DCID
- Fix l2cap_disconnect_req deadlock
- Add lock to protect HCI_UNREGISTER
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmR+ftMZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKQMXD/9NcuqbGmEzJspVA8bZ8gXD
L7a68QnacdIoqH56QstLhGPQsYH6dv9fwhpNX6AN8/j8UG8DnDXQtHyfm4gZzfYA
h8GP7+ZQIEiHivIxiamrJnQ1Ii+KYEV3NGyS43YBuuPi9LcTFR0Km42xA0GqOnDU
Hz3/n5v342479TjJPNJkFPmcUGViRaLXtKhzcBzmSykUW+SVuIuD03yxuAJcojf5
rlPYA7yho7k8BAWkcYxWAP3v9fzQVa3nz8rQO2rG+poi4La2mmqRHykuSCXmzvBX
SbZwvzqgquqgQiFLpRIo/nwnVwPu3NYK6dQzlXPqiaxfM6qAtRttwQWNnOT+UxEu
VVGk6fD9iKjo9dttq+lTSY3LI/SXWAHYByIBzjx883hJYf1YvDAMSlMlzo029xL6
BHu3hMTDhosP8sG5wFdR2KzBmUd1W/ZcwOG0UP8PjshZgrOZ3uej9p3MrocKAys7
uGOBFmGzwOaQLXJQLbd4djE5l6zLOxSCV/0OLIWQw7VFQiHb66NzN6wenYEkDnxM
j2pFAlzp4RKHHCjU3dfaE90c0ede116e9nhjAlzmUOxggg6aCxCrCkMNOI8NlZ4v
oukYWq66RWYA/J4S80OLepITtBRPVn3JFxOXss5xESFfEnzL2nRZ5gm8jJJGULU4
x6tKTHaomO99FcH0ZFlZMw==
=jMWO
-----END PGP SIGNATURE-----
Merge tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fixes to debugfs registration
- Fix use-after-free in hci_remove_ltk/hci_remove_irk
- Fixes to ISO channel support
- Fix missing checks for invalid L2CAP DCID
- Fix l2cap_disconnect_req deadlock
- Add lock to protect HCI_UNREGISTER
* tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Add missing checks for invalid DCID
Bluetooth: ISO: use correct CIS order in Set CIG Parameters event
Bluetooth: ISO: don't try to remove CIG if there are bound CIS left
Bluetooth: Fix l2cap_disconnect_req deadlock
Bluetooth: hci_qca: fix debugfs registration
Bluetooth: fix debugfs registration
Bluetooth: hci_sync: add lock to protect HCI_UNREGISTER
Bluetooth: Fix use-after-free in hci_remove_ltk/hci_remove_irk
Bluetooth: ISO: Fix CIG auto-allocation to select configurable CIG
Bluetooth: ISO: consider right CIS when removing CIG at cleanup
====================
Link: https://lore.kernel.org/r/20230606003454.2392552-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156.
The Source Routing Header (SRH) has the following format:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len | Routing Type | Segments Left |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CmprI | CmprE | Pad | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
. .
. Addresses[1..n] .
. .
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The originator of an SRH places the first hop's IPv6 address in the IPv6
header's IPv6 Destination Address and the second hop's IPv6 address as
the first address in Addresses[1..n].
The CmprI and CmprE fields indicate the number of prefix octets that are
shared with the IPv6 Destination Address. When CmprI or CmprE is not 0,
Addresses[1..n] are compressed as follows:
1..n-1 : (16 - CmprI) bytes
n : (16 - CmprE) bytes
Segments Left indicates the number of route segments remaining. When the
value is not zero, the SRH is forwarded to the next hop. Its address
is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6
Destination Address.
When Segment Left is greater than or equal to 2, the size of SRH is not
changed because Addresses[1..n-1] are decompressed and recompressed with
CmprI.
OTOH, when Segment Left changes from 1 to 0, the new SRH could have a
different size because Addresses[1..n-1] are decompressed with CmprI and
recompressed with CmprE.
Let's say CmprI is 15 and CmprE is 0. When we receive SRH with Segment
Left >= 2, Addresses[1..n-1] have 1 byte for each, and Addresses[n] has
16 bytes. When Segment Left is 1, Addresses[1..n-1] is decompressed to
16 bytes and not recompressed. Finally, the new SRH will need more room
in the header, and the size is (16 - 1) * (n - 1) bytes.
Here the max value of n is 255 as Segment Left is u8, so in the worst case,
we have to allocate 3825 bytes in the skb headroom. However, now we only
allocate a small fixed buffer that is IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7
bytes). If the decompressed size overflows the room, skb_push() hits BUG()
below [0].
Instead of allocating the fixed buffer for every packet, let's allocate
enough headroom only when we receive SRH with Segment Left 1.
[0]:
skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo
kernel BUG at net/core/skbuff.c:200!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec0047 #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:skb_panic (net/core/skbuff.c:200)
Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffc90000003da0 EFLAGS: 00000246
RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540
RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff
R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800
R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190
FS: 00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
<IRQ>
skb_push (net/core/skbuff.c:210)
ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718)
ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5))
ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483)
__netif_receive_skb_one_core (net/core/dev.c:5494)
process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934)
__napi_poll (net/core/dev.c:6496)
net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696)
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572)
do_softirq (kernel/softirq.c:472 kernel/softirq.c:459)
</IRQ>
<TASK>
__local_bh_enable_ip (kernel/softirq.c:396)
__dev_queue_xmit (net/core/dev.c:4272)
ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134)
rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914)
sock_sendmsg (net/socket.c:724 net/socket.c:747)
__sys_sendto (net/socket.c:2144)
__x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7f453a138aea
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea
RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003
RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b
</TASK>
Modules linked in:
Fixes: 8610c7c6e3 ("net: ipv6: add support for rpl sr exthdr")
Reported-by: Max VA
Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-death
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230605180617.67284-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We need to teach the low level driver about the EML capability which
includes information for EMLSR / EMLMR operation.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230504134511.828474-11-gregory.greenman@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The GRO control block (NAPI_GRO_CB) is currently at its maximum size.
This commit reduces its size by putting two groups of fields that are
used only at different times into a union.
Specifically, the fields frag0 and frag0_len are the fields that make up
the frag0 optimisation mechanism, which is used during the initial
parsing of the SKB.
The fields last and age are used after the initial parsing, while the
SKB is stored in the GRO list, waiting for other packets to arrive.
There was one location in dev_gro_receive that modified the frag0 fields
after setting last and age. I changed this accordingly without altering
the code behaviour.
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230601161407.GA9253@debian
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The order of CIS handle array in Set CIG Parameters response shall match
the order of the CIS_ID array in the command (Core v5.3 Vol 4 Part E Sec
7.8.97). We send CIS_IDs mainly in the order of increasing CIS_ID (but
with "last" CIS first if it has fixed CIG_ID). In handling of the
reply, we currently assume this is also the same as the order of
hci_conn in hdev->conn_hash, but that is not true.
Match the correct hci_conn to the correct handle by matching them based
on the CIG+CIS combination. The CIG+CIS combination shall be unique for
ISO_LINK hci_conn at state >= BT_BOUND, which we maintain in
hci_le_set_cig_params.
Fixes: 26afbd826e ("Bluetooth: Add initial implementation of CIS connections")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Since commit ec6cef9cd9 ("Bluetooth: Fix SMP channel registration for
unconfigured controllers") the debugfs interface for unconfigured
controllers will be created when the controller is configured.
There is however currently nothing preventing a controller from being
configured multiple time (e.g. setting the device address using btmgmt)
which results in failed attempts to register the already registered
debugfs entries:
debugfs: File 'features' in directory 'hci0' already present!
debugfs: File 'manufacturer' in directory 'hci0' already present!
debugfs: File 'hci_version' in directory 'hci0' already present!
...
debugfs: File 'quirk_simultaneous_discovery' in directory 'hci0' already present!
Add a controller flag to avoid trying to register the debugfs interface
more than once.
Fixes: ec6cef9cd9 ("Bluetooth: Fix SMP channel registration for unconfigured controllers")
Cc: stable@vger.kernel.org # 4.0
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
When the HCI_UNREGISTER flag is set, no jobs should be scheduled. Fix
potential race when HCI_UNREGISTER is set after the flag is tested in
hci_cmd_sync_queue.
Fixes: 0b94f2651f ("Bluetooth: hci_sync: Fix queuing commands when HCI_UNREGISTER is set")
Signed-off-by: Zhengping Jiang <jiangzp@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Save a bit a space, and could help future sysctls to
use the same pattern.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
skip_notify_on_dev_down ctl table expects this field
to be an int (4 bytes), not a bool (1 byte).
Because proc_dou8vec_minmax() was added in 5.13,
this patch converts skip_notify_on_dev_down to an int.
Following patch then converts the field to u8 and use proc_dou8vec_minmax().
Fixes: 7c6bb7d2fa ("net/ipv6: Add knob to skip DELROUTE message on device down")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Callers of flowi4_update_output() never try to update ->flowi4_tos:
* ip_route_connect() updates ->flowi4_tos with its own current
value.
* ip_route_newports() has two users: tcp_v4_connect() and
dccp_v4_connect. Both initialise fl4 with ip_route_connect(), which
in turn sets ->flowi4_tos with RT_TOS(inet_sk(sk)->tos) and
->flowi4_scope based on SOCK_LOCALROUTE.
Then ip_route_newports() updates ->flowi4_tos with
RT_CONN_FLAGS(sk), which is the same as RT_TOS(inet_sk(sk)->tos),
unless SOCK_LOCALROUTE is set on the socket. In that case, the
lowest order bit is set to 1, to eventually inform
ip_route_output_key_hash() to restrict the scope to RT_SCOPE_LINK.
This is equivalent to properly setting ->flowi4_scope as
ip_route_connect() did.
* ip_vs_xmit.c initialises ->flowi4_tos with memset(0), then calls
flowi4_update_output() with tos=0.
* sctp_v4_get_dst() uses the same RT_CONN_FLAGS_TOS() when
initialising ->flowi4_tos and when calling flowi4_update_output().
In the end, ->flowi4_tos never changes. So let's just drop the tos
parameter. This will simplify the conversion of ->flowi4_tos from __u8
to dscp_t.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this commit, all the GIDs ("0 4294967294") can be written to the
"net.ipv4.ping_group_range" sysctl.
Note that 4294967295 (0xffffffff) is an invalid GID (see gid_valid() in
include/linux/uidgid.h), and an attempt to register this number will cause
-EINVAL.
Prior to this commit, only up to GID 2147483647 could be covered.
Documentation/networking/ip-sysctl.rst had "0 4294967295" as an example
value, but this example was wrong and causing -EINVAL.
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Co-developed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the offending fixes commit I mistakenly removed the reply message of
the port new command. I was under impression it is a new port
notification, partly due to the "notify" in the name of the helper
function. Bring the code sending reply with new port message back, this
time putting it directly to devlink_nl_cmd_port_new_doit()
Fixes: c496daeb86 ("devlink: remove duplicate port notification")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531142025.2605001-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the blamed commit, the member key is longer 4-byte aligned. On
platforms that do not support unaligned access, e.g., MIPS32R2 with
unaligned_action set to 1, this will trigger a crash when accessing
an IPv6 pneigh_entry, as the key is cast to an in6_addr pointer.
Change the type of the key to u32 to make it aligned.
Fixes: 62dd93181a ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.")
Signed-off-by: Qingfang DENG <qingfang.deng@siflower.com.cn>
Link: https://lore.kernel.org/r/20230601015432.159066-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In this patch, we mainly try to handle sending a compressed ack
correctly if it's deferred.
Here are more details in the old logic:
When sack compression is triggered in the tcp_compressed_ack_kick(),
if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED
and then defer to the release cb phrase. Later once user releases
the sock, tcp_delack_timer_handler() should send a ack as expected,
which, however, cannot happen due to lack of ICSK_ACK_TIMER flag.
Therefore, the receiver would not sent an ack until the sender's
retransmission timeout. It definitely increases unnecessary latency.
Fixes: 5d9f4262b7 ("tcp: add SACK compression")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: fuyuanli <fuyuanli@didiglobal.com>
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Offloading drivers may report some additional statistics counters, some
of them even suggested by 802.1Q, like TransmissionOverrun.
In my opinion we don't have to limit ourselves to reporting counters
only globally to the Qdisc/interface, especially if the device has more
detailed reporting (per traffic class), since the more detailed info is
valuable for debugging and can help identifying who is exceeding its
time slot.
But on the other hand, some devices may not be able to report both per
TC and global stats.
So we end up reporting both ways, and use the good old ethtool_put_stat()
strategy to determine which statistics are supported by this NIC.
Statistics which aren't set are simply not reported to netlink. For this
reason, we need something dynamic (a nlattr nest) to be reported through
TCA_STATS_APP, and not something daft like the fixed-size and
inextensible struct tc_codel_xstats. A good model for xstats which are a
nlattr nest rather than a fixed struct seems to be cake.
# Global stats
$ tc -s qdisc show dev eth0 root
# Per-tc stats
$ tc -s class show dev eth0
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inspired from struct flow_cls_offload :: cmd, in order for taprio to be
able to report statistics (which is future work), it seems that we need
to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO)
multiplexing, and pass the command as part of the common portion of the
muxed structure.
Since we already have an "enable" variable in tc_taprio_qopt_offload,
refactor all drivers to check for "cmd" instead of "enable", and reject
every other command except "replace" and "destroy" - to be future proof.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> # for lan966x
Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_gro_complete() function only updates the skb fields related to GRO
and it always returns zero. All the 3 drivers which are using it
do not check for the return value either.
Change it to return void instead which simplifies its callers as
error handing becomes unnecessary.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the 'FLOW_DISSECTOR_KEY_META' key with a new 'l2_miss' field and
populate it from a field with the same name in the tc skb extension.
This field is set by the bridge driver for packets that incur an FDB or
MDB miss.
The next patch will extend the flower classifier to be able to match on
layer 2 misses.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move port_fn_hw_addr_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In devlink, some of the objects have separate ops registered alongside
with the object itself. Port however have ops in devlink_ops structure.
For drivers what register multiple kinds of ports with different ops
this is not convenient. Introduce devlink_port_ops and a set
of functions that allow drivers to pass ops pointer during
port registration.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the
frequent and parallel code path of all queues. So, r/w into this
single shared variable by many threads on different CPUs creates a
lot caching and memory overhead, hence perf regression. And, it's
not accurate due to the high volume concurrent r/w.
For example, a workload is iperf with 128 threads, and with RPS
enabled. We saw perf regression of 25% with the previous patch
adding the counters. And this patch eliminates the regression.
Since the error path of mana_poll_rx_cq() already has warnings, so
keeping the counter and convert it to a per-queue variable is not
necessary. So, just remove this counter from this high frequency
code path.
Also, remove the tx_cqes counter for the same reason. We have
warnings & other counters for errors on that path, and don't need
to count every normal cqe processing.
Cc: stable@vger.kernel.org
Fixes: bd7fc6e195 ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1685115537-31675-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZHEm+wAKCRDbK58LschI
gyIKAQCqO7B4sIu8hYVxBTwfHV2tIuXSMSCV4P9e78NUOPcO2QEAvLP/WVSjB0Bm
vpyTKKM22SpZvPe/jSp52j6t20N+qAc=
=HFxD
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-05-26
We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 76 files changed, 2729 insertions(+), 1003 deletions(-).
The main changes are:
1) Add the capability to destroy sockets in BPF through a new kfunc,
from Aditi Ghag.
2) Support O_PATH fds in BPF_OBJ_PIN and BPF_OBJ_GET commands,
from Andrii Nakryiko.
3) Add capability for libbpf to resize datasec maps when backed via mmap,
from JP Kobryn.
4) Move all the test kfuncs for CI out of the kernel and into bpf_testmod,
from Jiri Olsa.
5) Big batch of xsk selftest improvements to prep for multi-buffer testing,
from Magnus Karlsson.
6) Show the target_{obj,btf}_id in tracing link's fdinfo and dump it
via bpftool, from Yafang Shao.
7) Various misc BPF selftest improvements to work with upcoming LLVM 17,
from Yonghong Song.
8) Extend bpftool to specify netdevice for resolving XDP hints,
from Larysa Zaremba.
9) Document masking in shift operations for the insn set document,
from Dave Thaler.
10) Extend BPF selftests to check xdp_feature support for bond driver,
from Lorenzo Bianconi.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
bpf: Fix bad unlock balance on freeze_mutex
libbpf: Ensure FD >= 3 during bpf_map__reuse_fd()
libbpf: Ensure libbpf always opens files with O_CLOEXEC
selftests/bpf: Check whether to run selftest
libbpf: Change var type in datasec resize func
bpf: drop unnecessary bpf_capable() check in BPF_MAP_FREEZE command
libbpf: Selftests for resizing datasec maps
libbpf: Add capability for resizing datasec maps
selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests
libbpf: Add opts-based bpf_obj_pin() API and add support for path_fd
bpf: Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
libbpf: Start v1.3 development cycle
bpf: Validate BPF object in BPF_OBJ_PIN before calling LSM
bpftool: Specify XDP Hints ifname when loading program
selftests/bpf: Add xdp_feature selftest for bond device
selftests/bpf: Test bpf_sock_destroy
selftests/bpf: Add helper to get port using getsockname
bpf: Add bpf_sock_destroy kfunc
bpf: Add kfunc filter function to 'struct btf_kfunc_id_set'
bpf: udp: Implement batching for sockets iterator
...
====================
Link: https://lore.kernel.org/r/20230526222747.17775-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently TCP_SYNQ_INTERVAL is defined but never used.
According to "git log -S TCP_SYNQ_INTERVAL net-next/main" it seems
the last references to TCP_SYNQ_INTERVAL were removed by 2015
commit fa76ce7328 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>