Commit Graph

73770 Commits

Author SHA1 Message Date
David S. Miller
8340eef98d Merge tag 'ieee802154-for-net-2023-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan
Stefan Schmidt says:

====================
An update from ieee802154 for your *net* tree:

Two small fixes and MAINTAINERS update this time.

Azeem Shaikh ensured consistent use of strscpy through the tree and fixed
the usage in our trace.h.

Chen Aotian fixed a potential memory leak in the hwsim simulator for
ieee802154.

Miquel Raynal updated the MAINATINERS file with the new team git tree
locations and patchwork URLs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-20 09:32:33 +01:00
Benjamin Coddington
d9615d166c NFS: add sysfs shutdown knob
Within each nfs_server sysfs tree, add an entry named "shutdown".  Writing
1 to this file will set the cl_shutdown bit on the rpc_clnt structs
associated with that mount.  If cl_shutdown is set, the task scheduler
immediately returns -EIO for new tasks.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 15:08:12 -04:00
Benjamin Coddington
e13b549319 NFS: Add sysfs links to sunrpc clients for nfs_clients
For the general and state management nfs_client under each mount, create
symlinks to their respective rpc_client sysfs entries.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 15:04:13 -04:00
Kuniyuki Iwashima
6db5dd2bf4 ipv6: exthdrs: Remove redundant skb_headlen() check in ip6_parse_tlv().
ipv6_destopt_rcv() and ipv6_parse_hopopts() pulls these data

  - Hop-by-Hop/Destination Options Header : 8
  - Hdr Ext Len                           : skb_transport_header(skb)[1] << 3

and calls ip6_parse_tlv(), so it need not check if skb_headlen() is less
than skb_transport_offset(skb) + (skb_transport_header(skb)[1] << 3).

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-19 11:32:58 -07:00
Kuniyuki Iwashima
b83d50f431 ipv6: exthdrs: Reload hdr only when needed in ipv6_srh_rcv().
We need not reload hdr in ipv6_srh_rcv() unless we call
pskb_expand_head().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-19 11:32:58 -07:00
Kuniyuki Iwashima
0d2e27b858 ipv6: exthdrs: Replace pskb_pull() with skb_pull() in ipv6_srh_rcv().
ipv6_rthdr_rcv() pulls these data

  - Segment Routing Header : 8
  - Hdr Ext Len            : skb_transport_header(skb)[1] << 3

needed by ipv6_srh_rcv(), so pskb_pull() in ipv6_srh_rcv() never
fails and can be replaced with skb_pull().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-19 11:32:58 -07:00
Kuniyuki Iwashima
6facbca52d ipv6: rpl: Remove redundant multicast tests in ipv6_rpl_srh_rcv().
ipv6_rpl_srh_rcv() checks if ipv6_hdr(skb)->daddr or ohdr->rpl_segaddr[i]
is the multicast address with ipv6_addr_type().

We have the same check for ipv6_hdr(skb)->daddr in ipv6_rthdr_rcv(), so we
need not recheck it in ipv6_rpl_srh_rcv().

Also, we should use ipv6_addr_is_multicast() for ohdr->rpl_segaddr[i]
instead of ipv6_addr_type().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-19 11:32:58 -07:00
Kuniyuki Iwashima
ac9d8a66e4 ipv6: rpl: Remove pskb(_may)?_pull() in ipv6_rpl_srh_rcv().
As Eric Dumazet pointed out [0], ipv6_rthdr_rcv() pulls these data

  - Segment Routing Header : 8
  - Hdr Ext Len            : skb_transport_header(skb)[1] << 3

needed by ipv6_rpl_srh_rcv().  We can remove pskb_may_pull() and
replace pskb_pull() with skb_pull() in ipv6_rpl_srh_rcv().

Link: https://lore.kernel.org/netdev/CANn89iLboLwLrHXeHJucAqBkEL_S0rJFog68t7wwwXO-aNf5Mg@mail.gmail.com/ [0]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-19 11:32:58 -07:00
Chuck Lever
75eb6af7ac SUNRPC: Add a TCP-with-TLS RPC transport class
Use the new TLS handshake API to enable the SunRPC client code
to request a TLS handshake. This implements support for RFC 9289,
only on TCP sockets.

Upper layers such as NFS use RPC-with-TLS to protect in-transit
traffic.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:28:10 -04:00
Chuck Lever
dea034b963 SUNRPC: Capture CMSG metadata on client-side receive
kTLS sockets use CMSG to report decryption errors and the need
for session re-keying.

For RPC-with-TLS, an "application data" message contains a ULP
payload, and that is passed along to the RPC client. An "alert"
message triggers connection reset. Everything else is discarded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:26:54 -04:00
Chuck Lever
0d3ca07ffd SUNRPC: Ignore data_ready callbacks during TLS handshakes
The RPC header parser doesn't recognize TLS handshake traffic, so it
will close the connection prematurely with an error. To avoid that,
shunt the transport's data_ready callback when there is a TLS
handshake in progress.

The XPRT_SOCK_IGNORE_RECV flag will be toggled by code added in a
subsequent patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:19:16 -04:00
Chuck Lever
120726526e SUNRPC: Add RPC client support for the RPC_AUTH_TLS auth flavor
The new authentication flavor is used only to discover peer support
for RPC-over-TLS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:18:36 -04:00
Chuck Lever
97d1c83c3f SUNRPC: Trace the rpc_create_args
Pass the upper layer's rpc_create_args to the rpc_clnt_new()
tracepoint so additional parts of the upper layer's request can be
recorded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:17:56 -04:00
Chuck Lever
5000531912 SUNRPC: Plumb an API for setting transport layer security
Add an initial set of policies along with fields for upper layers to
pass the requested policy down to the transport layer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:16:59 -04:00
NeilBrown
626590ea4c SUNRPC: attempt to reach rpcbind with an abstract socket name
NFS is primarily name-spaced using network namespaces.  However it
contacts rpcbind (and gss_proxy) using AF_UNIX sockets which are
name-spaced using the mount namespaces.  This requires a container using
NFSv3 (the form that requires rpcbind) to manage both network and mount
namespaces, which can seem an unnecessary burden.

As NFS is primarily a network service it makes sense to use network
namespaces as much as possible, and to prefer to communicate with an
rpcbind running in the same network namespace.  This can be done, while
preserving the benefits of AF_UNIX sockets, by using an abstract socket
address.

An abstract address has a nul at the start of sun_path, and a length
that is exactly the complete size of the sockaddr_un up to the end of
the name, NOT including any trailing nul (which is not part of the
address).
Abstract addresses are local to a network namespace - regular AF_UNIX
path names a resolved in the mount namespace ignoring the network
namespace.

This patch causes rpcb to first try an abstract address before
continuing with regular AF_UNIX and then IP addresses.  This ensures
backwards compatibility.

Choosing the name needs some care as the same address will be configured
for rpcbind, and needs to be built in to libtirpc for this enhancement
to be fully successful.  There is no formal standard for choosing
abstract addresses.  The defacto standard appears to be to use a path
name similar to what would be used for a filesystem AF_UNIX address -
but with a leading nul.

In that case
   "\0/var/run/rpcbind.sock"
seems like the best choice.  However at this time /var/run is deprecated
in favour of /run, so
   "\0/run/rpcbind.sock"
might be better.
Though as we are deliberately moving away from using the filesystem it
might seem more sensible to explicitly break the connection and just
have
   "\0rpcbind.socket"
using the same name as the systemd unit file..

This patch chooses the second option, which seems least likely to raise
objections.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:12:22 -04:00
NeilBrown
4388ce05fa SUNRPC: support abstract unix socket addresses
An "abtract" address for an AF_UNIX socket start with a nul and can
contain any bytes for the given length, but traditionally doesn't
contain other nuls.  When reported, the leading nul is replaced by '@'.

sunrpc currently rejects connections to these addresses and reports them
as an empty string.  To provide support for future use of these
addresses, allow them for outgoing connections and report them more
usefully.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-06-19 12:12:22 -04:00
Ben Greear
5a0702aac0 wifi: mac80211: add eht_capa debugfs field
Output looks like this:

[root@ct523c-0b29 ~]# cat /debug/ieee80211/wiphy6/netdev\:wlan6/stations/50\:28\:4a\:bd\:f4\:a7/eht_capa
EHT supported
MAC-CAP: 0x82 0x00
PHY-CAP: 0x0c 0x00 0x00 0x00 0x00 0x48 0x00 0x00 0x00
		OM-CONTROL
		MAX-MPDU-LEN: 11454
		242-TONE-RU-GT20MHZ
		NDP-4-EHT-LFT-32-GI
		BEAMFORMEE-80-NSS: 0
		BEAMFORMEE-160-NSS: 0
		BEAMFORMEE-320-NSS: 0
		SOUNDING-DIM-80-NSS: 0
		SOUNDING-DIM-160-NSS: 0
		SOUNDING-DIM-320-NSS: 0
		MAX_NC: 0
		PPE_THRESHOLD_PRESENT
		NOMINAL_PKT_PAD: 0us
		MAX-NUM-SUPP-EHT-LTF: 1
		SUPP-EXTRA-EHT-LTF
		MCS15-SUPP-MASK: 0

		EHT bw <= 80 MHz, max NSS for MCS 8-9: Rx=2, Tx=2
		EHT bw <= 80 MHz, max NSS for MCS 10-11: Rx=2, Tx=2
		EHT bw <= 80 MHz, max NSS for MCS 12-13: Rx=2, Tx=2
		EHT bw <= 160 MHz, max NSS for MCS 8-9: Rx=0, Tx=0
		EHT bw <= 160 MHz, max NSS for MCS 10-11: Rx=0, Tx=0
		EHT bw <= 160 MHz, max NSS for MCS 12-13: Rx=0, Tx=0
		EHT bw <= 320 MHz, max NSS for MCS 8-9: Rx=0, Tx=0
		EHT bw <= 320 MHz, max NSS for MCS 10-11: Rx=0, Tx=0
		EHT bw <= 320 MHz, max NSS for MCS 12-13: Rx=0, Tx=0
EHT PPE Thresholds: 0xc1 0x0e 0xe0 0x00 0x00

Signed-off-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20230517184428.999384-1-greearb@candelatech.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 17:34:55 +02:00
Terin Stock
d7fce52fdf ipvs: align inner_mac_header for encapsulation
When using encapsulation the original packet's headers are copied to the
inner headers. This preserves the space for an inner mac header, which
is not used by the inner payloads for the encapsulation types supported
by IPVS. If a packet is using GUE or GRE encapsulation and needs to be
segmented, flow can be passed to __skb_udp_tunnel_segment() which
calculates a negative tunnel header length. A negative tunnel header
length causes pskb_may_pull() to fail, dropping the packet.

This can be observed by attaching probes to ip_vs_in_hook(),
__dev_queue_xmit(), and __skb_udp_tunnel_segment():

    perf probe --add '__dev_queue_xmit skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'
    perf probe --add '__skb_udp_tunnel_segment:7 tnl_hlen'
    perf probe -m ip_vs --add 'ip_vs_in_hook skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'

These probes the headers and tunnel header length for packets which
traverse the IPVS encapsulation path. A TCP packet can be forced into
the segmentation path by being smaller than a calculated clamped MSS,
but larger than the advertised MSS.

    probe:ip_vs_in_hook: inner_mac_header=0x0 inner_network_header=0x0 mac_header=0x44 network_header=0x52
    probe:ip_vs_in_hook: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:dev_queue_xmit: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:__skb_udp_tunnel_segment_L7: tnl_hlen=-2

When using veth-based encapsulation, the interfaces are set to be
mac-less, which does not preserve space for an inner mac header. This
prevents this issue from occurring.

In our real-world testing of sending a 32KB file we observed operation
time increasing from ~75ms for veth-based encapsulation to over 1.5s
using IPVS encapsulation due to retries from dropped packets.

This changeset modifies the packet on the encapsulation path in
ip_vs_tunnel_xmit() and ip_vs_tunnel_xmit_v6() to remove the inner mac
header offset. This fixes UDP segmentation for both encapsulation types,
and corrects the inner headers for any IPIP flows that may use it.

Fixes: 84c0d5e96f ("ipvs: allow tunneling with gue encapsulation")
Signed-off-by: Terin Stock <terin@cloudflare.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-19 16:01:07 +02:00
Andrii Nakryiko
6c3eba1c5e bpf: Centralize permissions checks for all BPF map types
This allows to do more centralized decisions later on, and generally
makes it very explicit which maps are privileged and which are not
(e.g., LRU_HASH and LRU_PERCPU_HASH, which are privileged HASH variants,
as opposed to unprivileged HASH and HASH_PERCPU; now this is explicit
and easy to verify).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230613223533.3689589-4-andrii@kernel.org
2023-06-19 14:04:04 +02:00
Johannes Berg
cf0b045ebf wifi: mac80211: check EHT basic MCS/NSS set
Check that all the NSS in the EHT basic MCS/NSS set
are actually supported, otherwise disable EHT for the
connection.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.737827c906c9.I0c11a3cd46ab4dcb774c11a5bbc30aecfb6fce11@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:44 +02:00
Benjamin Berg
5461707a52 wifi: cfg80211: search all RNR elements for colocated APs
An AP reporting colocated APs may send more than one reduced neighbor
report element. As such, iterate all elements instead of only parsing
the first one when looking for colocated APs.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.ffe2c014f478.I372a4f96c88f7ea28ac39e94e0abfc465b5330d4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:44 +02:00
Benjamin Berg
8dcc91c446 wifi: cfg80211: stop parsing after allocation failure
The error handling code would break out of the loop incorrectly,
causing the rest of the message to be misinterpreted. Fix this by
also jumping out of the surrounding while loop, which will trigger
the error detection code.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.0ffac98475cf.I6f5c08a09f5c9fced01497b95a9841ffd1b039f8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:43 +02:00
Johannes Berg
c870d66f1b wifi: update multi-link element STA reconfig
Update the MLE STA reconfig sub-type to 802.11be D3.0
format, which includes the operation update field.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.2e1383b31f07.I8055a111c8fcf22e833e60f5587a4d8d21caca5b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:43 +02:00
Johannes Berg
92bf4dd358 wifi: mac80211: agg-tx: prevent start/stop race
There were crashes reported in this code, and the timer_shutdown()
warning in one of the previous patches indicates that the timeout
timer for the AP response (addba_resp_timer) is still armed while
we're stopping the aggregation session.

After a very long deliberation of the code, so far the only way I
could find that might cause this would be the following sequence:
 - session start requested
 - session start indicated to driver, but driver returns
   IEEE80211_AMPDU_TX_START_DELAY_ADDBA
 - session stop requested, sets HT_AGG_STATE_WANT_STOP
 - session stop worker runs ___ieee80211_stop_tx_ba_session(),
   sets HT_AGG_STATE_STOPPING

From here on, the order doesn't matter exactly, but:

 1. driver calls ieee80211_start_tx_ba_cb_irqsafe(),
    setting HT_AGG_STATE_START_CB
 2. driver calls ieee80211_stop_tx_ba_cb_irqsafe(),
    setting HT_AGG_STATE_STOP_CB
 3. the worker will run ieee80211_start_tx_ba_cb() for
    HT_AGG_STATE_START_CB
 4. the worker will run ieee80211_stop_tx_ba_cb() for
    HT_AGG_STATE_STOP_CB

(the order could also be 1./3./2./4.)

This will cause ieee80211_start_tx_ba_cb() to send out the AddBA
request frame to the AP and arm the timer, but we're already in
the middle of stopping and so the ieee80211_stop_tx_ba_cb() will
no longer assume it needs to stop anything.

Prevent this by checking for WANT_STOP/STOPPING in the start CB,
and warn if we're sending a frame on a stopping session.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.e5b52777462a.I0b2ed6658e81804279f5d7c9c1918cb1f6626bf2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:43 +02:00
Johannes Berg
6f2db6588b wifi: mac80211: agg-tx: add a few locking assertions
This is all true today, but difficult to understand since
the callers are in other files etc. Add two new lockdep
assertions to make things easier to read.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.7f03dec6a90b.I762c11e95da005b80fa0184cb1173b99ec362acf@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:43 +02:00
Ilan Peer
8eb8dd2ffb wifi: mac80211: Support link removal using Reconfiguration ML element
Add support for handling link removal indicated by the
Reconfiguration Multi-Link element.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.d8a046dc0c1a.I4dcf794da2a2d9f4e5f63a4b32158075d27c0660@changeid
[use cfg80211_links_removed() API instead]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:12:43 +02:00
Benjamin Berg
79973d5cfd wifi: mac80211: add set_active_links variant not locking sdata
There are cases where keeping sdata locked for an operation. Add a
variant that does not take sdata lock to permit these usecases.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:11:35 +02:00
Benjamin Berg
ff32b4506f wifi: mac80211: add ___ieee80211_disconnect variant not locking sdata
There are cases where keeping sdata locked for an operation. Add a
variant that does not take sdata lock to permit these usecases.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 13:11:25 +02:00
Veerendranath Jakkam
065563b20a wifi: cfg80211/nl80211: Add support to indicate STA MLD setup links removal
STA MLD setup links may get removed if AP MLD remove the corresponding
affiliated APs with Multi-Link reconfiguration as described in
P802.11be_D3.0, section 35.3.6.2.2 Removing affiliated APs. Currently,
there is no support to notify such operation to cfg80211 and userspace.

Add support for the drivers to indicate STA MLD setup links removal to
cfg80211 and notify the same to userspace. Upon receiving such
indication from the driver, clear the MLO links information of the
removed links in the WDEV.

Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/20230317142153.237900-1-quic_vjakkam@quicinc.com
[rename function and attribute, fix kernel-doc]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:08:40 +02:00
Benjamin Berg
a0ed50112b wifi: cfg80211: do not scan disabled links on 6GHz
If a link is disabled on 6GHz, we should not send a probe request on the
channel to resolve it. Simply skip such RNR entries so that the link is
ignored.

Userspace can still see the link in the RNR and may generate an ML probe
request in order to associate to the (currently) disabled link.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.4f7384006471.Iff8f1081e76a298bd25f9468abb3a586372cddaa@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:08:40 +02:00
Benjamin Berg
2481b5da9c wifi: cfg80211: handle BSS data contained in ML probe responses
The basic multi-link element within an multi-link probe response will
contain full information about BSSes that are part of an MLD AP. This
BSS information may be used to associate with a link of an MLD AP
without having received a beacon from the BSS itself.

This patch adds parsing of the data and adding/updating the BSS using
the received elements. Doing this means that userspace can discover the
BSSes using an ML probe request and request association on these links.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.29593bd0ae1f.Ic9a67b8f022360aa202b870a932897a389171b14@changeid
[swap loop conditions smatch complained about]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:08:28 +02:00
Benjamin Berg
dc92e54c30 wifi: cfg80211: use structs for TBTT information access
Make the data access a bit nicer overall by using structs. There is a
small change here to also accept a TBTT information length of eight
bytes as we do not require the 20 MHz PSD information.

This also fixes a bug reading the short SSID on big endian machines.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.4c3f8901c1bc.Ic3e94fd6e1bccff7948a252ad3bb87e322690a17@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:30 +02:00
Benjamin Berg
eb142608e2 wifi: cfg80211: use a struct for inform_single_bss data
The argument is getting quite large, so use a struct internally to pass
around the information.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214436.831ab8a87b6f.I3bcc83d90f41d6f8a47b39528575dad0a9ec3564@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:30 +02:00
Benjamin Berg
891d4d5831 wifi: cfg80211: Always ignore ML element
The element should never be inherited, so always exclude it.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214435.c0e17989b4ed.I7cecb5ab7cd6919e61839b50ce5156904b41d7d8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:30 +02:00
Benjamin Berg
eeec7574ec wifi: ieee80211: add helper to validate ML element type and size
The helper functions to retrieve the EML capabilities and medium
synchronization delay both assume that the type is correct. Instead of
assuming the length is correct and still checking the type, add a new
helper to check both and don't do any verification.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214435.1b50e7a3b3cf.I9385514d8eb6d6d3c82479a6fa732ef65313e554@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:30 +02:00
Ilan Peer
dbd3966368 wifi: mac80211: Include Multi-Link in CRC calculation
Include the Multi-Link elements found in beacon frames
in the CRC calculation, as these elements are intended
to reflect changes in the AP MLD state.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230618214435.ae8246b93d85.Ia64b45198de90ff7f70abcc997841157f148ea40@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:30 +02:00
Johannes Berg
e8c2af660b wifi: cfg80211: fix regulatory disconnect with OCB/NAN
Since regulatory disconnect was added, OCB and NAN interface
types were added, which made it completely unusable for any
driver that allowed OCB/NAN. Add OCB/NAN (though NAN doesn't
do anything, we don't have any info) and also remove all the
logic that opts out, so it won't be broken again if/when new
interface types are added.

Fixes: 6e0bd6c35b ("cfg80211: 802.11p OCB mode handling")
Fixes: cb3b7d8765 ("cfg80211: add start / stop NAN commands")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20230616222844.2794d1625a26.I8e78a3789a29e6149447b3139df724a6f1b46fc3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Johannes Berg
b22552fcaf wifi: cfg80211: fix regulatory disconnect for non-MLO
The multi-link loop here broke disconnect when multi-link
operation (MLO) isn't active for a given interface, since
in that case valid_links is 0 (indicating no links, i.e.
no MLO.)

Fix this by taking that into account properly and skipping
the link only if there are valid_links in the first place.

Cc: stable@vger.kernel.org
Fixes: 7b0a0e3c3a ("wifi: cfg80211: do some rework towards MLO link APIs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20230616222844.eb073d650c75.I72739923ef80919889ea9b50de9e4ba4baa836ae@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Ilan Peer
e2efec97c3 wifi: mac80211: Rename ieee80211_mle_sta_prof_size_ok()
Rename it to ieee80211_mle_basic_sta_prof_size_ok() as it
validates the size of the station profile included in
Basic Multi-Link element.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.9bdfd263974f.I7bebd26894f33716e93cc7da576ef3215e0ba727@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Ilan Peer
cf36cdef10 wifi: mac80211: Add support for parsing Reconfiguration Multi Link element
Parse Reconfiguration Multi Link IE.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.6eeb6c9a4a6e.I1cb137da9b3c712fc7c7949a6dec9e314b5d7f63@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Ilan Peer
a286de1aa3 wifi: mac80211: Rename multi_link
As a preparation to support Reconfiguration Multi Link
element, rename 'multi_link' and 'multi_link_len' fields
in 'struct ieee802_11_elems' to 'ml_basic' and 'ml_basic_len'.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.b11370d3066a.I34280ae3728597056a6a2f313063962206c0d581@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Benjamin Berg
a76236de58 wifi: mac80211: use cfg80211 defragmentation helper
Use the shared functionality rather than copying it into mac80211.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.7dcbf82baade.Ic68d1f547cb75d66037abdbb0f066db20ff41ba3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Benjamin Berg
f837a653a0 wifi: cfg80211: add element defragmentation helper
This is already needed within mac80211 and support is also needed by
cfg80211 to parse ML elements.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.29c3ebeed10d.I009c049289dd0162c2e858ed8b68d2875a672ed6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Benjamin Berg
39432f8a37 wifi: cfg80211: drop incorrect nontransmitted BSS update code
The removed code ran for any BSS that was not included in the MBSSID
element in order to update it. However, instead of using the correct
inheritance rules, it would simply copy the elements from the
transmitting AP. The result is that we would report incorrect elements
in this case.

After some discussions, it seems that there are likely not even APs
actually using this feature. Either way, removing the code decreases
complexity and makes the cfg80211 behaviour more correct.

Fixes: 0b8fb8235b ("cfg80211: Parsing of Multiple BSSID information in scanning")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.cfd6d8db1f26.Ia1044902b86cd7d366400a4bfb93691b8f05d68c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Benjamin Berg
dfd9aa3e7a wifi: cfg80211: rewrite merging of inherited elements
The cfg80211_gen_new_ie function merges the IEs using inheritance rules.
Rewrite this function to fix issues around inheritance rules. In
particular, vendor elements do not require any special handling, as they
are either all inherited or overridden by the subprofile.
Also, add fragmentation handling as this may be needed in some cases.

This also changes the function to not require making a copy. The new
version could be optimized a bit by explicitly tracking which IEs have
been handled already rather than looking that up again every time.

Note that a small behavioural change is the removal of the SSID special
handling. This should be fine for the MBSSID element, as the SSID must
be included in the subelement.

Fixes: 0b8fb8235b ("cfg80211: Parsing of Multiple BSSID information in scanning")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.bc6152e146db.I2b5f3bc45085e1901e5b5192a674436adaf94748@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:29 +02:00
Benjamin Berg
03e7e493f1 wifi: cfg80211: ignore invalid TBTT info field types
The TBTT information field type must be zero. This is only changed in
the 802.11be draft specification where the value 1 is used to indicate
that only the MLD parameters are included.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.7865606ffe94.I7ff28afb875d1b4c39acd497df8490a7d3628e3f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Benjamin Berg
108d202298 wifi: mac80211: use new inform_bss callback
Doing this simplifies the code somewhat, as iteration over the
nontransmitted BSSs is not required anymore. Also, mac80211 should
not be iterating over the nontrans_list as it should only be accessed
while the bss_lock is held.

It also simplifies parsing of the IEs somewhat, as cfg80211 already
extracts the IEs and passes them to the callback.

Note that the only user left requiring parsing a specific BSS is the
association code if a beacon is required by the hardware.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.39ebfe2f9e59.Ia012b08e0feed8ec431b666888b459f6366f7bd1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Benjamin Berg
5db25290b7 wifi: cfg80211: add inform_bss op to update BSS
This new function is called from within the inform_bss(_frame)_data
functions in order for the driver to update data that it is tracking.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094949.8d7781b0f965.I80041183072b75c081996a1a5a230b34aff5c668@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Benjamin Berg
6b7c93c143 wifi: cfg80211: keep bss_lock held when informing
It is reasonable to hold bss_lock for a little bit longer after
cfg80211_bss_update is done. Right now, this does not make any big
difference, but doing so in preparation for the next patch which adds
a call to the driver.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.61701884ff0d.I3358228209eb6766202aff04d1bae0b8fdff611f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Benjamin Berg
c2edd30132 wifi: cfg80211: move regulatory_hint_found_beacon to be earlier
These calls do not require any locking, so move them in preparation for
the next patches.

A minor change/bugfix is to not hint a beacon for nontransmitted BSSes

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.a5bf3558eae9.I33c7465d983c8bef19deb7a533ee475a16f91774@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Emmanuel Grumbach
40e38c8dfc wifi: mac80211: feed the link_id to cfg80211_ch_switch_started_notify
For now, fix this only in station mode. We'll need to fix
the AP mode later.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.41e662ba1d68.I8faae5acb45c58cfeeb6bc6247aedbdaf9249d32@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Anjaneyulu
05050a2bc0 wifi: mac80211: add consistency check for compat chandef
Add NULL check for compat variable to avoid crash in
cfg80211_chandef_compatible() if it got called with
some mixed up channel context where not all the users
compatible with each other, which shouldn't happen.

Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.ae0f10dfd36b.Iea98c74aeb87bf6ef49f6d0c8687bba0dbea2abd@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Benjamin Berg
276311d581 wifi: mac80211: stop passing cbss to parser
In both of these cases (config_link, prep_channel) it is not needed
to parse the MBSSID data for a nontransmitted BSS. In the config_link
case the frame does not contain any MBSSID element and inheritance
rules are only needed for the ML STA profile. While in the
prep_channel case the IEs have already been processed by cfg80211 and
are already exploded.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.66d2605ff0ad.I7cdd1d390e7b0735c46204231a9e636d45b7f1e4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Mukesh Sisodiya
05995d05aa wifi: mac80211: Extend AID element addition for TDLS frames
Extend AID element addition in TDLS setup request and response
frames to add it when HE or EHT capabilities are supported.

Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.483bf44ce684.Ia2387eb24c06fa41febc213923160bedafce2085@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Abhishek Naik
71b3b7ac3e wifi: mac80211: Add HE and EHT capa elements in TDLS frames
Add HE and EHT capabilities IE in TDLS setup request,
response, confirm and discovery response frames.

Signed-off-by: Abhishek Naik <abhishek.naik@intel.com>
Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.c77128828b0d.Ied2d8800847c759718c2c35e8f6c0902afd6bca1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:28 +02:00
Abhishek Naik
8cc07265b6 wifi: mac80211: handle TDLS data frames with MLO
If the device is associated with an AP MLD, then TDLS data frames
should have
 - A1 = peer address,
 - A2 = own MLD address (since the peer may now know about MLO), and
 - A3 = BSSID.

Change the code to do that.

Signed-off-by: Abhishek Naik <abhishek.naik@intel.com>
Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.4bf648b63dfd.I98ef1dabd14b74a92120750f7746a7a512011701@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:27 +02:00
Mukesh Sisodiya
78a7ea370d wifi: mac80211: handle TDLS negotiation with MLO
Userspace can now select the link to use for TDLS management
frames (indicating e.g. which BSSID should be used), use the
link_id received from cfg80211 to build the frames.

Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.ce1fc230b505.Ie773c5679805001f5a52680d68d9ce0232c57648@changeid
[Benjamin fixed some locking]
Co-developed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
[fix sta mutex locking too]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:27 +02:00
Mukesh Sisodiya
c6112046b1 wifi: cfg80211: make TDLS management link-aware
For multi-link operation(MLO) TDLS management
frames need to be transmitted on a specific link.
The TDLS setup request will add BSSID along with
peer address and userspace will pass the link-id
based on BSSID value to the driver(or mac80211).

Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230616094948.cb3d87c22812.Ia3d15ac4a9a182145bf2d418bcb3ddf4539cd0a7@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:27 +02:00
Gustavo A. R. Silva
71e7552c90 wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point()
-Wstringop-overflow is legitimately warning us about extra_size
pontentially being zero at some point, hence potenially ending
up _allocating_ zero bytes of memory for extra pointer and then
trying to access such object in a call to copy_from_user().

Fix this by adding a sanity check to ensure we never end up
trying to allocate zero bytes of data for extra pointer, before
continue executing the rest of the code in the function.

Address the following -Wstringop-overflow warning seen when built
m68k architecture with allyesconfig configuration:
                 from net/wireless/wext-core.c:11:
In function '_copy_from_user',
    inlined from 'copy_from_user' at include/linux/uaccess.h:183:7,
    inlined from 'ioctl_standard_iw_point' at net/wireless/wext-core.c:825:7:
arch/m68k/include/asm/string.h:48:25: warning: '__builtin_memset' writing 1 or more bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
   48 | #define memset(d, c, n) __builtin_memset(d, c, n)
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/uaccess.h:153:17: note: in expansion of macro 'memset'
  153 |                 memset(to + (n - res), 0, res);
      |                 ^~~~~~
In function 'kmalloc',
    inlined from 'kzalloc' at include/linux/slab.h:694:9,
    inlined from 'ioctl_standard_iw_point' at net/wireless/wext-core.c:819:10:
include/linux/slab.h:577:16: note: at offset 1 into destination object of size 0 allocated by '__kmalloc'
  577 |         return __kmalloc(size, flags);
      |                ^~~~~~~~~~~~~~~~~~~~~~

This help with the ongoing efforts to globally enable
-Wstringop-overflow.

Link: https://github.com/KSPP/linux/issues/315
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZItSlzvIpjdjNfd8@work
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:27 +02:00
Nicolas Cavallari
6e21e7b8cd wifi: mac80211: Remove "Missing iftype sband data/EHT cap" spam
In mesh mode, ieee80211_chandef_he_6ghz_oper() is called by
mesh_matches_local() for every received mesh beacon.

On a 6 GHz mesh of a HE-only phy, this spams that the hardware does not
have EHT capabilities, even if the received mesh beacon does not have an
EHT element.

Unlike HE, not supporting EHT in the 6 GHz band is not an error so do
not print anything in this case.

Fixes: 5dca295dd7 ("mac80211: Add initial support for EHT and 320 MHz channels")

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230614132648.28995-1-nicolas.cavallari@green-communications.fr
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:26 +02:00
Ilan Peer
a8df1f580f wifi: mac80211: Add debugfs entry to report dormant links
Add debugfs entry to report dormant (valid but disabled) links.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230611121219.7fa5f022adfb.Iff6fa3e1a3b00ae726612f9d5a31f7fe2fcbfc68@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:05:19 +02:00
Ilan Peer
6d543b34db wifi: mac80211: Support disabled links during association
When the association is complete, do not configure disabled
links, and track them as part of the interface data.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.c194fabeb81a.Iaefdef5ba0492afe9a5ede14c68060a4af36e444@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:04:49 +02:00
Johannes Berg
d5a17cfb98 Merge wireless into wireless-next
There are some locking changes that will later otherwise
cause conflicts, so merge wireless into wireless-next to
avoid those.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19 12:04:16 +02:00
Jakub Kicinski
2dc6af8be0 gro: move the tc_ext comparison to a helper
The double ifdefs (one for the variable declaration and
one around the code) are quite aesthetically displeasing.
Factor this code out into a helper for easier wrapping.

This will become even more ugly when another skb ext
comparison is added in the future.

The resulting machine code looks the same, the compiler
seems to try to use %rax more and some blocks more around
but I haven't spotted minor differences.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18 18:08:35 +01:00
Chuck Lever
88770b8de3 svcrdma: Fix stale comment
Commit 7d81ee8722 ("svcrdma: Single-stage RDMA Read") changed the
behavior of svc_rdma_recvfrom() but neglected to update the
documenting comment.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-18 12:09:08 -04:00
Eric Dumazet
3515440df4 ipv6: also use netdev_hold() in ip6_route_check_nh()
In blamed commit, we missed the fact that ip6_validate_gw()
could change dev under us from ip6_route_check_nh()

In this fix, I use GFP_ATOMIC in order to not pass too many additional
arguments to ip6_validate_gw() and ip6_route_check_nh() only
for a rarely used debug feature.

syzbot reported:

refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 0 PID: 5006 at lib/refcount.c:31 refcount_warn_saturate+0x1d7/0x1f0 lib/refcount.c:31
Modules linked in:
CPU: 0 PID: 5006 Comm: syz-executor403 Not tainted 6.4.0-rc5-syzkaller-01229-g97c5209b3d37 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
RIP: 0010:refcount_warn_saturate+0x1d7/0x1f0 lib/refcount.c:31
Code: 05 fb 8e 51 0a 01 e8 98 95 38 fd 0f 0b e9 d3 fe ff ff e8 ac d9 70 fd 48 c7 c7 00 d3 a6 8a c6 05 d8 8e 51 0a 01 e8 79 95 38 fd <0f> 0b e9 b4 fe ff ff 48 89 ef e8 1a d7 c3 fd e9 5c fe ff ff 0f 1f
RSP: 0018:ffffc900039df6b8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888026d71dc0 RSI: ffffffff814c03b7 RDI: 0000000000000001
RBP: ffff888146a505fc R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 1ffff9200073bedc
R13: 00000000ffffffef R14: ffff888146a505fc R15: ffff8880284eb5a8
FS: 0000555556c88300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004585c0 CR3: 000000002b1b1000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_dec include/linux/refcount.h:344 [inline]
refcount_dec include/linux/refcount.h:359 [inline]
ref_tracker_free+0x539/0x820 lib/ref_tracker.c:236
netdev_tracker_free include/linux/netdevice.h:4097 [inline]
netdev_put include/linux/netdevice.h:4114 [inline]
netdev_put include/linux/netdevice.h:4110 [inline]
fib6_nh_init+0xb96/0x1bd0 net/ipv6/route.c:3624
ip6_route_info_create+0x10f3/0x1980 net/ipv6/route.c:3791
ip6_route_add+0x28/0x150 net/ipv6/route.c:3835
ipv6_route_ioctl+0x3fc/0x570 net/ipv6/route.c:4459
inet6_ioctl+0x246/0x290 net/ipv6/af_inet6.c:569
sock_do_ioctl+0xcc/0x230 net/socket.c:1189
sock_ioctl+0x1f8/0x680 net/socket.c:1306
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]

Fixes: 70f7457ad6 ("net: create device lookup API with reference tracking")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18 14:27:45 +01:00
Arjun Roy
7a7f094635 tcp: Use per-vma locking for receive zerocopy
Per-VMA locking allows us to lock a struct vm_area_struct without
taking the process-wide mmap lock in read mode.

Consider a process workload where the mmap lock is taken constantly in
write mode. In this scenario, all zerocopy receives are periodically
blocked during that period of time - though in principle, the memory
ranges being used by TCP are not touched by the operations that need
the mmap write lock. This results in performance degradation.

Now consider another workload where the mmap lock is never taken in
write mode, but there are many TCP connections using receive zerocopy
that are concurrently receiving. These connections all take the mmap
lock in read mode, but this does induce a lot of contention and atomic
ops for this process-wide lock. This results in additional CPU
overhead caused by contending on the cache line for this lock.

However, with per-vma locking, both of these problems can be avoided.

As a test, I ran an RPC-style request/response workload with 4KB
payloads and receive zerocopy enabled, with 100 simultaneous TCP
connections. I measured perf cycles within the
find_tcp_vma/mmap_read_lock/mmap_read_unlock codepath, with and
without per-vma locking enabled.

When using process-wide mmap semaphore read locking, about 1% of
measured perf cycles were within this path. With per-VMA locking, this
value dropped to about 0.45%.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18 11:16:00 +01:00
Chuck Lever
00a87e5d1d SUNRPC: Address RCU warning in net/sunrpc/svc.c
$ make C=1 W=1 net/sunrpc/svc.o
make[1]: Entering directory 'linux/obj/manet.1015granger.net'
  GEN     Makefile
  CALL    linux/server-development/scripts/checksyscalls.sh
  DESCEND objtool
  INSTALL libsubcmd_headers
  DESCEND bpf/resolve_btfids
  INSTALL libsubcmd_headers
  CC [M]  net/sunrpc/svc.o
  CHECK   linux/server-development/net/sunrpc/svc.c
linux/server-development/net/sunrpc/svc.c:1225:9: warning: incorrect type in argument 1 (different address spaces)
linux/server-development/net/sunrpc/svc.c:1225:9:    expected struct spinlock [usertype] *lock
linux/server-development/net/sunrpc/svc.c:1225:9:    got struct spinlock [noderef] __rcu *
linux/server-development/net/sunrpc/svc.c:1227:40: warning: incorrect type in argument 1 (different address spaces)
linux/server-development/net/sunrpc/svc.c:1227:40:    expected struct spinlock [usertype] *lock
linux/server-development/net/sunrpc/svc.c:1227:40:    got struct spinlock [noderef] __rcu *
make[1]: Leaving directory 'linux/obj/manet.1015granger.net'

Warning introduced by commit 913292c97d ("sched.h: Annotate
sighand_struct with __rcu").

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:07 -04:00
Azeem Shaikh
a9156d7e7d SUNRPC: Use sysfs_emit in place of strlcpy/sprintf
Part of an effort to remove strlcpy() tree-wide [1].

Direct replacement is safe here since the getter in kernel_params_ops
handles -errno return [2].

[1] https://github.com/KSPP/linux/issues/89
[2] https://elixir.bootlin.com/linux/v6.4-rc6/source/include/linux/moduleparam.h#L52

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:07 -04:00
Chuck Lever
6c53da5d66 SUNRPC: Remove transport class dprintk call sites
Remove a couple of dprintk call sites that are of little value.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:07 -04:00
Chuck Lever
02cea33f56 SUNRPC: Fix comments for transport class registration
The preceding block comment before svc_register_xprt_class() is
not related to that function.

While we're here, add proper documenting comments for these two
publicly-visible functions.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:07 -04:00
Chuck Lever
b55c63332e svcrdma: Remove an unused argument from __svc_rdma_put_rw_ctxt()
Clean up.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:07 -04:00
Chuck Lever
a23c76e92d svcrdma: trace cc_release calls
This event brackets the svcrdma_post_* trace points. If this trace
event is enabled but does not appear as expected, that indicates a
chunk_ctxt leak.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:06 -04:00
Chuck Lever
91f8ce2846 svcrdma: Convert "might sleep" comment into a code annotation
Try to catch incorrect calling contexts mechanically rather than by
code review.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:06 -04:00
Chuck Lever
f8335a212a SUNRPC: Move initialization of rq_stime
Micro-optimization: Call ktime_get() only when ->xpo_recvfrom() has
given us a full RPC message to process. rq_stime isn't used
otherwise, so this avoids pointless work.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:06 -04:00
Chuck Lever
5581cf8efc SUNRPC: Optimize page release in svc_rdma_sendto()
Now that we have bulk page allocation and release APIs, it's more
efficient to use those than it is for nfsd threads to wait for send
completions. Previous patches have eliminated the calls to
wait_for_completion() and complete(), in order to avoid scheduler
overhead.

Now release pages-under-I/O in the send completion handler using
the efficient bulk release API.

I've measured a 7% reduction in cumulative CPU utilization in
svc_rdma_sendto(), svc_rdma_wc_send(), and svc_xprt_release(). In
particular, using release_pages() instead of complete() cuts the
time per svc_rdma_wc_send() call by two-thirds. This helps improve
scalability because svc_rdma_wc_send() is single-threaded per
connection.

Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:06 -04:00
Chuck Lever
baf6d18b11 svcrdma: Prevent page release when nothing was received
I noticed that svc_rqst_release_pages() was still unnecessarily
releasing a page when svc_rdma_recvfrom() returns zero.

Fixes: a53d5cb064 ("svcrdma: Avoid releasing a page in svc_xprt_release()")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17 13:18:04 -04:00
mfreemon@cloudflare.com
b650d953cd tcp: enforce receive buffer memory limits by allowing the tcp window to shrink
Under certain circumstances, the tcp receive buffer memory limit
set by autotuning (sk_rcvbuf) is increased due to incoming data
packets as a result of the window not closing when it should be.
This can result in the receive buffer growing all the way up to
tcp_rmem[2], even for tcp sessions with a low BDP.

To reproduce:  Connect a TCP session with the receiver doing
nothing and the sender sending small packets (an infinite loop
of socket send() with 4 bytes of payload with a sleep of 1 ms
in between each send()).  This will cause the tcp receive buffer
to grow all the way up to tcp_rmem[2].

As a result, a host can have individual tcp sessions with receive
buffers of size tcp_rmem[2], and the host itself can reach tcp_mem
limits, causing the host to go into tcp memory pressure mode.

The fundamental issue is the relationship between the granularity
of the window scaling factor and the number of byte ACKed back
to the sender.  This problem has previously been identified in
RFC 7323, appendix F [1].

The Linux kernel currently adheres to never shrinking the window.

In addition to the overallocation of memory mentioned above, the
current behavior is functionally incorrect, because once tcp_rmem[2]
is reached when no remediations remain (i.e. tcp collapse fails to
free up any more memory and there are no packets to prune from the
out-of-order queue), the receiver will drop in-window packets
resulting in retransmissions and an eventual timeout of the tcp
session.  A receive buffer full condition should instead result
in a zero window and an indefinite wait.

In practice, this problem is largely hidden for most flows.  It
is not applicable to mice flows.  Elephant flows can send data
fast enough to "overrun" the sk_rcvbuf limit (in a single ACK),
triggering a zero window.

But this problem does show up for other types of flows.  Examples
are websockets and other type of flows that send small amounts of
data spaced apart slightly in time.  In these cases, we directly
encounter the problem described in [1].

RFC 7323, section 2.4 [2], says there are instances when a retracted
window can be offered, and that TCP implementations MUST ensure
that they handle a shrinking window, as specified in RFC 1122,
section 4.2.2.16 [3].  All prior RFCs on the topic of tcp window
management have made clear that sender must accept a shrunk window
from the receiver, including RFC 793 [4] and RFC 1323 [5].

This patch implements the functionality to shrink the tcp window
when necessary to keep the right edge within the memory limit by
autotuning (sk_rcvbuf).  This new functionality is enabled with
the new sysctl: net.ipv4.tcp_shrink_window

Additional information can be found at:
https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/

[1] https://www.rfc-editor.org/rfc/rfc7323#appendix-F
[2] https://www.rfc-editor.org/rfc/rfc7323#section-2.4
[3] https://www.rfc-editor.org/rfc/rfc1122#page-91
[4] https://www.rfc-editor.org/rfc/rfc793
[5] https://www.rfc-editor.org/rfc/rfc1323

Signed-off-by: Mike Freemon <mfreemon@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-17 09:53:53 +01:00
Petr Oros
a52305a81d devlink: report devlink_port_type_warn source device
devlink_port_type_warn is scheduled for port devlink and warning
when the port type is not set. But from this warning it is not easy
found out which device (driver) has no devlink port set.

[ 3709.975552] Type was not set for devlink port.
[ 3709.975579] WARNING: CPU: 1 PID: 13092 at net/devlink/leftover.c:6775 devlink_port_type_warn+0x11/0x20
[ 3709.993967] Modules linked in: openvswitch nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink bluetooth rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs vhost_net vhost vhost_iotlb tap tun bridge stp llc qrtr intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal mlx5_ib intel_powerclamp coretemp dell_wmi ledtrig_audio sparse_keymap ipmi_ssif kvm_intel ib_uverbs rfkill ib_core video kvm iTCO_wdt acpi_ipmi intel_vsec irqbypass ipmi_si iTCO_vendor_support dcdbas ipmi_devintf mei_me ipmi_msghandler rapl mei intel_cstate isst_if_mmio isst_if_mbox_pci dell_smbios intel_uncore isst_if_common i2c_i801 dell_wmi_descriptor wmi_bmof i2c_smbus intel_pch_thermal pcspkr acpi_power_meter xfs libcrc32c sd_mod sg nvme_tcp mgag200 i2c_algo_bit nvme_fabrics drm_shmem_helper drm_kms_helper nvme syscopyarea ahci sysfillrect sysimgblt nvme_core fb_sys_fops crct10dif_pclmul libahci mlx5_core sfc crc32_pclmul nvme_common drm
[ 3709.994030]  crc32c_intel mtd t10_pi mlxfw libata tg3 mdio megaraid_sas psample ghash_clmulni_intel pci_hyperv_intf wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse
[ 3710.108431] CPU: 1 PID: 13092 Comm: kworker/1:1 Kdump: loaded Not tainted 5.14.0-319.el9.x86_64 #1
[ 3710.108435] Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.8.2 09/14/2022
[ 3710.108437] Workqueue: events devlink_port_type_warn
[ 3710.108440] RIP: 0010:devlink_port_type_warn+0x11/0x20
[ 3710.108443] Code: 84 76 fe ff ff 48 c7 03 20 0e 1a ad 31 c0 e9 96 fd ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 48 c7 c7 18 24 4e ad e8 ef 71 62 ff <0f> 0b c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 87
[ 3710.108445] RSP: 0018:ff3b6d2e8b3c7e90 EFLAGS: 00010282
[ 3710.108447] RAX: 0000000000000000 RBX: ff366d6580127080 RCX: 0000000000000027
[ 3710.108448] RDX: 0000000000000027 RSI: 00000000ffff86de RDI: ff366d753f41f8c8
[ 3710.108449] RBP: ff366d658ff5a0c0 R08: ff366d753f41f8c0 R09: ff3b6d2e8b3c7e18
[ 3710.108450] R10: 0000000000000001 R11: 0000000000000023 R12: ff366d753f430600
[ 3710.108451] R13: ff366d753f436900 R14: 0000000000000000 R15: ff366d753f436905
[ 3710.108452] FS:  0000000000000000(0000) GS:ff366d753f400000(0000) knlGS:0000000000000000
[ 3710.108453] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3710.108454] CR2: 00007f1c57bc74e0 CR3: 000000111d26a001 CR4: 0000000000773ee0
[ 3710.108456] PKRU: 55555554
[ 3710.108457] Call Trace:
[ 3710.108458]  <TASK>
[ 3710.108459]  process_one_work+0x1e2/0x3b0
[ 3710.108466]  ? rescuer_thread+0x390/0x390
[ 3710.108468]  worker_thread+0x50/0x3a0
[ 3710.108471]  ? rescuer_thread+0x390/0x390
[ 3710.108473]  kthread+0xdd/0x100
[ 3710.108477]  ? kthread_complete_and_exit+0x20/0x20
[ 3710.108479]  ret_from_fork+0x1f/0x30
[ 3710.108485]  </TASK>
[ 3710.108486] ---[ end trace 1b4b23cd0c65d6a0 ]---

After patch:
[  402.473064] ice 0000:41:00.0: Type was not set for devlink port.
[  402.473064] ice 0000:41:00.1: Type was not set for devlink port.

Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230615095447.8259-1-poros@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17 00:31:14 -07:00
Lin Ma
f60ce8a48b net: mctp: remove redundant RTN_UNICAST check
Current mctp_newroute() contains two exactly same check against
rtm->rtm_type

static int mctp_newroute(...)
{
...
    if (rtm->rtm_type != RTN_UNICAST) { // (1)
        NL_SET_ERR_MSG(extack, "rtm_type must be RTN_UNICAST");
        return -EINVAL;
    }
...
    if (rtm->rtm_type != RTN_UNICAST) // (2)
        return -EINVAL;
...
}

This commits removes the (2) check as it is redundant.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230615152240.1749428-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17 00:25:24 -07:00
David Howells
9f8d0dc0ec kcm: Fix unnecessary psock unreservation.
kcm_write_msgs() calls unreserve_psock() to release its hold on the
underlying TCP socket if it has run out of things to transmit, but if we
have nothing in the write queue on entry (e.g. because someone did a
zero-length sendmsg), we don't actually go into the transmission loop and
as a consequence don't call reserve_psock().

Fix this by skipping the call to unreserve_psock() if we didn't reserve a
psock.

Fixes: c31a25e1db ("kcm: Send multiple frags in one sendmsg()")
Reported-by: syzbot+dd1339599f1840e4cc65@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000a61ffe05fe0c3d08@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: syzbot+dd1339599f1840e4cc65@syzkaller.appspotmail.com
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20787.1686828722@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17 00:08:27 -07:00
Azeem Shaikh
cd91250306 ieee802154: Replace strlcpy with strscpy
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

Direct replacement is safe here since the return values
from the helper macros are ignored by the callers.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230613003326.3538391-1-azeemshaikh38@gmail.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2023-06-16 22:14:24 +02:00
David Howells
5a6f687360 ip, ip6: Fix splice to raw and ping sockets
Splicing to SOCK_RAW sockets may set MSG_SPLICE_PAGES, but in such a case,
__ip_append_data() will call skb_splice_from_iter() to access the 'from'
data, assuming it to point to a msghdr struct with an iter, instead of
using the provided getfrag function to access it.

In the case of raw_sendmsg(), however, this is not the case and 'from' will
point to a raw_frag_vec struct and raw_getfrag() will be the frag-getting
function.  A similar issue may occur with rawv6_sendmsg().

Fix this by ignoring MSG_SPLICE_PAGES if getfrag != ip_generic_getfrag as
ip_generic_getfrag() expects "from" to be a msghdr*, but the other getfrags
don't.  Note that this will prevent MSG_SPLICE_PAGES from being effective
for udplite.

This likely affects ping sockets too.  udplite looks like it should be okay
as it expects "from" to be a msghdr.

Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: syzbot+d8486855ef44506fd675@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000ae4cbf05fdeb8349@google.com/
Fixes: 2dc334f1a6 ("splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()")
Tested-by: syzbot+d8486855ef44506fd675@syzkaller.appspotmail.com
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/1410156.1686729856@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-16 11:45:16 -07:00
Sebastian Andrzej Siewior
f015b900bc xfrm: Linearize the skb after offloading if needed.
With offloading enabled, esp_xmit() gets invoked very late, from within
validate_xmit_xfrm() which is after validate_xmit_skb() validates and
linearizes the skb if the underlying device does not support fragments.

esp_output_tail() may add a fragment to the skb while adding the auth
tag/ IV. Devices without the proper support will then send skb->data
points to with the correct length so the packet will have garbage at the
end. A pcap sniffer will claim that the proper data has been sent since
it parses the skb properly.

It is not affected with INET_ESP_OFFLOAD disabled.

Linearize the skb after offloading if the sending hardware requires it.
It was tested on v4, v6 has been adopted.

Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2023-06-16 10:29:50 +02:00
Piotr Gardocki
ad72c4a06a net: add check for current MAC address in dev_set_mac_address
In some cases it is possible for kernel to come with request
to change primary MAC address to the address that is already
set on the given interface.

Add proper check to return fast from the function in these cases.

An example of such case is adding an interface to bonding
channel in balance-alb mode:
modprobe bonding mode=balance-alb miimon=100 max_bonds=1
ip link set bond0 up
ifenslave bond0 <eth>

Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 22:54:54 -07:00
Breno Leitao
e1d001fa5b net: ioctl: Use kernel memory on protocol ioctl callbacks
Most of the ioctls to net protocols operates directly on userspace
argument (arg). Usually doing get_user()/put_user() directly in the
ioctl callback.  This is not flexible, because it is hard to reuse these
functions without passing userspace buffers.

Change the "struct proto" ioctls to avoid touching userspace memory and
operate on kernel buffers, i.e., all protocol's ioctl callbacks is
adapted to operate on a kernel memory other than on userspace (so, no
more {put,get}_user() and friends being called in the ioctl callback).

This changes the "struct proto" ioctl format in the following way:

    int                     (*ioctl)(struct sock *sk, int cmd,
-                                        unsigned long arg);
+                                        int *karg);

(Important to say that this patch does not touch the "struct proto_ops"
protocols)

So, the "karg" argument, which is passed to the ioctl callback, is a
pointer allocated to kernel space memory (inside a function wrapper).
This buffer (karg) may contain input argument (copied from userspace in
a prep function) and it might return a value/buffer, which is copied
back to userspace if necessary. There is not one-size-fits-all format
(that is I am using 'may' above), but basically, there are three type of
ioctls:

1) Do not read from userspace, returns a result to userspace
2) Read an input parameter from userspace, and does not return anything
  to userspace
3) Read an input from userspace, and return a buffer to userspace.

The default case (1) (where no input parameter is given, and an "int" is
returned to userspace) encompasses more than 90% of the cases, but there
are two other exceptions. Here is a list of exceptions:

* Protocol RAW:
   * cmd = SIOCGETVIFCNT:
     * input and output = struct sioc_vif_req
   * cmd = SIOCGETSGCNT
     * input and output = struct sioc_sg_req
   * Explanation: for the SIOCGETVIFCNT case, userspace passes the input
     argument, which is struct sioc_vif_req. Then the callback populates
     the struct, which is copied back to userspace.

* Protocol RAW6:
   * cmd = SIOCGETMIFCNT_IN6
     * input and output = struct sioc_mif_req6
   * cmd = SIOCGETSGCNT_IN6
     * input and output = struct sioc_sg_req6

* Protocol PHONET:
  * cmd == SIOCPNADDRESOURCE | SIOCPNDELRESOURCE
     * input int (4 bytes)
  * Nothing is copied back to userspace.

For the exception cases, functions sock_sk_ioctl_inout() will
copy the userspace input, and copy it back to kernel space.

The wrapper that prepare the buffer and put the buffer back to user is
sk_ioctl(), so, instead of calling sk->sk_prot->ioctl(), the callee now
calls sk_ioctl(), which will handle all cases.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230609152800.830401-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 22:33:26 -07:00
Jakub Kicinski
173780ff18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/linux/mlx5/driver.h
  617f5db1a6 ("RDMA/mlx5: Fix affinity assignment")
  dc13180824 ("net/mlx5: Enable devlink port for embedded cpu VF vports")
https://lore.kernel.org/all/20230613125939.595e50b8@canb.auug.org.au/

tools/testing/selftests/net/mptcp/mptcp_join.sh
  47867f0a7e ("selftests: mptcp: join: skip check if MIB counter not supported")
  425ba80312 ("selftests: mptcp: join: support RM_ADDR for used endpoints or not")
  45b1a1227a ("mptcp: introduces more address related mibs")
  0639fa230a ("selftests: mptcp: add explicit check for new mibs")
https://lore.kernel.org/netdev/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 22:19:41 -07:00
Kuniyuki Iwashima
b144fcaf46 dccp: Print deprecation notice.
DCCP was marked as Orphan in the MAINTAINERS entry 2 years ago in commit
054c4610bd ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS").  It says
we haven't heard from the maintainer for five years, so DCCP is not well
maintained for 7 years now.

Recently DCCP only receives updates for bugs, and major distros disable it
by default.

Removing DCCP would allow for better organisation of TCP fields to reduce
the number of cache lines hit in the fast path.

Let's add a deprecation notice when DCCP socket is created and schedule its
removal to 2025.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 15:08:59 -07:00
Kuniyuki Iwashima
be28c14ac8 udplite: Print deprecation notice.
Recently syzkaller reported a 7-year-old null-ptr-deref [0] that occurs
when a UDP-Lite socket tries to allocate a buffer under memory pressure.

Someone should have stumbled on the bug much earlier if UDP-Lite had been
used in a real app.  Also, we do not always need a large UDP-Lite workload
to hit the bug since UDP and UDP-Lite share the same memory accounting
limit.

Removing UDP-Lite would simplify UDP code removing a bunch of conditionals
in fast path.

Let's add a deprecation notice when UDP-Lite socket is created and schedule
its removal to 2025.

Link: https://lore.kernel.org/netdev/20230523163305.66466-1-kuniyu@amazon.com/ [0]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 15:08:58 -07:00
Lin Ma
44194cb1b6 net: tipc: resize nlattr array to correct size
According to nla_parse_nested_deprecated(), the tb[] is supposed to the
destination array with maxtype+1 elements. In current
tipc_nl_media_get() and __tipc_nl_media_set(), a larger array is used
which is unnecessary. This patch resize them to a proper size.

Fixes: 1e55417d8f ("tipc: add media set to new netlink api")
Fixes: 46f15c6794 ("tipc: add media get/dump to new netlink api")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20230614120604.1196377-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 14:59:17 -07:00
Jakub Kicinski
ed3c9a2fca net: tls: make the offload check helper take skb not socket
All callers of tls_is_sk_tx_device_offloaded() currently do
an equivalent of:

 if (skb->sk && tls_is_skb_tx_device_offloaded(skb->sk))

Have the helper accept skb and do the skb->sk check locally.
Two drivers have local static inlines with similar wrappers
already.

While at it change the ifdef condition to TLS_DEVICE.
Only TLS_DEVICE selects SOCK_VALIDATE_XMIT, so the two are
equivalent. This makes removing the duplicated IS_ENABLED()
check in funeth more obviously correct.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15 09:01:05 +01:00
Jakub Kicinski
48eed027d3 netpoll: allocate netdev tracker right away
Commit 5fa5ae6058 ("netpoll: add net device refcount tracker to struct netpoll")
was part of one of the initial netdev tracker introduction patches.
It added an explicit netdev_tracker_alloc() for netpoll, presumably
because the flow of the function is somewhat odd.
After most of the core networking stack was converted to use
the tracking hold() variants, netpoll's call to old dev_hold()
stands out a bit.

np is allocated by the caller and ready to use, we can use
netdev_hold() here, even tho np->ndev will only be set to
ndev inside __netpoll_setup().

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15 08:21:11 +01:00
Jakub Kicinski
70f7457ad6 net: create device lookup API with reference tracking
New users of dev_get_by_index() and dev_get_by_name() keep
getting added and it would be nice to steer them towards
the APIs with reference tracking.

Add variants of those calls which allocate the reference
tracker and use them in a couple of places.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15 08:21:11 +01:00
Vlad Buslov
c9a82bec02 net/sched: cls_api: Fix lockup on flushing explicitly created chain
Mingshuai Ren reports:

When a new chain is added by using tc, one soft lockup alarm will be
 generated after delete the prio 0 filter of the chain. To reproduce
 the problem, perform the following steps:
(1) tc qdisc add dev eth0 root handle 1: htb default 1
(2) tc chain add dev eth0
(3) tc filter del dev eth0 chain 0 parent 1: prio 0
(4) tc filter add dev eth0 chain 0 parent 1:

Fix the issue by accounting for additional reference to chains that are
explicitly created by RTM_NEWCHAIN message as opposed to implicitly by
RTM_NEWTFILTER message.

Fixes: 726d061286 ("net: sched: prevent insertion of new classifiers during chain flush")
Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Link: https://lore.kernel.org/r/20230612093426.2867183-1-vladbu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14 23:03:16 -07:00
Xin Long
89da780aa4 rtnetlink: move validate_linkmsg out of do_setlink
This patch moves validate_linkmsg() out of do_setlink() to its callers
and deletes the early validate_linkmsg() call in __rtnl_newlink(), so
that it will not call validate_linkmsg() twice in either of the paths:

  - __rtnl_newlink() -> do_setlink()
  - __rtnl_newlink() -> rtnl_newlink_create() -> rtnl_create_link()

Additionally, as validate_linkmsg() is now only called with a real
dev, we can remove the NULL check for dev in validate_linkmsg().

Note that we moved validate_linkmsg() check to the places where it has
not done any changes to the dev, as Jakub suggested.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/cf2ef061e08251faf9e8be25ff0d61150c030475.1686585334.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14 22:34:13 -07:00
Lin Ma
361b6889ae net/handshake: remove fput() that causes use-after-free
A reference underflow is found in TLS handshake subsystem that causes a
direct use-after-free. Part of the crash log is like below:

[    2.022114] ------------[ cut here ]------------
[    2.022193] refcount_t: underflow; use-after-free.
[    2.022288] WARNING: CPU: 0 PID: 60 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
[    2.022432] Modules linked in:
[    2.022848] RIP: 0010:refcount_warn_saturate+0xbe/0x110
[    2.023231] RSP: 0018:ffffc900001bfe18 EFLAGS: 00000286
[    2.023325] RAX: 0000000000000000 RBX: 0000000000000007 RCX: 00000000ffffdfff
[    2.023438] RDX: 0000000000000000 RSI: 00000000ffffffea RDI: 0000000000000001
[    2.023555] RBP: ffff888004c20098 R08: ffffffff82b392c8 R09: 00000000ffffdfff
[    2.023693] R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888004c200d8
[    2.023813] R13: 0000000000000000 R14: ffff888004c20000 R15: ffffc90000013ca8
[    2.023930] FS:  0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[    2.024062] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.024161] CR2: ffff888003601000 CR3: 0000000002a2e000 CR4: 00000000000006f0
[    2.024275] Call Trace:
[    2.024322]  <TASK>
[    2.024367]  ? __warn+0x7f/0x130
[    2.024430]  ? refcount_warn_saturate+0xbe/0x110
[    2.024513]  ? report_bug+0x199/0x1b0
[    2.024585]  ? handle_bug+0x3c/0x70
[    2.024676]  ? exc_invalid_op+0x18/0x70
[    2.024750]  ? asm_exc_invalid_op+0x1a/0x20
[    2.024830]  ? refcount_warn_saturate+0xbe/0x110
[    2.024916]  ? refcount_warn_saturate+0xbe/0x110
[    2.024998]  __tcp_close+0x2f4/0x3d0
[    2.025065]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[    2.025168]  tcp_close+0x1f/0x70
[    2.025231]  inet_release+0x33/0x60
[    2.025297]  sock_release+0x1f/0x80
[    2.025361]  handshake_req_cancel_test2+0x100/0x2d0
[    2.025457]  kunit_try_run_case+0x4c/0xa0
[    2.025532]  kunit_generic_run_threadfn_adapter+0x15/0x20
[    2.025644]  kthread+0xe1/0x110
[    2.025708]  ? __pfx_kthread+0x10/0x10
[    2.025780]  ret_from_fork+0x2c/0x50

One can enable CONFIG_NET_HANDSHAKE_KUNIT_TEST config to reproduce above
crash.

The root cause of this bug is that the commit 1ce77c998f
("net/handshake: Unpin sock->file if a handshake is cancelled") adds one
additional fput() function. That patch claims that the fput() is used to
enable sock->file to be freed even when user space never calls DONE.

However, it seems that the intended DONE routine will never give an
additional fput() of ths sock->file. The existing two of them are just
used to balance the reference added in sockfd_lookup().

This patch revert the mentioned commit to avoid the use-after-free. The
patched kernel could successfully pass the KUNIT test and boot to shell.

[    0.733613]     # Subtest: Handshake API tests
[    0.734029]     1..11
[    0.734255]         KTAP version 1
[    0.734542]         # Subtest: req_alloc API fuzzing
[    0.736104]         ok 1 handshake_req_alloc NULL proto
[    0.736114]         ok 2 handshake_req_alloc CLASS_NONE
[    0.736559]         ok 3 handshake_req_alloc CLASS_MAX
[    0.737020]         ok 4 handshake_req_alloc no callbacks
[    0.737488]         ok 5 handshake_req_alloc no done callback
[    0.737988]         ok 6 handshake_req_alloc excessive privsize
[    0.738529]         ok 7 handshake_req_alloc all good
[    0.739036]     # req_alloc API fuzzing: pass:7 fail:0 skip:0 total:7
[    0.739444]     ok 1 req_alloc API fuzzing
[    0.740065]     ok 2 req_submit NULL req arg
[    0.740436]     ok 3 req_submit NULL sock arg
[    0.740834]     ok 4 req_submit NULL sock->file
[    0.741236]     ok 5 req_lookup works
[    0.741621]     ok 6 req_submit max pending
[    0.741974]     ok 7 req_submit multiple
[    0.742382]     ok 8 req_cancel before accept
[    0.742764]     ok 9 req_cancel after accept
[    0.743151]     ok 10 req_cancel after done
[    0.743510]     ok 11 req_destroy works
[    0.743882] # Handshake API tests: pass:11 fail:0 skip:0 total:11
[    0.744205] # Totals: pass:17 fail:0 skip:0 total:17

Acked-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: 1ce77c998f ("net/handshake: Unpin sock->file if a handshake is cancelled")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://lore.kernel.org/r/20230613083204.633896-1-linma@zju.edu.cn
Link: https://lore.kernel.org/r/20230614015249.987448-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14 22:26:37 -07:00
Jakub Kicinski
37cec6ed8d A couple of straggler fixes, mostly in the stack:
* fix fragmentation for multi-link related elements
  * fix callback copy/paste error
  * fix multi-link locking
  * remove double-locking of wiphy mutex
  * transmit only on active links, not all
  * activate links in the correct order
  * don't remove links that weren't added
  * disable soft-IRQs for LQ lock in iwlwifi
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmSJcf0ACgkQ10qiO8sP
 aAC77Q//TOZSUoAFnMqsA23SXwN8CeNQC5yBmYwxVMcsqBTO6+7k0NphpFGJUcLA
 OG8Leo6gwJdhLFYt7bl7RfHGSQrAZcNeYB9pv9J96vGQGnCComAAANEWSo8OBT2a
 03yYrfcKfkXAUNm2dxKvwmi3D3/VPyOgj+O6LNEs1DogHw0V6GdthW3J6s/vl6RU
 MPCQqlIFY9j20mXEKPFMaIZ8fyQKh38xa5YttGmeFrSUKYSljWUqqUooSMIkeyS4
 D5mYdzbsqCiihnN1FenEjkBUe2eS6BzxL+KVLaY2vth4tQytGeasvCaGcLcB83nc
 BxGR0rbEkrwp7nBqE4ZpMmhzHG3hpWus2+hJtMWsQku7qzE/vMh4qv2s2+QUVk/3
 jCXGv233bIgvQ2d1SUqp7CenGjJ0eBfKKRVzM+Hyiz+V6kWsugMxNaBmi59JVB7w
 5JilT85LfV2cRJgHtkDY7kMpDWnVYfwenvSywoXaRdVuKiowMUhZ9P19wLE0gn7K
 qtKIaLnkrLE2QHdqlxcuyMPBLhfga2+qXuo94SIYMFNURW7jjJcSVlN8ZVqqBvvp
 ib51XCyx/95zAr1Vyly/Pc7puuCMiiQk0ZhQBgqFPrnjs37JIzHDNo4Cq6H9+FlY
 0EncP/akjy8t7PsBdTmQv1UG3wq4EG5Wmh+wLDpa5QKKs2IqcCQ=
 =IPqO
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2023-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
A couple of straggler fixes, mostly in the stack:
 - fix fragmentation for multi-link related elements
 - fix callback copy/paste error
 - fix multi-link locking
 - remove double-locking of wiphy mutex
 - transmit only on active links, not all
 - activate links in the correct order
 - don't remove links that weren't added
 - disable soft-IRQs for LQ lock in iwlwifi

* tag 'wireless-2023-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression
  wifi: mac80211: fragment per STA profile correctly
  wifi: mac80211: Use active_links instead of valid_links in Tx
  wifi: cfg80211: remove links only on AP
  wifi: mac80211: take lock before setting vif links
  wifi: cfg80211: fix link del callback to call correct handler
  wifi: mac80211: fix link activation settings order
  wifi: cfg80211: fix double lock bug in reg_wdev_chan_valid()
====================

Link: https://lore.kernel.org/r/20230614075502.11765-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14 21:28:59 -07:00
Edwin Peer
fa0e21fa44 rtnetlink: extend RTEXT_FILTER_SKIP_STATS to IFLA_VF_INFO
This filter already exists for excluding IPv6 SNMP stats. Extend its
definition to also exclude IFLA_VF_INFO stats in RTM_GETLINK.

This patch constitutes a partial fix for a netlink attribute nesting
overflow bug in IFLA_VFINFO_LIST. By excluding the stats when the
requester doesn't need them, the truncation of the VF list is avoided.

While it was technically only the stats added in commit c5a9f6f0ab
("net/core: Add drop counters to VF statistics") breaking the camel's
back, the appreciable size of the stats data should never have been
included without due consideration for the maximum number of VFs
supported by PCI.

Fixes: 3b766cd832 ("net/core: Add reading VF statistics through the PF netdevice")
Fixes: c5a9f6f0ab ("net/core: Add drop counters to VF statistics")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Cc: Edwin Peer <espeer@gmail.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20230611105108.122586-1-gal@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14 13:28:26 +02:00
Azeem Shaikh
f3c21ed9ce wifi: mac80211: Replace strlcpy with strscpy
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

Direct replacement is safe here since LOCAL_ASSIGN is only used by
TRACE macros and the return values are ignored.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230613003404.3538524-1-azeemshaikh38@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:32:19 +02:00
Azeem Shaikh
0ffe85885b wifi: cfg80211: replace strlcpy() with strscpy()
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

Direct replacement is safe here since WIPHY_ASSIGN is only used by
TRACE macros and the return values are ignored.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230612232301.2572316-1-azeemshaikh38@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:32:19 +02:00
Ilan Peer
4cacadc0db wifi: mac80211: Fix permissions for valid_links debugfs entry
The entry should be a read only one and not a write only one. Fix it.

Fixes: 3d90110292 ("wifi: mac80211: implement link switching")
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230611121219.c75316990411.I1565a7fcba8a37f83efffb0cc6b71c572b896e94@changeid
[remove x16 change since it doesn't work yet]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:31:55 +02:00
Ilan Peer
6cf963edbb wifi: cfg80211: Support association to AP MLD with disabled links
An AP part of an AP MLD might be temporarily disabled, and might be
enabled later. Such a link should be included in the association
exchange, but should not be used until enabled.

Extend the NL80211_CMD_ASSOCIATE to also indicate disabled links.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.c4c61ee4c4a5.I784ef4a0d619fc9120514b5615458fbef3b3684a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:21:17 +02:00
Ilan Peer
f1871abd27 wifi: mac80211: Add getter functions for vif MLD state
As a preparation to support disabled/dormant links, add the
following function:

- ieee80211_vif_usable_links(): returns the bitmap of the links
  that can be activated. Use this function in all the places that
  the bitmap of the usable links is needed.

- ieee80211_vif_is_mld(): returns true iff the vif is an MLD.
  Use this function in all the places where an indication that the
  connection is a MLD is needed.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.86e3351da1fc.If6fe3a339fda2019f13f57ff768ecffb711b710a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Miri Korenblit
bc1be54d7e wifi: mac80211: allow disabling SMPS debugfs controls
There are cases in which we don't want the user to override the
smps mode, e.g. when SMPS should be disabled due to EMLSR. Add
a driver flag to disable SMPS overriding and don't override if
it is set.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.ef129e80556c.I74a298fdc86b87074c95228d3916739de1400597@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Johannes Berg
0e966d9a35 wifi: mac80211: don't update rx_stats.last_rate for NDP
If we get an NDP (null data packet), there's reason to
believe the peer is just sending it to probe, and that
would happen at a low rate. Don't track this packet for
purposes of last RX rate reporting.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.8af46c4ac094.I13d9d5019addeaa4aff3c8a05f56c9f5a86b1ebd@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Benjamin Berg
556f16b834 wifi: mac80211: fix CSA processing while scanning
The channel switch parsing code would simply return if a scan is
in-progress. Supposedly, this was because channel switch announcements
from other APs should be ignored.

For the beacon case, the function is already only called if we are
associated with the sender. For the action frame cases, add the
appropriate check whether the frame is coming from the AP we are
associated with. Finally, drop the scanning check from
ieee80211_sta_process_chanswitch.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.3366e9302468.I6c7e0b58c33b7fb4c675374cfe8c3a5cddcec416@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Johannes Berg
b580a372b8 wifi: mac80211: mlme: clarify WMM messages
These messages apply to a single link only, use link_info()
to indicate that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.21a6bece4313.I08118e5e851fae2f9e43f8a58d3b6217709bf578@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Anjaneyulu
c6968d4fc9 wifi: mac80211: pass roc->sdata to drv_cancel_remain_on_channel()
In suspend flow "sdata" is NULL, destroy all roc's which are started.
pass "roc->sdata" to drv_cancel_remain_on_channel() to avoid NULL
dereference and destroy that roc

Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.c678187a308c.Ic11578778655e273931efc5355d570a16465d1be@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:08 +02:00
Johannes Berg
4c2d68f798 wifi: mac80211: include key action/command in tracing
We trace the key information and all, but not whether the key
is added or removed - add that information.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.546e86e216df.Ie3bf9009926f8fa154dde52b0c02537ff7edae36@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 12:20:07 +02:00
Johannes Berg
1ec7291e24 wifi: mac80211: add helpers to access sband iftype data
There's quite a bit of code accessing sband iftype data
(HE, HE 6 GHz, EHT) and we always need to remember to use
the ieee80211_vif_type_p2p() helper. Add new helpers to
directly get it from the sband/vif rather than having to
call ieee80211_vif_type_p2p().

Convert most code with the following spatch:

    @@
    expression vif, sband;
    @@
    -ieee80211_get_he_iftype_cap(sband, ieee80211_vif_type_p2p(vif))
    +ieee80211_get_he_iftype_cap_vif(sband, vif)

    @@
    expression vif, sband;
    @@
    -ieee80211_get_eht_iftype_cap(sband, ieee80211_vif_type_p2p(vif))
    +ieee80211_get_eht_iftype_cap_vif(sband, vif)

    @@
    expression vif, sband;
    @@
    -ieee80211_get_he_6ghz_capa(sband, ieee80211_vif_type_p2p(vif))
    +ieee80211_get_he_6ghz_capa_vif(sband, vif)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230604120651.db099f49e764.Ie892966c49e22c7b7ee1073bc684f142debfdc84@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 11:57:29 +02:00
Gilad Itzkovitch
2ad66fcb2f wifi: cfg80211: S1G rate information and calculations
Increase the size of S1G rate_info flags to support S1G and add
flags for new S1G MCS and the supported bandwidths. Also, include
S1G rate information to netlink STA rate message. Lastly, add
rate calculation function for S1G MCS.

Signed-off-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com>
Link: https://lore.kernel.org/r/20230518000723.991912-1-gilad.itzkovitch@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-14 11:57:26 +02:00
Peilin Ye
84ad0af0bc net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized
in ingress_init() to point to net_device::miniq_ingress.  ingress Qdiscs
access this per-net_device pointer in mini_qdisc_pair_swap().  Similar
for clsact Qdiscs and miniq_egress.

Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER
requests (thanks Hillf Danton for the hint), when replacing ingress or
clsact Qdiscs, for example, the old Qdisc ("@old") could access the same
miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"),
causing race conditions [1] including a use-after-free bug in
mini_qdisc_pair_swap() reported by syzbot:

 BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901
...
 Call Trace:
  <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
  print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319
  print_report mm/kasan/report.c:430 [inline]
  kasan_report+0x11c/0x130 mm/kasan/report.c:536
  mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
  tcf_chain_head_change_item net/sched/cls_api.c:495 [inline]
  tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509
  tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline]
  tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline]
  tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266
...

@old and @new should not affect each other.  In other words, @old should
never modify miniq_{in,e}gress after @new, and @new should not update
@old's RCU state.

Fixing without changing sch_api.c turned out to be difficult (please
refer to Closes: for discussions).  Instead, make sure @new's first call
always happen after @old's last call (in {ingress,clsact}_destroy()) has
finished:

In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests,
and call qdisc_destroy() for @old before grafting @new.

Introduce qdisc_refcount_dec_if_one() as the counterpart of
qdisc_refcount_inc_nz() used for filter requests.  Introduce a
non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check,
just like qdisc_put() etc.

Depends on patch "net/sched: Refactor qdisc_graft() for ingress and
clsact Qdiscs".

[1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under
TC_H_ROOT (no longer possible after commit c7cfbd1150 ("net/sched:
sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8
transmission queues:

  Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2),
  then adds a flower filter X to A.

  Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and
  b2) to replace A, then adds a flower filter Y to B.

 Thread 1               A's refcnt   Thread 2
  RTM_NEWQDISC (A, RTNL-locked)
   qdisc_create(A)               1
   qdisc_graft(A)                9

  RTM_NEWTFILTER (X, RTNL-unlocked)
   __tcf_qdisc_find(A)          10
   tcf_chain0_head_change(A)
   mini_qdisc_pair_swap(A) (1st)
            |
            |                         RTM_NEWQDISC (B, RTNL-locked)
         RCU sync                2     qdisc_graft(B)
            |                    1     notify_and_destroy(A)
            |
   tcf_block_release(A)          0    RTM_NEWTFILTER (Y, RTNL-unlocked)
   qdisc_destroy(A)                    tcf_chain0_head_change(B)
   tcf_chain0_head_change_cb_del(A)    mini_qdisc_pair_swap(B) (2nd)
   mini_qdisc_pair_swap(A) (3rd)                |
           ...                                 ...

Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to
its mini Qdisc, b1.  Then, A calls mini_qdisc_pair_swap() again during
ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress
packets on eth0 will not find filter Y in sch_handle_ingress().

This is just one of the possible consequences of concurrently accessing
miniq_{in,e}gress pointers.

Fixes: 7a096d579e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops")
Fixes: 87f373921c ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Cc: Hillf Danton <hdanton@sina.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14 10:31:39 +02:00
Peilin Ye
2d5f6a8d7a net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs
Grafting ingress and clsact Qdiscs does not need a for-loop in
qdisc_graft().  Refactor it.  No functional changes intended.

Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14 10:31:39 +02:00
Paul Blakey
41f2c7c342 net/sched: act_ct: Fix promotion of offloaded unreplied tuple
Currently UNREPLIED and UNASSURED connections are added to the nf flow
table. This causes the following connection packets to be processed
by the flow table which then skips conntrack_in(), and thus such the
connections will remain UNREPLIED and UNASSURED even if reply traffic
is then seen. Even still, the unoffloaded reply packets are the ones
triggering hardware update from new to established state, and if
there aren't any to triger an update and/or previous update was
missed, hardware can get out of sync with sw and still mark
packets as new.

Fix the above by:
1) Not skipping conntrack_in() for UNASSURED packets, but still
   refresh for hardware, as before the cited patch.
2) Try and force a refresh by reply-direction packets that update
   the hardware rules from new to established state.
3) Remove any bidirectional flows that didn't failed to update in
   hardware for re-insertion as bidrectional once any new packet
   arrives.

Fixes: 6a9bad0069 ("net/sched: act_ct: offload UDP NEW connections")
Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14 09:56:50 +02:00
Justin Chen
2bddad9ec6 ethtool: ioctl: account for sopass diff in set_wol
sopass won't be set if wolopt doesn't change. This means the following
will fail to set the correct sopass.
ethtool -s eth0 wol s sopass 11:22:33:44:55:66
ethtool -s eth0 wol s sopass 22:44:55:66:77:88

Make sure we call into the driver layer set_wol if sopass is different.

Fixes: 55b24334c0 ("ethtool: ioctl: improve error checking for set_wol")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Link: https://lore.kernel.org/r/1686605822-34544-1-git-send-email-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13 22:05:52 -07:00
David Howells
c31a25e1db kcm: Send multiple frags in one sendmsg()
Rewrite the AF_KCM transmission loop to send all the fragments in a single
skb or frag_list-skb in one sendmsg() with MSG_SPLICE_PAGES set.  The list
of fragments in each skb is conveniently a bio_vec[] that can just be
attached to a BVEC iter.

Note: I'm working out the size of each fragment-skb by adding up bv_len for
all the bio_vecs in skb->frags[] - but surely this information is recorded
somewhere?  For the skbs in head->frag_list, this is equal to
skb->data_len, but not for the head.  head->data_len includes all the tail
frags too.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 21:13:23 -07:00
David Howells
264ba53fac kcm: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
When transmitting data, call down into the transport socket using sendmsg
with MSG_SPLICE_PAGES to indicate that content should be spliced rather
than using sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 21:13:23 -07:00
David Howells
de17c68573 tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES)
Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(MSG_SPLICE_PAGES)
rather than a loop calling tcp_sendpage().  sendpage() will be removed in
the future.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Sitnicki <jakub@cloudflare.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 21:13:23 -07:00
David Howells
5df5dd03a8 sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
When transmitting data, call down into TCP using sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing sendpage calls to transmit header, data pages and trailer.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna@kernel.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 21:13:23 -07:00
Zahari Doychev
7cfffd5fed net: flower: add support for matching cfm fields
Add support to the tc flower classifier to match based on fields in CFM
information elements like level and opcode.

tc filter add dev ens6 ingress protocol 802.1q \
	flower vlan_id 698 vlan_ethtype 0x8902 cfm mdl 5 op 46 \
	action drop

Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 17:01:45 -07:00
Zahari Doychev
d7ad70b5ef net: flow_dissector: add support for cfm packets
Add support for dissecting cfm packets. The cfm packet header
fields maintenance domain level and opcode can be dissected.

Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12 17:01:45 -07:00
Chuck Lever
c4b50cdf9d svcrdma: Revert 2a1e4f21d8 ("svcrdma: Normalize Send page handling")
Get rid of the completion wait in svc_rdma_sendto(), and release
pages in the send completion handler again. A subsequent patch will
handle releasing those pages more efficiently.

Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:36 -04:00
Chuck Lever
a944209c11 SUNRPC: Revert 579900670a ("svcrdma: Remove unused sc_pages field")
Pre-requisite for releasing pages in the send completion handler.
Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:36 -04:00
Chuck Lever
6be7afcd92 SUNRPC: Revert cc93ce9529 ("svcrdma: Retain the page backing rq_res.head[0].iov_base")
Pre-requisite for releasing pages in the send completion handler.
Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:35 -04:00
Chuck Lever
ac3c32bbdb svcrdma: Clean up allocation of svc_rdma_rw_ctxt
The physical device's favored NUMA node ID is available when
allocating a rw_ctxt. Use that value instead of relying on the
assumption that the memory allocation happens to be running on a
node close to the device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:35 -04:00
Chuck Lever
ed51b42610 svcrdma: Clean up allocation of svc_rdma_send_ctxt
The physical device's favored NUMA node ID is available when
allocating a send_ctxt. Use that value instead of relying on the
assumption that the memory allocation happens to be running on a
node close to the device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:35 -04:00
Chuck Lever
c5d68d25bd svcrdma: Clean up allocation of svc_rdma_recv_ctxt
The physical device's favored NUMA node ID is available when
allocating a recv_ctxt. Use that value instead of relying on the
assumption that the memory allocation happens to be running on a
node close to the device.

This clean up eliminates the hack of destroying recv_ctxts that
were not created by the receive CQ thread -- recv_ctxts are now
always allocated on a "good" node.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:35 -04:00
Chuck Lever
fe2b401e55 svcrdma: Allocate new transports on device's NUMA node
The physical device's NUMA node ID is available when allocating an
svc_xprt for an incoming connection. Use that value to ensure the
svc_xprt structure is allocated on the NUMA node closest to the
device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-12 12:16:34 -04:00
Eric Dumazet
5882efff88 tcp: remove size parameter from tcp_stream_alloc_skb()
Now all tcp_stream_alloc_skb() callers pass @size == 0, we can
remove this parameter.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 11:38:54 +01:00
Eric Dumazet
b4a2439713 tcp: remove some dead code
Now all skbs in write queue do not contain any payload in skb->head,
we can remove some dead code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 11:38:54 +01:00
Eric Dumazet
fbf934068f tcp: let tcp_send_syn_data() build headless packets
tcp_send_syn_data() is the last component in TCP transmit
path to put payload in skb->head.

Switch it to use page frags, so that we can remove dead
code later.

This allows to put more payload than previous implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 11:38:54 +01:00
Jakub Kicinski
500e1340d1 net: ethtool: don't require empty header nests
Ethtool currently requires a header nest (which is used to carry
the common family options) in all requests including dumps.

  $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
  lib.ynl.NlError: Netlink error: Invalid argument
  nl_len = 64 (48) nl_flags = 0x300 nl_type = 2
	error: -22      extack: {'msg': 'request header missing'}

  $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get \
           --json '{"header":{}}';  )
  [{'combined-count': 1,
    'combined-max': 1,
    'header': {'dev-index': 2, 'dev-name': 'enp1s0'}}]

Requiring the header nest to always be there may seem nice
from the consistency perspective, but it's not serving any
practical purpose. We shouldn't burden the user like this.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 11:32:45 +01:00
Jakub Kicinski
5ab8c41cef netlink: support extack in dump ->start()
Commit 4a19edb60d ("netlink: Pass extack to dump handlers")
added extack support to netlink dumps. It was focused on rtnl
and since rtnl does not use ->start(), ->done() callbacks
it ignored those. Genetlink on the other hand uses ->start()
extensively, for parsing and input validation.

Pass the extact in via struct netlink_dump_control and link
it to cb for the time of ->start(). Both struct netlink_dump_control
and extack itself live on the stack so we can't keep the same
extack for the duration of the dump. This means that the extack
visible in ->start() and each ->dump() callbacks will be different.
Corner cases like reporting a warning message in DONE across dump
calls are still not supported.

We could put the extack (for dumps) in the socket struct,
but layering makes it slightly awkward (extack pointer is decided
before the DO / DUMP split).

The genetlink dump error extacks are now surfaced:

  $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
  lib.ynl.NlError: Netlink error: Invalid argument
  nl_len = 64 (48) nl_flags = 0x300 nl_type = 2
	error: -22	extack: {'msg': 'request header missing'}

Previously extack was missing:

  $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
  lib.ynl.NlError: Netlink error: Invalid argument
  nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
	error: -22

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 11:32:44 +01:00
Alexander Mikhalitsyn
97154bcf4d af_unix: Kconfig: make CONFIG_UNIX bool
Let's make CONFIG_UNIX a bool instead of a tristate.
We've decided to do that during discussion about SCM_PIDFD patchset [1].

[1] https://lore.kernel.org/lkml/20230524081933.44dc8bea@kernel.org/

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Cc: Luca Boccassi <bluca@debian.org>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 10:45:50 +01:00
Alexander Mikhalitsyn
7b26952a91 net: core: add getsockopt SO_PEERPIDFD
Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
This thing is direct analog of SO_PEERCRED which allows to get plain PID.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Cc: Luca Boccassi <bluca@debian.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Tested-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 10:45:50 +01:00
Alexander Mikhalitsyn
5e2ff6704a scm: add SO_PASSPIDFD and SCM_PIDFD
Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
but it contains pidfd instead of plain pid, which allows programmers not
to care about PID reuse problem.

We mask SO_PASSPIDFD feature if CONFIG_UNIX is not builtin because
it depends on a pidfd_prepare() API which is not exported to the kernel
modules.

Idea comes from UAPI kernel group:
https://uapi-group.org/kernel-features/

Big thanks to Christian Brauner and Lennart Poettering for productive
discussions about this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Cc: Luca Boccassi <bluca@debian.org>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Tested-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 10:45:49 +01:00
Aaron Conole
e069ba07e6 net: openvswitch: add support for l4 symmetric hashing
Since its introduction, the ovs module execute_hash action allowed
hash algorithms other than the skb->l4_hash to be used.  However,
additional hash algorithms were not implemented.  This means flows
requiring different hash distributions weren't able to use the
kernel datapath.

Now, introduce support for symmetric hashing algorithm as an
alternative hash supported by the ovs module using the flow
dissector.

Output of flow using l4_sym hash:

    recirc_id(0),in_port(3),eth(),eth_type(0x0800),
    ipv4(dst=64.0.0.0/192.0.0.0,proto=6,frag=no), packets:30473425,
    bytes:45902883702, used:0.000s, flags:SP.,
    actions:hash(sym_l4(0)),recirc(0xd)

Some performance testing with no GRO/GSO, two veths, single flow:

    hash(l4(0)):      4.35 GBits/s
    hash(l4_sym(0)):  4.24 GBits/s

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 09:46:30 +01:00
Vladimir Oltean
2b84960fc5 net/sched: taprio: report class offload stats per TXQ, not per TC
The taprio Qdisc creates child classes per netdev TX queue, but
taprio_dump_class_stats() currently reports offload statistics per
traffic class. Traffic classes are groups of TXQs sharing the same
dequeue priority, so this is incorrect and we shouldn't be bundling up
the TXQ stats when reporting them, as we currently do in enetc.

Modify the API from taprio to drivers such that they report TXQ offload
stats and not TC offload stats.

There is no change in the UAPI or in the global Qdisc stats.

Fixes: 6c1adb650c ("net/sched: taprio: add netlink reporting for offload statistics counters")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 09:43:30 +01:00
Herbert Xu
842665a900 xfrm: Use xfrm_state selector for BEET input
For BEET the inner address and therefore family is stored in the
xfrm_state selector.  Use that when decapsulating an input packet
instead of incorrectly relying on a non-existent tunnel protocol.

Fixes: 5f24f41e8e ("xfrm: Remove inner/outer modes from input path")
Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2023-06-12 10:36:48 +02:00
Dan Carpenter
75e6def3b2 sctp: fix an error code in sctp_sf_eat_auth()
The sctp_sf_eat_auth() function is supposed to enum sctp_disposition
values and returning a kernel error code will cause issues in the
caller.  Change -ENOMEM to SCTP_DISPOSITION_NOMEM.

Fixes: 65b07e5d0d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 09:36:27 +01:00
Dan Carpenter
a0067dfcd9 sctp: handle invalid error codes without calling BUG()
The sctp_sf_eat_auth() function is supposed to return enum sctp_disposition
values but if the call to sctp_ulpevent_make_authkey() fails, it returns
-ENOMEM.

This results in calling BUG() inside the sctp_side_effects() function.
Calling BUG() is an over reaction and not helpful.  Call WARN_ON_ONCE()
instead.

This code predates git.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 09:36:27 +01:00
Jiapeng Chong
26e35370b9 net/sched: act_pedit: Use kmemdup() to replace kmalloc + memcpy
./net/sched/act_pedit.c:245:21-28: WARNING opportunity for kmemdup.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5478
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12 09:31:09 +01:00
Benjamin Berg
d094482c99 wifi: mac80211: fragment per STA profile correctly
When fragmenting the ML per STA profile, the element ID should be
IEEE80211_MLE_SUBELEM_PER_STA_PROFILE rather than WLAN_EID_FRAGMENT.

Change the helper function to take the to be used element ID and pass
the appropriate value for each of the fragmentation levels.

Fixes: 81151ce462 ("wifi: mac80211: support MLO authentication/association with one link")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230611121219.9b5c793d904b.I7dad952bea8e555e2f3139fbd415d0cd2b3a08c3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-12 09:52:52 +02:00
Chuck Lever
703d752155 NFSD: Hoist rq_vec preparation into nfsd_read() [step two]
Now that the preparation of an rq_vec has been removed from the
generic read path, nfsd_splice_read() no longer needs to reset
rq_next_page.

nfsd4_encode_read() calls nfsd_splice_read() directly. As far as I
can ascertain, resetting rq_next_page for NFSv4 splice reads is
unnecessary because rq_next_page is already set correctly.

Moreover, resetting it might even be incorrect if previous
operations in the COMPOUND have already consumed at least a page of
the send buffer. I would expect that the result would be encoding
the READ payload over previously-encoded results.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-11 16:37:46 -04:00
David S. Miller
65d8bd81aa netfilter pull request 23-06-08
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmSCMbUACgkQ1V2XiooU
 IOR4Vw//UPJALibmSrS4+yN7nHM3Bwy40ZVCYZkskFuotDU4pkVO6/GKZKZAF9ZM
 jgYXb5in61lnatyLlolZeg2lbZM4XmKxi4hZa7hXy2oHvp7p/CkIHri2Sn/dSTT1
 fv+/LYeQFc6w3o+z7uUMWG+WruDLyFREAEw5A2vbVLOAvLCsjPYxSWOHOBOuMgg/
 dpRMlcgkBDRycy2a9wVhCSDxB/pwv1ksk5Ev8TIC+ACOI4WCauzFvPlO4IrbjiGK
 GG4avni4q2R1ia9EbVE4OZiaFYOQyABS8XbJX32vcMN2BVoRCnmacJNNo9Q0jpM8
 rRjThsDsKPQARIEB0/HAccO++afrtQWZPQehUNskmd/d3g79tk+3k/hgrP64anz9
 C7GuWTT40Xaq7nccsrrOPfHHODYM5IZilw6kVgkmfFUQ0m0mD1A79Mo4XCRxpOAG
 adMI9Zy+1tGPfZVTvb1WxMEyk52agYfUc9obxomiTIdK8/5HEK3c5Z2oMOGiTjH/
 5Msj6VNTGRjhfFXKiLLk/cb0nhWbuAORwTHahvFLqYa2m1/Gpf+Kvt4jkqCjtmy+
 TfOlYPfEo1+qVRSWy8pcGKhMX5H6oYXwff1s4Bj0ycoxVIIJF3qeIdKSP64AGxh+
 yahIw0GfvBDA3ZuSYcHIYeEuAP7yFRwL6lX4P/Rz5PNxsj+2YxY=
 =H0/i
 -----END PGP SIGNATURE-----

Merge tag 'nf-23-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

netfilter pull request 23-06-08

Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes for net:

1) Add commit and abort set operation to pipapo set abort path.

2) Bail out immediately in case of ENOMEM in nfnetlink batch.

3) Incorrect error path handling when creating a new rule leads to
   dangling pointer in set transaction list.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10 19:57:03 +01:00
Dmitry Mastykin
b403643d15 netlabel: fix shift wrapping bug in netlbl_catmap_setlong()
There is a shift wrapping bug in this code on 32-bit architectures.
NETLBL_CATMAP_MAPTYPE is u64, bitmap is unsigned long.
Every second 32-bit word of catmap becomes corrupted.

Signed-off-by: Dmitry Mastykin <dmastykin@astralinux.ru>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10 19:54:06 +01:00
Eric Dumazet
d457a0e329 net: move gso declarations and functions to their own files
Move declarations into include/net/gso.h and code into net/core/gso.c

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:11:41 -07:00
Geliang Tang
6ba7ce8990 mptcp: unify pm set_flags interfaces
This patch unifies the three PM set_flags() interfaces:

mptcp_pm_nl_set_flags() in mptcp/pm_netlink.c for the in-kernel PM and
mptcp_userspace_pm_set_flags() in mptcp/pm_userspace.c for the
userspace PM.

They'll be switched in the common PM infterface mptcp_pm_set_flags() in
mptcp/pm.c based on whether token is NULL or not.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
f40be0db0b mptcp: unify pm get_flags_and_ifindex_by_id
This patch unifies the three PM get_flags_and_ifindex_by_id() interfaces:

mptcp_pm_nl_get_flags_and_ifindex_by_id() in mptcp/pm_netlink.c for the
in-kernel PM and mptcp_userspace_pm_get_flags_and_ifindex_by_id() in
mptcp/pm_userspace.c for the userspace PM.

They'll be switched in the common PM infterface
mptcp_pm_get_flags_and_ifindex_by_id() in mptcp/pm.c based on whether
mptcp_pm_is_userspace() or not.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
9bbec87ecf mptcp: unify pm get_local_id interfaces
This patch unifies the three PM get_local_id() interfaces:

mptcp_pm_nl_get_local_id() in mptcp/pm_netlink.c for the in-kernel PM and
mptcp_userspace_pm_get_local_id() in mptcp/pm_userspace.c for the
userspace PM.

They'll be switched in the common PM infterface mptcp_pm_get_local_id()
in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not.

Also put together the declarations of these three functions in protocol.h.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Geliang Tang
dc886bce75 mptcp: export local_address
Rename local_address() with "mptcp_" prefix and export it in protocol.h.

This function will be re-used in the common PM code (pm.c) in the
following commit.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10 00:05:59 -07:00
Jakub Kicinski
cde11936cf wireless-next patches for v6.5
The second pull request for v6.5. We have support for three new
 Realtek chipsets, all from different generations. Shows how active
 Realtek development is right now, even older generations are being
 worked on.
 
 Note: We merged wireless into wireless-next to avoid complex conflicts
 between the trees.
 
 Major changes:
 
 rtl8xxxu
 
 * RTL8192FU support
 
 rtw89
 
 * RTL8851BE support
 
 rtw88
 
 * RTL8723DS support
 
 ath11k
 
 * Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
   Advertisement (EMA) support in AP mode
 
 iwlwifi
 
 * support for segmented PNVM images and power tables
 
 * new vendor entries for PPAG (platform antenna gain) feature
 
 cfg80211/mac80211
 
 * more Multi-Link Operation (MLO) support such as hardware restart
 
 * fixes for a potential work/mutex deadlock and with it beginnings of
   the previously discussed locking simplifications
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmSDHDERHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZsMDgf/YWmT/b6W4pYRltHRPYksxKfxASGf6mvZ
 UXo5vEehCF4eY2vo4MXTknVJOjrvfbqK5uvWnUfqIYBMgCThBAlG1dz+4Ziiwtw5
 QrSKG2rTQxRlI8ecbfe1Kd1TXr9bSvBAtoNAmElgxcRqG1GbiSqUBvcUox63AYuB
 yozONCvdAOsrPJTcCX+ThP+ABayAonEZf7YeISIRU6q3QMkbVksHAGAe3TnIAPJq
 8l5fQQmHwC0CxActOT2fH8WcJ3dyl5OTqjziRfjrh9T8QUJRp9Yl7zJhpYiBT116
 aCVg/NRxAFT+RkKWdQ6jRHMVOAtzL2fe6fuVW0sk7e1/s5uzyyBxJw==
 =hq6p
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.5

The second pull request for v6.5. We have support for three new
Realtek chipsets, all from different generations. Shows how active
Realtek development is right now, even older generations are being
worked on.

Note: We merged wireless into wireless-next to avoid complex conflicts
between the trees.

Major changes:

rtl8xxxu
 - RTL8192FU support

rtw89
 - RTL8851BE support

rtw88
 - RTL8723DS support

ath11k
 - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
   Advertisement (EMA) support in AP mode

iwlwifi
 - support for segmented PNVM images and power tables
 - new vendor entries for PPAG (platform antenna gain) feature

cfg80211/mac80211
 - more Multi-Link Operation (MLO) support such as hardware restart
 - fixes for a potential work/mutex deadlock and with it beginnings of
   the previously discussed locking simplifications

* tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (162 commits)
  wifi: rtlwifi: remove misused flag from HAL data
  wifi: rtlwifi: remove unused dualmac control leftovers
  wifi: rtlwifi: remove unused timer and related code
  wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown
  wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled
  wifi: brcmfmac: Detect corner error case earlier with log
  wifi: rtw89: 8852c: update RF radio A/B parameters to R63
  wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (3 of 3)
  wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (2 of 3)
  wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (1 of 3)
  wifi: rtw89: process regulatory for 6 GHz power type
  wifi: rtw89: regd: update regulatory map to R64-R40
  wifi: rtw89: regd: judge 6 GHz according to chip and BIOS
  wifi: rtw89: refine clearing supported bands to check 2/5 GHz first
  wifi: rtw89: 8851b: configure CRASH_TRIGGER feature for 8851B
  wifi: rtw89: set TX power without precondition during setting channel
  wifi: rtw89: debug: txpwr table access only valid page according to chip
  wifi: rtw89: 8851b: enable hw_scan support
  wifi: cfg80211: move scan done work to wiphy work
  wifi: cfg80211: move sched scan stop to wiphy work
  ...
====================

Link: https://lore.kernel.org/r/87bkhohkbg.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 23:26:56 -07:00
Lorenzo Stoakes
4c630f3074 mm/gup: remove vmas parameter from pin_user_pages()
We are now in a position where no caller of pin_user_pages() requires the
vmas parameter at all, so eliminate this parameter from the function and
all callers.

This clears the way to removing the vmas parameter from GUP altogether.

Link: https://lkml.kernel.org/r/195a99ae949c9f5cb589d2222b736ced96ec199a.1684350871.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>	[qib]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>	[drivers/media]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:26 -07:00
Ilan Peer
7b3b9ac899 wifi: mac80211: Use active_links instead of valid_links in Tx
Fix few places on the Tx path where the valid_links were used instead
of active links.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.e24832691fc8.I9ac10dc246d7798a8d26b1a94933df5668df63fc@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:31:08 +02:00
Johannes Berg
34d4e3eb67 wifi: cfg80211: remove links only on AP
Since links are only controlled by userspace via cfg80211
in AP mode, also only remove them from the driver in that
case.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.ed65b94916fa.I2458c46888284cc5ce30715fe642bc5fc4340c8f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:30:53 +02:00
Benjamin Berg
15846f95ab wifi: mac80211: take lock before setting vif links
ieee80211_vif_set_links requires the sdata->local->mtx lock to be held.
Add the appropriate locking around the calls in both the link add and
remove handlers.

This causes a warning when e.g. ieee80211_link_release_channel is called
via ieee80211_link_stop from ieee80211_vif_update_links.

Fixes: 0d8c4a3c86 ("wifi: mac80211: implement add/del interface link callbacks")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.fa0c6597fdad.I83dd70359f6cda30f86df8418d929c2064cf4995@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:30:32 +02:00
Benjamin Berg
1ff56684fa wifi: cfg80211: fix link del callback to call correct handler
The wrapper function was incorrectly calling the add handler instead of
the del handler. This had no negative side effect as the default
handlers are essentially identical.

Fixes: f2a0290b2d ("wifi: cfg80211: add optional link add/remove callbacks")
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.ebd00e000459.Iaff7dc8d1cdecf77f53ea47a0e5080caa36ea02a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:30:16 +02:00
Johannes Berg
01605ad6c3 wifi: mac80211: fix link activation settings order
In the normal MLME code we always call
ieee80211_mgd_set_link_qos_params() before
ieee80211_link_info_change_notify() and some drivers,
notably iwlwifi, rely on that as they don't do anything
(but store the data) in their conf_tx.

Fix the order here to be the same as in the normal code
paths, so this isn't broken.

Fixes: 3d90110292 ("wifi: mac80211: implement link switching")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230608163202.a2a86bba2f80.Iac97e04827966d22161e63bb6e201b4061e9651b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:30:03 +02:00
Dan Carpenter
996c3117da wifi: cfg80211: fix double lock bug in reg_wdev_chan_valid()
The locking was changed recently so now the caller holds the wiphy_lock()
lock.  Taking the lock inside the reg_wdev_chan_valid() function will
lead to a deadlock.

Fixes: f7e60032c6 ("wifi: cfg80211: fix locking in regulatory disconnect")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/40c4114a-6cb4-4abf-b013-300b598aba65@moroto.mountain
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-09 13:28:41 +02:00
Lee Jones
04c55383fa net/sched: cls_u32: Fix reference counter leak leading to overflow
In the event of a failure in tcf_change_indev(), u32_set_parms() will
immediately return without decrementing the recently incremented
reference counter.  If this happens enough times, the counter will
rollover and the reference freed, leading to a double free which can be
used to do 'bad things'.

In order to prevent this, move the point of possible failure above the
point where the reference counter is incremented.  Also save any
meaningful return values to be applied to the return data at the
appropriate point in time.

This issue was caught with KASAN.

Fixes: 705c709126 ("net: sched: cls_u32: no need to call tcf_exts_change for newly allocated struct")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 11:40:17 +01:00
Zhengchao Shao
be3618d965 net/sched: taprio: fix slab-out-of-bounds Read in taprio_dequeue_from_txq
As shown in [1], out-of-bounds access occurs in two cases:
1)when the qdisc of the taprio type is used to replace the previously
configured taprio, count and offset in tc_to_txq can be set to 0. In this
case, the value of *txq in taprio_next_tc_txq() will increases
continuously. When the number of accessed queues exceeds the number of
queues on the device, out-of-bounds access occurs.
2)When packets are dequeued, taprio can be deleted. In this case, the tc
rule of dev is cleared. The count and offset values are also set to 0. In
this case, out-of-bounds access is also caused.

Now the restriction on the queue number is added.

[1] https://groups.google.com/g/syzkaller-bugs/c/_lYOKgkBVMg
Fixes: 2f530df76c ("net/sched: taprio: give higher priority to higher TCs in software dequeue mode")
Reported-by: syzbot+04afcb3d2c840447559a@syzkaller.appspotmail.com
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Tested-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:48:14 +01:00
Max Tottenham
6c02568fd1 net/sched: act_pedit: Parse L3 Header for L4 offset
Instead of relying on skb->transport_header being set correctly, opt
instead to parse the L3 header length out of the L3 headers for both
IPv4/IPv6 when the Extended Layer Op for tcp/udp is used. This fixes a
bug if GRO is disabled, when GRO is disabled skb->transport_header is
set by __netif_receive_skb_core() to point to the L3 header, it's later
fixed by the upper protocol layers, but act_pedit will receive the SKB
before the fixups are completed. The existing behavior causes the
following to edit the L3 header if GRO is disabled instead of the UDP
header:

    tc filter add dev eth0 ingress protocol ip flower ip_proto udp \
 dst_ip 192.168.1.3 action pedit ex munge udp set dport 18053

Also re-introduce a rate-limited warning if we were unable to extract
the header offset when using the 'ex' interface.

Fixes: 71d0ed7079 ("net/act_pedit: Support using offset relative to
the conventional network headers")
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Reviewed-by: Josh Hunt <johunt@akamai.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202305261541.N165u9TZ-lkp@intel.com/
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:34:27 +01:00
Ivan Mikhaylov
790071347a net/ncsi: change from ndo_set_mac_address to dev_set_mac_address
Change ndo_set_mac_address to dev_set_mac_address because
dev_set_mac_address provides a way to notify network layer about MAC
change. In other case, services may not aware about MAC change and keep
using old one which set from network adapter driver.

As example, DHCP client from systemd do not update MAC address without
notification from net subsystem which leads to the problem with acquiring
the right address from DHCP server.

Fixes: cb10c7c0df ("net/ncsi: Add NCSI Broadcom OEM command")
Cc: stable@vger.kernel.org # v6.0+ 2f38e84 net/ncsi: make one oem_gma function for all mfr id
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:32:51 +01:00
Ivan Mikhaylov
74b449b98d net/ncsi: make one oem_gma function for all mfr id
Make the one Get Mac Address function for all manufacturers and change
this call in handlers accordingly.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:32:51 +01:00
Maciej Żenczykowski
1166a530a8 xfrm: fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets
Before Linux v5.8 an AF_INET6 SOCK_DGRAM (udp/udplite) socket
with SOL_UDP, UDP_ENCAP, UDP_ENCAP_ESPINUDP{,_NON_IKE} enabled
would just unconditionally use xfrm4_udp_encap_rcv(), afterwards
such a socket would use the newly added xfrm6_udp_encap_rcv()
which only handles IPv6 packets.

Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Benedict Wong <benedictwong@google.com>
Cc: Yan Yan <evitayan@google.com>
Fixes: 0146dca70b ("xfrm: add support for UDPv6 encapsulation of ESP")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2023-06-09 08:16:34 +02:00
David Howells
3dc8976c7a tls/device: Convert tls_device_sendpage() to use MSG_SPLICE_PAGES
Convert tls_device_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself.  With that, the tls_iter_offset
union is no longer necessary and can be replaced with an iov_iter pointer
and the zc_page argument to tls_push_data() can also be removed.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
24763c9c09 tls/device: Support MSG_SPLICE_PAGES
Make TLS's device sendmsg() support MSG_SPLICE_PAGES.  This causes pages to
be spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
45e5be844a tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES
Convert tls_sw_sendpage() and tls_sw_sendpage_locked() to use sendmsg()
with MSG_SPLICE_PAGES rather than directly splicing in the pages itself.

[!] Note that tls_sw_sendpage_locked() appears to have the wrong locking
    upstream.  I think the caller will only hold the socket lock, but it
    should hold tls_ctx->tx_lock too.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
fe1e81d4f7 tls/sw: Support MSG_SPLICE_PAGES
Make TLS's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
951ace9951 kcm: Use splice_eof() to flush
Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Cong Wang <cong.wang@bytedance.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
1d7e4538a5 ipv4, ipv6: Use splice_eof() to flush
Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

For UDP, a pending packet will not be emitted if the socket is closed
before it is flushed; with this change, it be flushed by ->splice_eof().

For TCP, it's not clear that MSG_MORE is actually effective.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kuniyuki Iwashima <kuniyu@amazon.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
d4c1e80b0d tls/device: Use splice_eof() to flush
Allow splice to end a TLS record after prematurely ending a splice/sendfile
due to getting an EOF condition (->splice_read() returned 0) after splice
had called TLS with a sendmsg() with MSG_MORE set when the user didn't set
MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
df720d288d tls/sw: Use splice_eof() to flush
Allow splice to end a TLS record after prematurely ending a splice/sendfile
due to getting an EOF condition (->splice_read() returned 0) after splice
had called TLS with a sendmsg() with MSG_MORE set when the user didn't set
MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
2bfc668509 splice, net: Add a splice_eof op to file-ops and socket-ops
Add an optional method, ->splice_eof(), to allow splice to indicate the
premature termination of a splice to struct file_operations and struct
proto_ops.

This is called if sendfile() or splice() encounters all of the following
conditions inside splice_direct_to_actor():

 (1) the user did not set SPLICE_F_MORE (splice only), and

 (2) an EOF condition occurred (->splice_read() returned 0), and

 (3) we haven't read enough to fulfill the request (ie. len > 0 still), and

 (4) we have already spliced at least one byte.

A further patch will modify the behaviour of SPLICE_F_MORE to always be
passed to the actor if either the user set it or we haven't yet read
sufficient data to fulfill the request.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: David Hildenbrand <david@redhat.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: linux-mm@kvack.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
2dc334f1a6 splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()
Replace generic_splice_sendpage() + splice_from_pipe + pipe_to_sendpage()
with a net-specific handler, splice_to_socket(), that calls sendmsg() with
MSG_SPLICE_PAGES set instead of calling ->sendpage().

MSG_MORE is used to indicate if the sendmsg() is expected to be followed
with more data.

This allows multiple pipe-buffer pages to be passed in a single call in a
BVEC iterator, allowing the processing to be pushed down to a loop in the
protocol driver.  This helps pave the way for passing multipage folios down
too.

Protocols that haven't been converted to handle MSG_SPLICE_PAGES yet should
just ignore it and do a normal sendmsg() for now - although that may be a
bit slower as it may copy everything.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
81840b3b91 tls: Allow MSG_SPLICE_PAGES but treat it as normal sendmsg
Allow MSG_SPLICE_PAGES to be specified to sendmsg() but treat it as normal
sendmsg for now.  This means the data will just be copied until
MSG_SPLICE_PAGES is handled.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
Eric Dumazet
736013292e tcp: let tcp_mtu_probe() build headless packets
tcp_mtu_probe() is still copying payload from skbs in the write queue,
using skb_copy_bits(), ignoring potential errors.

Modern TCP stack wants to only deal with payload found in page frags,
as this is a prereq for TCPDirect (host stack might not have access
to the payload)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230607214113.1992947-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:31:06 -07:00
Justin Chen
55b24334c0 ethtool: ioctl: improve error checking for set_wol
The netlink version of set_wol checks for not supported wolopts and avoids
setting wol when the correct wolopt is already set. If we do the same with
the ioctl version then we can remove these checks from the driver layer.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/1686179653-29750-1-git-send-email-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:24:54 -07:00
Guillaume Nault
91ffd1bae1 ping6: Fix send to link-local addresses with VRF.
Ping sockets can't send packets when they're bound to a VRF master
device and the output interface is set to a slave device.

For example, when net.ipv4.ping_group_range is properly set, so that
ping6 can use ping sockets, the following kind of commands fails:
  $ ip vrf exec red ping6 fe80::854:e7ff:fe88:4bf1%eth1

What happens is that sk->sk_bound_dev_if is set to the VRF master
device, but 'oif' is set to the real output device. Since both are set
but different, ping_v6_sendmsg() sees their value as inconsistent and
fails.

Fix this by allowing 'oif' to be a slave device of ->sk_bound_dev_if.

This fixes the following kselftest failure:
  $ ./fcnal-test.sh -t ipv6_ping
  [...]
  TEST: ping out, vrf device+address bind - ns-B IPv6 LLA        [FAIL]

Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Closes: https://lore.kernel.org/netdev/b6191f90-ffca-dbca-7d06-88a9788def9c@alu.unizg.hr/
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Fixes: 5e45789698 ("net: ipv6: Fix ping to link-local addresses.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/6c8b53108816a8d0d5705ae37bdc5a8322b5e3d9.1686153846.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 18:59:57 -07:00
Pablo Neira Ayuso
1240eb93f0 netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE
In case of error when adding a new rule that refers to an anonymous set,
deactivate expressions via NFT_TRANS_PREPARE state, not NFT_TRANS_RELEASE.
Thus, the lookup expression marks anonymous sets as inactive in the next
generation to ensure it is not reachable in this transaction anymore and
decrement the set refcount as introduced by c1592a8994 ("netfilter:
nf_tables: deactivate anonymous set from preparation phase"). The abort
step takes care of undoing the anonymous set.

This is also consistent with rule deletion, where NFT_TRANS_PREPARE is
used. Note that this error path is exercised in the preparation step of
the commit protocol. This patch replaces nf_tables_rule_release() by the
deactivate and destroy calls, this time with NFT_TRANS_PREPARE.

Due to this incorrect error handling, it is possible to access a
dangling pointer to the anonymous set that remains in the transaction
list.

[1009.379054] BUG: KASAN: use-after-free in nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379106] Read of size 8 at addr ffff88816c4c8020 by task nft-rule-add/137110
[1009.379116] CPU: 7 PID: 137110 Comm: nft-rule-add Not tainted 6.4.0-rc4+ #256
[1009.379128] Call Trace:
[1009.379132]  <TASK>
[1009.379135]  dump_stack_lvl+0x33/0x50
[1009.379146]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379191]  print_address_description.constprop.0+0x27/0x300
[1009.379201]  kasan_report+0x107/0x120
[1009.379210]  ? nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379255]  nft_set_lookup_global+0x147/0x1a0 [nf_tables]
[1009.379302]  nft_lookup_init+0xa5/0x270 [nf_tables]
[1009.379350]  nf_tables_newrule+0x698/0xe50 [nf_tables]
[1009.379397]  ? nf_tables_rule_release+0xe0/0xe0 [nf_tables]
[1009.379441]  ? kasan_unpoison+0x23/0x50
[1009.379450]  nfnetlink_rcv_batch+0x97c/0xd90 [nfnetlink]
[1009.379470]  ? nfnetlink_rcv_msg+0x480/0x480 [nfnetlink]
[1009.379485]  ? __alloc_skb+0xb8/0x1e0
[1009.379493]  ? __alloc_skb+0xb8/0x1e0
[1009.379502]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[1009.379509]  ? unwind_get_return_address+0x2a/0x40
[1009.379517]  ? write_profile+0xc0/0xc0
[1009.379524]  ? avc_lookup+0x8f/0xc0
[1009.379532]  ? __rcu_read_unlock+0x43/0x60

Fixes: 958bee14d0 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-08 21:49:26 +02:00
Jakub Kicinski
449f6bc17a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/sch_taprio.c
  d636fc5dd6 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
  dced11ef84 ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")

net/ipv4/sysctl_net_ipv4.c
  e209fee411 ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
  ccce324dab ("tcp: make the first N SYN RTO backoffs linear")
https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 11:35:14 -07:00
Linus Torvalds
25041a4c02 Networking fixes for 6.4-rc6, including fixes from can, wifi, netfilter,
bluetooth and ebpf.
 
 Current release - regressions:
 
   - bpf: sockmap: avoid potential NULL dereference in sk_psock_verdict_data_ready()
 
   - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()
 
   - phylink: actually fix ksettings_set() ethtool call
 
   - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3
 
 Current release - new code bugs:
 
   - wifi: mt76: fix possible NULL pointer dereference in mt7996_mac_write_txwi()
 
 Previous releases - regressions:
 
   - netfilter: fix NULL pointer dereference in nf_confirm_cthelper
 
   - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS
 
   - openvswitch: fix upcall counter access before allocation
 
   - bluetooth:
     - fix use-after-free in hci_remove_ltk/hci_remove_irk
     - fix l2cap_disconnect_req deadlock
 
   - nic: bnxt_en: prevent kernel panic when receiving unexpected PHC_UPDATE event
 
 Previous releases - always broken:
 
   - core: annotate rfs lockless accesses
 
   - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values
 
   - netfilter: add null check for nla_nest_start_noflag() in nft_dump_basechain_hook()
 
   - bpf: fix UAF in task local storage
 
   - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294
 
   - ipv6: rpl: fix route of death.
 
   - tcp: gso: really support BIG TCP
 
   - mptcp: fixes for user-space PM address advertisement
 
   - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT
 
   - can: avoid possible use-after-free when j1939_can_rx_register fails
 
   - batman-adv: fix UaF while rescheduling delayed work
 
   - eth: qede: fix scheduling while atomic
 
   - eth: ice: make writes to /dev/gnssX synchronous
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmSBsv4SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkMXUP/jisT2xvTFRmtshX3h+xxPkBxZSo9ovx
 ujviqZkyCNep9fu7Njv+5WWp0V8cy3Ui6G6RiGNHDV24vBtISlX21yQt+VANOPjH
 7x8oqqnANxn3PXjL5hp6YZhNaxiwfAfQGJiU+TngVo1jTJopnWEt2x8Q3EhF/k0S
 id8VaHGh/ugC8lRZSJBK/b+FsJjWY0sxTcsoRSjp6gg1WHUVO8mJXlCfHFhNJcQQ
 /8ghieuskLUs4V6UX3TGg4smGxgl2HPdA79+ohvrVhcB1WoGCsWV83SfUTBWgHkU
 IZrIfM4BFCThcN88IgRgJioeX95D54SK0RzEZdCnJx+elmgTK1ZdUGlBh1Vybh+v
 iQel2dgJI+8zyIl/4lXYdhHogLwnONVrkszMrx+Ds2PzNecmnFWg4LUK01xLjW7J
 poAFsZGVBk0BuTkEqXtxv/8Cc7wU/PMOmy4ZVBrHkNIyGgOLbt5eM0T/pArYoKvr
 +34del2Us2vGVk6i89F/GgRuNCvevO0Y+HyAArOJr2XwpakwQYQHdBdj/77FGjFZ
 PyR/bVJZhxdUMv+J7BdKQK+mwt+ZFBVwIRfU2gvHcDa2XQJe2Eg8GXRtcJ1P7hpr
 Q2A+AgiHSoAn6GrgYNHNZVBhWywQFCsu2ZpH7J0uo4zOyTUl3+4O8jyfDrD7o56D
 BodtDJKZit3B
 =X6b2
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, wifi, netfilter, bluetooth and ebpf.

  Current release - regressions:

   - bpf: sockmap: avoid potential NULL dereference in
     sk_psock_verdict_data_ready()

   - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()

   - phylink: actually fix ksettings_set() ethtool call

   - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3

  Current release - new code bugs:

   - wifi: mt76: fix possible NULL pointer dereference in
     mt7996_mac_write_txwi()

  Previous releases - regressions:

   - netfilter: fix NULL pointer dereference in nf_confirm_cthelper

   - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS

   - openvswitch: fix upcall counter access before allocation

   - bluetooth:
      - fix use-after-free in hci_remove_ltk/hci_remove_irk
      - fix l2cap_disconnect_req deadlock

   - nic: bnxt_en: prevent kernel panic when receiving unexpected
     PHC_UPDATE event

  Previous releases - always broken:

   - core: annotate rfs lockless accesses

   - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values

   - netfilter: add null check for nla_nest_start_noflag() in
     nft_dump_basechain_hook()

   - bpf: fix UAF in task local storage

   - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294

   - ipv6: rpl: fix route of death.

   - tcp: gso: really support BIG TCP

   - mptcp: fixes for user-space PM address advertisement

   - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT

   - can: avoid possible use-after-free when j1939_can_rx_register fails

   - batman-adv: fix UaF while rescheduling delayed work

   - eth: qede: fix scheduling while atomic

   - eth: ice: make writes to /dev/gnssX synchronous"

* tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  bnxt_en: Implement .set_port / .unset_port UDP tunnel callbacks
  bnxt_en: Prevent kernel panic when receiving unexpected PHC_UPDATE event
  bnxt_en: Skip firmware fatal error recovery if chip is not accessible
  bnxt_en: Query default VLAN before VNIC setup on a VF
  bnxt_en: Don't issue AP reset during ethtool's reset operation
  bnxt_en: Fix bnxt_hwrm_update_rss_hash_cfg()
  net: bcmgenet: Fix EEE implementation
  eth: ixgbe: fix the wake condition
  eth: bnxt: fix the wake condition
  lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release()
  bpf: Add extra path pointer check to d_path helper
  net: sched: fix possible refcount leak in tc_chain_tmplt_add()
  net: sched: act_police: fix sparse errors in tcf_police_dump()
  net: openvswitch: fix upcall counter access before allocation
  net: sched: move rtm_tca_policy declaration to include file
  ice: make writes to /dev/gnssX synchronous
  net: sched: add rcu annotations around qdisc->qdisc_sleeping
  rfs: annotate lockless accesses to RFS sock flow table
  rfs: annotate lockless accesses to sk->sk_rxhash
  virtio_net: use control_buf for coalesce params
  ...
2023-06-08 09:27:19 -07:00
Jakub Kicinski
182620ab36 Here is a batman-adv bugfix:
- fix a broken sync while rescheduling delayed work, by
    Vladislav Efanov
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmSAp90WHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZPwEADXDBvWlT7DH3sP1peNmFF/y+pd
 AgNJE5wJiJGBrCA1K0gZONulyhpOLjsBtwuyWDuS143IlBSPPQqFcJgZrmU6zHfQ
 MjZBOJMGEgxUbh51vQH4bVTotZB1STVSbst9+HJi+KLpgEN/BnMU2/gDfbV5JwzK
 xVDk2/D0+1OX4w61V+UDqmXuljBLa5cbOUPnRP4/gz/suVui5Q0CESeB+H/One9z
 RlKM6YkcSOr4y9MAIgSpJwY8O4hZ0oeqZyewMTYYWYDQ3nGpZ9NUCGR8kYozusqg
 u8c71nrJwdHV8VS7IU3eEzeKXFo2uz8UxyTgK+qcsoem4oTZgCZy8nXh+Pwp2y72
 R+RHFngBcKIYlvil5cyVUisnJ7GZOjHK/N2pESeG7A/iI0jU6YZgVe5oJaCHbMJl
 //F6m4iFHvPAbf61f5tRePTZTPd98LC3KAlI1Fu4/g+07H0ivgsiFk+qi45OvvcE
 MWvK12FlgTbCUeqjhg6bKuVva2NY5SDn1uRcVrTAD3HDpcpfw7UYv/jLBIeAwpoY
 S7SLRl6xho1+aEPaJWx39DrXWQCFQZP95ygZIN5TPZcihvVp8knY0F7mLPMRovf9
 WBFp89+aS/bv1x14fLrPYOzyJD0XisA3A04iERvf1eLroNa9poh3KHf5jhJLH+RP
 CTumDtHSDt+GIH7E5A==
 =euFF
 -----END PGP SIGNATURE-----

Merge tag 'batadv-net-pullrequest-20230607' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here is a batman-adv bugfix:

 - fix a broken sync while rescheduling delayed work,
   by Vladislav Efanov

* tag 'batadv-net-pullrequest-20230607' of git://git.open-mesh.org/linux-merge:
  batman-adv: Broken sync while rescheduling delayed work
====================

Link: https://lore.kernel.org/r/20230607155515.548120-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-07 21:56:01 -07:00
Jakub Kicinski
c9d99cfa66 bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZIDxUwAKCRDbK58LschI
 g5hDAQD7ukrniCvMRNIm2yUZIGSxE4RvGiXptO4a0NfLck5R/wEAsfN2KUsPcPhW
 HS37lVfx7VVXfj42+REf7lWLu4TXpwk=
 =6mS/
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2023-06-07

We've added 7 non-merge commits during the last 7 day(s) which contain
a total of 12 files changed, 112 insertions(+), 7 deletions(-).

The main changes are:

1) Fix a use-after-free in BPF's task local storage, from KP Singh.

2) Make struct path handling more robust in bpf_d_path, from Jiri Olsa.

3) Fix a syzbot NULL-pointer dereference in sockmap, from Eric Dumazet.

4) UAPI fix for BPF_NETFILTER before final kernel ships,
   from Florian Westphal.

5) Fix map-in-map array_map_gen_lookup code generation where elem_size was
   not being set for inner maps, from Rhys Rustad-Elliott.

6) Fix sockopt_sk selftest's NETLINK_LIST_MEMBERSHIPS assertion,
   from Yonghong Song.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add extra path pointer check to d_path helper
  selftests/bpf: Fix sockopt_sk selftest
  bpf: netfilter: Add BPF_NETFILTER bpf_attach_type
  selftests/bpf: Add access_inner_map selftest
  bpf: Fix elem_size not being set for inner maps
  bpf: Fix UAF in task local storage
  bpf, sockmap: Avoid potential NULL dereference in sk_psock_verdict_data_ready()
====================

Link: https://lore.kernel.org/r/20230607220514.29698-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-07 21:47:11 -07:00
Pablo Neira Ayuso
a1a64a151d netfilter: nfnetlink: skip error delivery on batch in case of ENOMEM
If caller reports ENOMEM, then stop iterating over the batch and send a
single netlink message to userspace to report OOM.

Fixes: cbb8125eb4 ("netfilter: nfnetlink: deliver netlink errors on batch completion")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-08 04:00:02 +02:00
Pablo Neira Ayuso
212ed75dc5 netfilter: nf_tables: integrate pipapo into commit protocol
The pipapo set backend follows copy-on-update approach, maintaining one
clone of the existing datastructure that is being updated. The clone
and current datastructures are swapped via rcu from the commit step.

The existing integration with the commit protocol is flawed because
there is no operation to clean up the clone if the transaction is
aborted. Moreover, the datastructure swap happens on set element
activation.

This patch adds two new operations for sets: commit and abort, these new
operations are invoked from the commit and abort steps, after the
transactions have been digested, and it updates the pipapo set backend
to use it.

This patch adds a new ->pending_update field to sets to maintain a list
of sets that require this new commit and abort operations.

Fixes: 3c4287f620 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-08 03:56:20 +02:00
Johannes Berg
fe0af9fe54 wifi: cfg80211: move scan done work to wiphy work
Move the scan done work to the new wiphy work to
simplify the code a bit.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:37 +02:00
Johannes Berg
c88d717822 wifi: cfg80211: move sched scan stop to wiphy work
This work can now trivially be converted, it behaves
identical either way.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:31 +02:00
Johannes Berg
4b8d43f113 wifi: mac80211: mlme: move disconnects to wiphy work
Move the beacon loss work that might cause a disconnect
and the CSA disconnect work to be wiphy work, so we hold
the wiphy lock for them.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:29 +02:00
Johannes Berg
87351d0926 wifi: mac80211: ibss: move disconnect to wiphy work
Move the IBSS disconnect work to be a wiphy work.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:27 +02:00
Johannes Berg
ec3252bff7 wifi: mac80211: use wiphy work for channel switch
Channel switch obviously must be handled per link, and we
have a (potential) deadlock when canceling that work. Use
the new delayed wiphy work to handle this instead and get
rid of the explicit timer that way too.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:24 +02:00
Johannes Berg
1444f58931 wifi: mac80211: use wiphy work for SMPS
SMPS requests are per link, and currently there's a potential
deadlock with canceling. Use the new wiphy work to handle SMPS
instead, so that the cancel cannot deadlock.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:22 +02:00
Johannes Berg
a3df43b16f wifi: mac80211: unregister netdevs through cfg80211
Since we want to have wiphy_lock() for the unregistration
in the future, unregister also netdevs via cfg80211 now
to be able to hold the wiphy_lock() for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:21 +02:00
Johannes Berg
16114496d6 wifi: mac80211: use wiphy work for sdata->work
We'll need this later to convert other works that might
be cancelled from here, so convert this one first.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:18 +02:00
Johannes Berg
a3ee4dc84c wifi: cfg80211: add a work abstraction with special semantics
Add a work abstraction at the cfg80211 level that will always
hold the wiphy_lock() for any work executed and therefore also
can be canceled safely (without waiting) while holding that.
This improves on what we do now as with the new wiphy works we
don't have to worry about locking while cancelling them safely.

Also, don't let such works run while the device is suspended,
since they'll likely need to interact with the device. Flush
them before suspend though.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:15 +02:00
Johannes Berg
4d45145ba6 wifi: cfg80211: hold wiphy lock when sending wiphy
Sending the wiphy out might cause calls to the driver,
notably get_txq_stats() and get_antenna(). These aren't
very important, since the normally have their own locks
and/or just send out static data, but if the contract
should be that the wiphy lock is always held, these are
also affected. Fix that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:13 +02:00
Johannes Berg
7d2d0ff49d wifi: cfg80211: wext: hold wiphy lock in siwgenie
Missed this ioctl since it's in wext-sme.c where we
usually get via a front-level ioctl handler in the
other files, but it should also hold the wiphy lock
to align the locking contract towards the driver.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:11 +02:00
Johannes Berg
a993df0f91 wifi: cfg80211: move wowlan disable under locks
This is a driver callback, and the driver should be able
to assume that it's called with the wiphy lock held. Move
the call up so that's true, it has no other effect since
the device is already unregistering and we cannot reach
this function through other paths.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:09 +02:00
Johannes Berg
0dcb84ede5 wifi: cfg80211: hold wiphy lock in pmsr work
Most code paths in cfg80211 already hold the wiphy lock,
mostly by virtue of being called from nl80211, so make
the pmsr cleanup worker also hold it, aligning the
locking promises between different parts of cfg80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:07 +02:00
Johannes Berg
e9da6df749 wifi: cfg80211: hold wiphy lock in auto-disconnect
Most code paths in cfg80211 already hold the wiphy lock,
mostly by virtue of being called from nl80211, so make
the auto-disconnect worker also hold it, aligning the
locking promises between different parts of cfg80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-07 19:53:04 +02:00