linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-29 23:24:11 +08:00

Author	SHA1	Message	Date
Johannes Berg	86e74a08fe	wifi: mac80211_hwsim: remove multicast workaround Now that we have proper multicast TX in mac80211, there's no longer a need to fake something here. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 17:01:37 +02:00
Jinpeng Cui	a21cd7d63b	wifi: nl80211: remove redundant err variable Return value from rdev_set_mcast_rate() directly instead of taking this in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 17:01:09 +02:00
James Prestwood	3c06e91b40	wifi: mac80211: Support POWERED_ADDR_CHANGE feature Adds support in mac80211 for NL80211_EXT_FEATURE_POWERED_ADDR_CHANGE. The motivation behind this functionality is to fix limitations of address randomization on frequencies which are disallowed in world roaming. The way things work now, if a client wants to randomize their address per-connection it must power down the device, change the MAC, and power back up. Here lies a problem since powering down the device may result in frequencies being disabled (until the regdom is set). If the desired BSS is on one such frequency the client is unable to connect once the phy is powered again. For mac80211 based devices changing the MAC while powered is possible but currently disallowed (-EBUSY). This patch adds some logic to allow a MAC change while powered by removing the interface, changing the MAC, and adding it again. mac80211 will advertise support for this feature so userspace can determine the best course of action e.g. disallow address randomization on certain frequencies if not supported. There are certain limitations put on this which simplify the logic: - No active connection - No offchannel work, including scanning. Signed-off-by: James Prestwood <prestwoj@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 17:01:04 +02:00
James Prestwood	a36c421690	wifi: nl80211: Add POWERED_ADDR_CHANGE feature Add a new extended feature bit signifying that the wireless hardware supports changing the MAC address while the underlying net_device is powered. Note that this has a different meaning from IFF_LIVE_ADDR_CHANGE as additional restrictions might be imposed by the hardware, such as: - No connection is active on this interface, carrier is off - No scan is in progress - No offchannel operations are in progress Signed-off-by: James Prestwood <prestwoj@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 16:58:41 +02:00
Johannes Berg	90703ba9bb	wifi: mac80211: prevent 4-addr use on MLDs We haven't tried this yet, and it's not very likely to work well right now, so for now disable 4-addr use on interfaces that are MLDs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20220902161143.f2e4cc2efaa1.I5924e8fb44a2d098b676f5711b36bbc1b1bd68e2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 16:57:34 +02:00
Johannes Berg	ae960ee90b	wifi: mac80211: prevent VLANs on MLDs Do not allow VLANs to be added to AP interfaces that are MLDs, this isn't going to work because the link structs aren't propagated to the VLAN interfaces yet. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20220902161144.8c88531146e9.If2ef9a3b138d4f16ed2fda91c852da156bdf5e4d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2022-09-03 16:57:01 +02:00
David S. Miller	aa3fab0110	Merge branch 'net_sched-redundant-resource-cleanups' Zhengchao Shao says: ==================== net: sched: remove redundant resource cleanup when init() fails qdisc_create() calls .init() to initialize qdisc. If the initialization fails, qdisc_create() will call .destroy() to release resources. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 10:40:40 +01:00
Zhengchao Shao	d59f4e1d1f	net: sched: htb: remove redundant resource cleanup in htb_init() If htb_init() fails, qdisc_create() invokes htb_destroy() to clear resources. Therefore, remove redundant resource cleanup in htb_init(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 10:40:40 +01:00
Zhengchao Shao	494f5063b8	net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init() If fq_codel_init() fails, qdisc_create() invokes fq_codel_destroy() to clear resources. Therefore, remove redundant resource cleanup in fq_codel_init(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 10:40:40 +01:00
Wei Fang	40c79ce13b	net: fec: add stop mode support for imx8 platform The current driver support stop mode by calling machine api. The patch add dts support to set GPR register for stop request. imx8mq enter stop/exit stop mode by setting GPR bit, which can be accessed by A core. imx8qm enter stop/exit stop mode by calling IMX_SC ipc APIs that communicate with M core ipc service, and the M core set the related GPR bit at last. Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 09:54:34 +01:00
André Apitzsch	e26c258434	r8152: Add MAC passthrough support for Lenovo Travel Hub The Lenovo USB-C Travel Hub supports MAC passthrough. Signed-off-by: André Apitzsch <git@apitzsch.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 09:53:38 +01:00
Gustavo A. R. Silva	5854a09b49	net/ipv4: Use __DECLARE_FLEX_ARRAY() helper We now have a cleaner way to keep compatibility with user-space (a.k.a. not breaking it) when we need to keep in place a one-element array (for its use in user-space) together with a flexible-array member (for its use in kernel-space) without making it hard to read at the source level. This is through the use of the new __DECLARE_FLEX_ARRAY() helper macro. The size and memory layout of the structure is preserved after the changes. See below. Before changes: $ pahole -C ip_msfilter net/ipv4/igmp.o struct ip_msfilter { union { struct { __be32 imsf_multiaddr_aux; /* 0 4 / __be32 imsf_interface_aux; / 4 4 / __u32 imsf_fmode_aux; / 8 4 / __u32 imsf_numsrc_aux; / 12 4 / __be32 imsf_slist[1]; / 16 4 / }; / 0 20 / struct { __be32 imsf_multiaddr; / 0 4 / __be32 imsf_interface; / 4 4 / __u32 imsf_fmode; / 8 4 / __u32 imsf_numsrc; / 12 4 / __be32 imsf_slist_flex[0]; / 16 0 / }; / 0 16 / }; / 0 20 / / size: 20, cachelines: 1, members: 1 / / last cacheline: 20 bytes / }; After changes: $ pahole -C ip_msfilter net/ipv4/igmp.o struct ip_msfilter { __be32 imsf_multiaddr; / 0 4 / __be32 imsf_interface; / 4 4 / __u32 imsf_fmode; / 8 4 / __u32 imsf_numsrc; / 12 4 / union { __be32 imsf_slist[1]; / 16 4 / struct { struct { } __empty_imsf_slist_flex; / 16 0 / __be32 imsf_slist_flex[0]; / 16 0 / }; / 16 0 / }; / 16 4 / / size: 20, cachelines: 1, members: 5 / / last cacheline: 20 bytes */ }; In the past, we had to duplicate the whole original structure within a union, and update the names of all the members. Now, we just need to declare the flexible-array member to be used in kernel-space through the __DECLARE_FLEX_ARRAY() helper together with the one-element array, within a union. This makes the source code more clean and easier to read. Link: https://github.com/KSPP/linux/issues/193 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-03 09:51:10 +01:00
Zhengchao Shao	2e5fb32232	net/sched: cls_api: remove redundant 0 check in tcf_qevent_init() tcf_qevent_parse_block_index() never returns a zero block_index. Therefore, it is unnecessary to check the value of block_index in tcf_qevent_init(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220901011617.14105-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 21:24:49 -07:00
GUO Zihua	c8ef3c94bd	net: lantiq_etop: Fix return type for implementation of ndo_start_xmit Since Linux now supports CFI, it will be a good idea to fix mismatched return type for implementation of hooks. Otherwise this might get cought out by CFI and cause a panic. ltq_etop_tx() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so change the return type to netdev_tx_t directly. Signed-off-by: GUO Zihua <guozihua@huawei.com> Link: https://lore.kernel.org/r/20220902081521.59867-1-guozihua@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 21:19:35 -07:00
GUO Zihua	7b620e1560	net: sunplus: Fix return type for implementation of ndo_start_xmit Since Linux now supports CFI, it will be a good idea to fix mismatched return type for implementation of hooks. Otherwise this might get cought out by CFI and cause a panic. spl2sw_ethernet_start_xmit() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so change the return type to netdev_tx_t directly. Signed-off-by: GUO Zihua <guozihua@huawei.com> Link: https://lore.kernel.org/r/20220902081550.60095-1-guozihua@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 21:19:14 -07:00
GUO Zihua	0dbaf0fa62	net: xscale: Fix return type for implementation of ndo_start_xmit Since Linux now supports CFI, it will be a good idea to fix mismatched return type for implementation of hooks. Otherwise this might get cought out by CFI and cause a panic. eth_xmit() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so change the return type to netdev_tx_t directly. Signed-off-by: GUO Zihua <guozihua@huawei.com> Link: https://lore.kernel.org/r/20220902081612.60405-1-guozihua@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 21:18:56 -07:00
GUO Zihua	12f7bd2522	net: broadcom: Fix return type for implementation of Since Linux now supports CFI, it will be a good idea to fix mismatched return type for implementation of hooks. Otherwise this might get cought out by CFI and cause a panic. bcm4908_enet_start_xmit() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so change the return type to netdev_tx_t directly. Signed-off-by: GUO Zihua <guozihua@huawei.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220902075407.52358-1-guozihua@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 21:18:43 -07:00
Alexei Starovoitov	0b20a133c0	Merge branch 'bpf: net: Remove duplicated code from bpf_getsockopt()' Martin KaFai Lau says: ==================== From: Martin KaFai Lau <martin.lau@kernel.org> The earlier commits [0] removed duplicated code from bpf_setsockopt(). This series is to remove duplicated code from bpf_getsockopt(). Unlike the setsockopt() which had already changed to take the sockptr_t argument, the same has not been done to getsockopt(). This is the extra step being done in this series. [0]: https://lore.kernel.org/all/20220817061704.4174272-1-kafai@fb.com/ v2: - The previous v2 did not reach the list. It is a resend. - Add comments on bpf_getsockopt() should not free the saved_syn (Stanislav) - Explicitly null-terminate the tcp-cc name (Stanislav) ==================== Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:35:11 -07:00
Martin KaFai Lau	f649f992de	selftest/bpf: Add test for bpf_getsockopt() This patch removes the __bpf_getsockopt() which directly reads the sk by using PTR_TO_BTF_ID. Instead, the test now directly uses the kernel bpf helper bpf_getsockopt() which supports all the required optname now. TCP_SAVE[D]_SYN and TCP_MAXSEG are not tested in a loop for all the hooks and sock_ops's cb. TCP_SAVE[D]_SYN only works in passive connection. TCP_MAXSEG only works when it is setsockopt before the connection is established and the getsockopt return value can only be tested after the connection is established. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002937.2896904-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:32 -07:00
Martin KaFai Lau	38566ec06f	bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() This patch changes bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt(). It removes the duplicated code from bpf_getsockopt(SOL_IPV6). This also makes bpf_getsockopt(SOL_IPV6) supporting the same set of optnames as in bpf_setsockopt(SOL_IPV6). In particular, this adds IPV6_AUTOFLOWLABEL support to bpf_getsockopt(SOL_IPV6). ipv6 could be compiled as a module. Like how other code solved it with stubs in ipv6_stubs.h, this patch adds the do_ipv6_getsockopt to the ipv6_bpf_stub. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002931.2896218-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:32 -07:00
Martin KaFai Lau	fd969f25fe	bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt() This patch changes bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt() and remove the duplicated code. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002925.2895416-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:32 -07:00
Martin KaFai Lau	273b7f0fb4	bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt() This patch changes bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). It removes the duplicated code from bpf_getsockopt(SOL_TCP). Before this patch, there were some optnames available to bpf_setsockopt(SOL_TCP) but missing in bpf_getsockopt(SOL_TCP). For example, TCP_NODELAY, TCP_MAXSEG, TCP_KEEPIDLE, TCP_KEEPINTVL, and a few more. It surprises users from time to time. This patch automatically closes this gap without duplicating more code. bpf_getsockopt(TCP_SAVED_SYN) does not free the saved_syn, so it stays in sol_tcp_sockopt(). For string name value like TCP_CONGESTION, bpf expects it is always null terminated, so sol_tcp_sockopt() decrements optlen by one before calling do_tcp_getsockopt() and the 'if (optlen < saved_optlen) memset(..,0,..);' in __bpf_getsockopt() will always do a null termination. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002918.2894511-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:32 -07:00
Martin KaFai Lau	65ddc82d3b	bpf: Change bpf_getsockopt(SOL_SOCKET) to reuse sk_getsockopt() This patch changes bpf_getsockopt(SOL_SOCKET) to reuse sk_getsockopt(). It removes all duplicated code from bpf_getsockopt(SOL_SOCKET). Before this patch, there were some optnames available to bpf_setsockopt(SOL_SOCKET) but missing in bpf_getsockopt(SOL_SOCKET). It surprises users from time to time. For example, SO_REUSEADDR, SO_KEEPALIVE, SO_RCVLOWAT, and SO_MAX_PACING_RATE. This patch automatically closes this gap without duplicating more code. The only exception is SO_BINDTODEVICE because it needs to acquire a blocking lock. Thus, SO_BINDTODEVICE is not supported. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002912.2894040-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	c2b063ca34	bpf: Embed kernel CONFIG check into the if statement in bpf_getsockopt This patch moves the "#ifdef CONFIG_XXX" check into the "if/else" statement itself. The change is done for the bpf_getsockopt() function only. It will make the latter patches easier to follow without the surrounding ifdef macro. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002906.2893572-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	0f95f7d426	bpf: net: Avoid do_ipv6_getsockopt() taking sk lock when called from bpf Similar to the earlier patch that changes sk_getsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking the lock when called from bpf. This patch also changes do_ipv6_getsockopt() to use sockopt_{lock,release}_sock() such that bpf_getsockopt(SOL_IPV6) can reuse do_ipv6_getsockopt(). Although bpf_getsockopt(SOL_IPV6) currently does not support optname that requires lock_sock(), using sockopt_{lock,release}_sock() consistently across *_getsockopt() will make future optname addition harder to miss the sockopt_{lock,release}_sock() usage. eg. when adding new optname that requires a lock and the new optname is needed in bpf_getsockopt() also. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002859.2893064-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	6dadbe4bac	bpf: net: Change do_ipv6_getsockopt() to take the sockptr_t argument Similar to the earlier patch that changes sk_getsockopt() to take the sockptr_t argument . This patch also changes do_ipv6_getsockopt() to take the sockptr_t argument such that a latter patch can make bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt(). Note on the change in ip6_mc_msfget(). This function is to return an array of sockaddr_storage in optval. This function is shared between ipv6_get_msfilter() and compat_ipv6_get_msfilter(). However, the sockaddr_storage is stored at different offset of the optval because of the difference between group_filter and compat_group_filter. Thus, a new 'ss_offset' argument is added to ip6_mc_msfget(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002853.2892532-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	9c3f9707de	net: Add a len argument to compat_ipv6_get_msfilter() Pass the len to the compat_ipv6_get_msfilter() instead of compat_ipv6_get_msfilter() getting it again from optlen. Its counter part ipv6_get_msfilter() is also taking the len from do_ipv6_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002846.2892091-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	75f2397988	net: Remove unused flags argument from do_ipv6_getsockopt The 'unsigned int flags' argument is always 0, so it can be removed. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002840.2891763-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	1985320c54	bpf: net: Avoid do_ip_getsockopt() taking sk lock when called from bpf Similar to the earlier commit that changed sk_setsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking lock when called from bpf. This patch also changes do_ip_getsockopt() to use sockopt_{lock,release}_sock() such that a latter patch can make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002834.2891514-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	728f064cd7	bpf: net: Change do_ip_getsockopt() to take the sockptr_t argument Similar to the earlier patch that changes sk_getsockopt() to take the sockptr_t argument. This patch also changes do_ip_getsockopt() to take the sockptr_t argument such that a latter patch can make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt(). Note on the change in ip_mc_gsfget(). This function is to return an array of sockaddr_storage in optval. This function is shared between ip_get_mcast_msfilter() and compat_ip_get_mcast_msfilter(). However, the sockaddr_storage is stored at different offset of the optval because of the difference between group_filter and compat_group_filter. Thus, a new 'ss_offset' argument is added to ip_mc_gsfget(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002828.2890585-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	d51bbff2ab	bpf: net: Avoid do_tcp_getsockopt() taking sk lock when called from bpf Similar to the earlier commit that changed sk_setsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking lock when called from bpf. This patch also changes do_tcp_getsockopt() to use sockopt_{lock,release}_sock() such that a latter patch can make bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002821.2889765-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:31 -07:00
Martin KaFai Lau	34704ef024	bpf: net: Change do_tcp_getsockopt() to take the sockptr_t argument Similar to the earlier patch that changes sk_getsockopt() to take the sockptr_t argument . This patch also changes do_tcp_getsockopt() to take the sockptr_t argument such that a latter patch can make bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002815.2889332-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:30 -07:00
Martin KaFai Lau	2c5b6bf5cd	bpf: net: Avoid sk_getsockopt() taking sk lock when called from bpf Similar to the earlier commit that changed sk_setsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking lock when called from bpf. This patch also changes sk_getsockopt() to use sockopt_{lock,release}_sock() such that a latter patch can make bpf_getsockopt(SOL_SOCKET) to reuse sk_getsockopt(). Only sk_get_filter() requires this change and it is used by the optname SO_GET_FILTER. The '.getname' implementations in sock->ops->getname() is not changed also since bpf does not always have the sk->sk_socket pointer and cannot support SO_PEERNAME. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002809.2888981-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:30 -07:00
Martin KaFai Lau	4ff09db1b7	bpf: net: Change sk_getsockopt() to take the sockptr_t argument This patch changes sk_getsockopt() to take the sockptr_t argument such that it can be used by bpf_getsockopt(SOL_SOCKET) in a latter patch. security_socket_getpeersec_stream() is not changed. It stays with the __user ptr (optval.user and optlen.user) to avoid changes to other security hooks. bpf_getsockopt(SOL_SOCKET) also does not support SO_PEERSEC. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002802.2888419-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:30 -07:00
Martin KaFai Lau	ba74a7608d	net: Change sock_getsockopt() to take the sk ptr instead of the sock ptr A latter patch refactors bpf_getsockopt(SOL_SOCKET) with the sock_getsockopt() to avoid code duplication and code drift between the two duplicates. The current sock_getsockopt() takes sock ptr as the argument. The very first thing of this function is to get back the sk ptr by 'sk = sock->sk'. bpf_getsockopt() could be called when the sk does not have the sock ptr created. Meaning sk->sk_socket is NULL. For example, when a passive tcp connection has just been established but has yet been accept()-ed. Thus, it cannot use the sock_getsockopt(sk->sk_socket) or else it will pass a NULL ptr. This patch moves all sock_getsockopt implementation to the newly added sk_getsockopt(). The new sk_getsockopt() takes a sk ptr and immediately gets the sock ptr by 'sock = sk->sk_socket' The existing sock_getsockopt(sock) is changed to call sk_getsockopt(sock->sk). All existing callers have both sock->sk and sk->sk_socket pointer. The latter patch will make bpf_getsockopt(SOL_SOCKET) call sk_getsockopt(sk) directly. The bpf_getsockopt(SOL_SOCKET) does not use the optnames that require sk->sk_socket, so it will be safe. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002756.2887884-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-02 20:34:30 -07:00
Gal Pressman	8254393663	net: ieee802154: Fix compilation error when CONFIG_IEEE802154_NL802154_EXPERIMENTAL is disabled When CONFIG_IEEE802154_NL802154_EXPERIMENTAL is disabled, NL802154_CMD_DEL_SEC_LEVEL is undefined and results in a compilation error: net/ieee802154/nl802154.c:2503:19: error: 'NL802154_CMD_DEL_SEC_LEVEL' undeclared here (not in a function); did you mean 'NL802154_CMD_SET_CCA_ED_LEVEL'? 2503 \| .resv_start_op = NL802154_CMD_DEL_SEC_LEVEL + 1, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ \| NL802154_CMD_SET_CCA_ED_LEVEL Unhide the experimental commands, having them defined in an enum makes no difference. Fixes: `9c5d03d362` ("genetlink: start to validate reserved header bytes") Signed-off-by: Gal Pressman <gal@nvidia.com> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20220902030620.2737091-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-02 19:59:08 -07:00
Ian Rogers	af515a5587	selftests/xsk: Avoid use-after-free on ctx The put lowers the reference count to 0 and frees ctx, reading it afterwards is invalid. Move the put after the uses and determine the last use by the reference count being 1. Fixes: `39e940d4ab` ("selftests/xsk: Destroy BPF resources only when ctx refcount drops to 0") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901202645.1463552-1-irogers@google.com	2022-09-02 15:57:18 +02:00
Daniel Müller	afef88e655	selftests/bpf: Store BPF object files with .bpf.o extension BPF object files are, in a way, the final artifact produced as part of the ahead-of-time compilation process. That makes them somewhat special compared to "regular" object files, which are a intermediate build artifacts that can typically be removed safely. As such, it can make sense to name them differently to make it easier to spot this difference at a glance. Among others, libbpf-bootstrap [0] has established the extension .bpf.o for BPF object files. It seems reasonable to follow this example and establish the same denomination for selftest build artifacts. To that end, this change adjusts the corresponding part of the build system and the test programs loading BPF object files to work with .bpf.o files. [0] https://github.com/libbpf/libbpf-bootstrap Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220901222253.1199242-1-deso@posteo.net	2022-09-02 15:55:37 +02:00
Maciej Fijalkowski	fe2ad08e1e	selftests/xsk: Add support for zero copy testing Introduce new mode to xdpxceiver responsible for testing AF_XDP zero copy support of driver that serves underlying physical device. When setting up test suite, determine whether driver has ZC support or not by trying to bind XSK ZC socket to the interface. If it succeeded, interpret it as ZC support being in place and do softirq and busy poll tests for zero copy mode. Note that Rx dropped tests are skipped since ZC path is not touching rx_dropped stat at all. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-7-maciej.fijalkowski@intel.com	2022-09-02 15:38:08 +02:00
Maciej Fijalkowski	c29fe883de	selftests/xsk: Make sure single threaded test terminates For single threaded poll tests call pthread_kill() from main thread so that we are sure worker thread has finished its job and it is possible to proceed with next test types from test suite. It was observed that on some platforms it takes a bit longer for worker thread to exit and next test case sees device as busy in this case. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-6-maciej.fijalkowski@intel.com	2022-09-02 15:38:03 +02:00
Maciej Fijalkowski	a693ff3ed5	selftests/xsk: Add support for executing tests on physical device Currently, architecture of xdpxceiver is designed strictly for conducting veth based tests. Veth pair is created together with a network namespace and one of the veth interfaces is moved to the mentioned netns. Then, separate threads for Tx and Rx are spawned which will utilize described setup. Infrastructure described in the paragraph above can not be used for testing AF_XDP support on physical devices. That testing will be conducted on a single network interface and same queue. Xskxceiver needs to be extended to distinguish between veth tests and physical interface tests. Since same iface/queue id pair will be used by both Tx/Rx threads for physical device testing, Tx thread, which happen to run after the Rx thread, is going to create XSK socket with shared umem flag. In order to track this setting throughout the lifetime of spawned threads, introduce 'shared_umem' boolean variable to struct ifobject and set it to true when xdpxceiver is run against physical device. In such case, UMEM size needs to be doubled, so half of it will be used by Rx thread and other half by Tx thread. For two step based test types, value of XSKMAP element under key 0 has to be updated as there is now another socket for the second step. Also, to avoid race conditions when destroying XSK resources, move this activity to the main thread after spawned Rx and Tx threads have finished its job. This way it is possible to gracefully remove shared umem without introducing synchronization mechanisms. To run xsk selftests suite on physical device, append "-i $IFACE" when invoking test_xsk.sh. For veth based tests, simply skip it. When "-i $IFACE" is in place, under the hood test_xsk.sh will use $IFACE for both interfaces supplied to xdpxceiver, which in turn will interpret that this execution of test suite is for a physical device. Note that currently this makes it possible only to test SKB and DRV mode (in case underlying device has native XDP support). ZC testing support is added in a later patch. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-5-maciej.fijalkowski@intel.com	2022-09-02 15:37:57 +02:00
Maciej Fijalkowski	24037ba7c4	selftests/xsk: Increase chars for interface name to 16 So that "enp240s0f0" or such name can be used against xskxceiver. While at it, also extend character count for netns name. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-4-maciej.fijalkowski@intel.com	2022-09-02 15:37:53 +02:00
Maciej Fijalkowski	1adef0643b	selftests/xsk: Introduce default Rx pkt stream In order to prepare xdpxceiver for physical device testing, let us introduce default Rx pkt stream. Reason for doing it is that physical device testing will use a UMEM with a doubled size where half of it will be used by Tx and other half by Rx. This means that pkt addresses will differ for Tx and Rx streams. Rx thread will initialize the xsk_umem_info::base_addr that is added here so that pkt_set(), when working on Rx UMEM will add this offset and second half of UMEM space will be used. Note that currently base_addr is 0 on both sides. Future commit will do the mentioned initialization. Previously, veth based testing worked on separate UMEMs, so single default stream was fine. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-3-maciej.fijalkowski@intel.com	2022-09-02 15:37:45 +02:00
Maciej Fijalkowski	0d68e6fe12	selftests/xsk: Query for native XDP support Currently, xdpxceiver assumes that underlying device supports XDP in native mode - it is fine by now since tests can run only on a veth pair. Future commit is going to allow running test suite against physical devices, so let us query the device if it is capable of running XDP programs in native mode. This way xdpxceiver will not try to run TEST_MODE_DRV if device being tested is not supporting it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220901114813.16275-2-maciej.fijalkowski@intel.com	2022-09-02 15:37:37 +02:00
Shmulik Ladkani	8cc61b7a64	selftests/bpf: Amend test_tunnel to exercise BPF_F_TUNINFO_FLAGS Get the tunnel flags in {ipv6}vxlan_get_tunnel_src and ensure they are aligned with tunnel params set at {ipv6}vxlan_set_tunnel_dst. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831144010.174110-2-shmulik.ladkani@gmail.com	2022-09-02 15:21:03 +02:00
Shmulik Ladkani	44c51472be	bpf: Support getting tunnel flags Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters (id, ttl, tos, local and remote) but does not expose ip_tunnel_info's tun_flags to the BPF program. It makes sense to expose tun_flags to the BPF program. Assume for example multiple GRE tunnels maintained on a single GRE interface in collect_md mode. The program expects origins to initiate over GRE, however different origins use different GRE characteristics (e.g. some prefer to use GRE checksum, some do not; some pass a GRE key, some do not, etc..). A BPF program getting tun_flags can therefore remember the relevant flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote. In the reply path, the program can use 'bpf_skb_set_tunnel_key' in order to correctly reply to the remote, using similar characteristics, based on the stored tunnel flags. Introduce BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key. If specified, 'bpf_tunnel_key->tunnel_flags' is set with the tun_flags. Decided to use the existing unused 'tunnel_ext' as the storage for the 'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout. Also, the following has been considered during the design: 1. Convert the "interesting" internal TUNNEL_xxx flags back to BPF_F_yyy and place into the new 'tunnel_flags' field. This has 2 drawbacks: - The BPF_F_yyy flags are from set_tunnel_key enumeration space, e.g. BPF_F_ZERO_CSUM_TX. It is awkward that it is "returned" into tunnel_flags from a get_tunnel_key call. - Not all "interesting" TUNNEL_xxx flags can be mapped to existing BPF_F_yyy flags, and it doesn't make sense to create new BPF_F_yyy flags just for purposes of the returned tunnel_flags. 2. Place key.tun_flags into 'tunnel_flags' but mask them, keeping only "interesting" flags. That's ok, but the drawback is that what's "interesting" for my usecase might be limiting for other usecases. Therefore I decided to expose what's in key.tun_flags as is, which seems most flexible. The BPF user can just choose to ignore bits he's not interested in. The TUNNEL_xxx are also UAPI, so no harm exposing them back in the get_tunnel_key call. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831144010.174110-1-shmulik.ladkani@gmail.com	2022-09-02 15:20:55 +02:00
Shung-Hsi Yu	dc84dbbcc9	bpf, tnums: Warn against the usage of tnum_in(tnum_range(), ...) Commit `a657182a5c` ("bpf: Don't use tnum_range on array range checking for poke descriptors") has shown that using tnum_range() as argument to tnum_in() can lead to misleading code that looks like tight bound check when in fact the actual allowed range is much wider. Document such behavior to warn against its usage in general, and suggest some scenario where result can be trusted. Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net Link: https://www.openwall.com/lists/oss-security/2022/08/26/1 Link: https://lore.kernel.org/bpf/20220831031907.16133-3-shung-hsi.yu@suse.com Link: https://lore.kernel.org/bpf/20220831031907.16133-2-shung-hsi.yu@suse.com	2022-09-02 14:44:54 +02:00
Jakub Kicinski	c3f760ef12	net: remove netif_tx_napi_add() All callers are now gone. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-02 12:41:43 +01:00
Eric Dumazet	977f1aa5e4	net: bql: add more documentation Add some documentation for netdev_tx_sent_queue() and netdev_tx_completed_queue() Stating that netdev_tx_completed_queue() must be called once per TX completion round is apparently not obvious for everybody. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-02 12:27:46 +01:00
David S. Miller	25de4a0b7b	Merge branch 'net-ipa-transaction-state-IDs' Alex Elder says: ==================== net: ipa: use IDs to track transaction state This series is the first of three groups of changes that simplify the way the IPA driver tracks the state of its transactions. Each GSI channel has a fixed number of transactions allocated at initialization time. The number allocated matches the number of TREs in the transfer ring associated with the channel. This is because the transfer ring limits the number of transfers that can ever be underway, and in the worst case, each transaction represents a single TRE. Transactions go through various states during their lifetime. Currently a set of lists keeps track of which transactions are in each state. Initially, all transactions are free. An allocated transaction is placed on the allocated list. Once an allocated transaction is committed, it is moved from the allocated to the committed list. When a committed transaction is sent to hardware (via a doorbell) it is moved to the pending list. When hardware signals that some work has completed, transactions are moved to the completed list. Finally, when a completed transaction is polled it's moved to the polled list before being removed when it becomes free. Changing a transaction's state thus normally involves manipulating two lists, and to prevent corruption a spinlock is held while the lists are updated. Transactions move through their states in a well-defined sequence though, and they do so strictly in order. So transaction 0 is always allocated before transaction 1; transaction 0 is always committed before transaction 1; and so on, through completion, polling, and becoming free. Because of this, it's sufficient to just keep track of which transaction is the first in each state. The rest of the transactions in a given state can be derived from the first transaction in an "adjacent" state. As a result, we can track the state of all transactions with a set of indexes, and can update these without the need for a spinlock. This first group of patches just defines the set of indexes that will be used for this new way of tracking transaction state. Two more groups of patches will follow. I've broken the 17 patches into these three groups to facilitate review. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-09-02 12:08:44 +01:00

... 2 3 4 5 6 ...

1123129 Commits