linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-01 10:13:58 +08:00

Author	SHA1	Message	Date
M Chetan Kumar	b46c5795d6	net: wwan: iosm: endianness type correction Endianness type correction for nr_of_bytes. This field is exchanged as part of host-device protocol communication. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:28:55 +01:00
M Chetan Kumar	5a7c1b2a5b	net: wwan: iosm: fix lkp buildbot warning Correct td buffer type casting & format specifier to fix lkp buildbot warning. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:28:55 +01:00
David S. Miller	839454801e	Merge branch 'ipa-runtime-pm' Alex Elder says: ==================== net: ipa: more work toward runtime PM The first two patches in this series are basically bug fixes, but in practice I don't think we've seen the problems they might cause. The third patch moves clock and interconnect related error messages around a bit, reporting better information and doing so in the functions where they are enabled or disabled (rather than those functions' callers). The last three patches move power-related code into "ipa_clock.c", as a step toward generalizing the purpose of that source file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:05 +01:00
Alex Elder	afb08b7e22	net: ipa: move IPA flags field The ipa->flags field is only ever used in "ipa_clock.c", related to suspend/resume activity. Move the definition of the ipa_flag enumerated type to "ipa_clock.c". And move the flags field from the ipa structure and to the ipa_clock structure. Rename the type and its values to include "power" or "POWER" in the name. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:05 +01:00
Alex Elder	afe1baa82d	net: ipa: move ipa_suspend_handler() Move ipa_suspend_handler() into "ipa_clock.c" from "ipa_main.c", to group with the rest of the suspend/resume code. This IPA interrupt is triggered if an IPA RX endpoint is suspended but has a packet to be delivered. Introduce ipa_power_setup() and ipa_power_teardown() to add and remove the handler for the IPA SUSPEND interrupt at the same place as before, while allowing the handler to remain private. The "power" naming convention will be adopted elsewhere in this file as well (soon). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:05 +01:00
Alex Elder	73ff316dac	net: ipa: move IPA power operations to ipa_clock.c Move ipa_suspend() and ipa_resume(), as well as the definition of the ipa_pm_ops structure into "ipa_clock.c". Make ipa_pm_ops public and declare it as extern in "ipa_clock.h". This is part of centralizing IPA power management functionality into "ipa_clock.c" (the file will eventually get a name change). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:05 +01:00
Alex Elder	8ee7c40a25	net: ipa: improve IPA clock error messages Rearrange messages reported when errors occur in the IPA clock code, so that the specific interconnect is identified when an error occurs enabling or disabling it, or the core clock is indicated when an error occurs enabling it. Have ipa_interconnect_disable() return zero or the negative error value returned by the first interconnect that produced an error when disabled. For now, the callers ignore the returned value. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:04 +01:00
Alex Elder	10cc73c4b7	net: ipa: reorder netdev pointer assignments Assign the ipa->modem_netdev and endpoint->netdev pointers before registering the network device. As soon as the device is registered it can be opened, and by that time we'll want those pointers valid. Similarly, don't make those pointers NULL until after the modem network device is unregistered in ipa_modem_stop(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:04 +01:00
Alex Elder	30c2515b89	net: ipa: don't suspend/resume modem if not up The modem network device is set up by ipa_modem_start(). But its TX queue is not actually started and endpoints enabled until it is opened. So avoid stopping the modem network device TX queue and disabling endpoints on suspend or stop unless the netdev is marked UP. And skip attempting to resume unless it is UP. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:27:04 +01:00
David S. Miller	1f52247ef8	Merge branch 'sja1105-H' Vladimir Oltean says: ==================== NXP SJA1105 driver support for "H" switch topologies Changes in v3: Preserve the behavior of dsa_tree_setup_default_cpu() which is to pick the first CPU port and not the last. Changes in v2: Send as non-RFC, drop the patches for discarding DSA-tagged packets on user ports and DSA-untagged packets on DSA and CPU ports for now. NXP builds boards like the Bluebox 3 where there are multiple SJA1110 switches connected to an LX2160A, but they are also connected to each other. I call this topology an "H" tree because of the lateral connection between switches. A piece extracted from a non-upstream device tree looks like this:
    &spi_bridge {
        /* SW1 */
        ethernet-switch@0 {
            compatible = "nxp,sja1110a";
            reg = <0>;
            dsa,member = <0 0>;
            ethernet-ports {
                #address-cells = <1>;
                #size-cells = <0>;
                /* SW1_P1 */
                port@1 {
                    reg = <1>;
                    label = "con_2x20";
                    phy-mode = "sgmii";
                    fixed-link {
                        speed = <1000>;
                        full-duplex;
                    };
                };
                port@2 {
                    reg = <2>;
                    ethernet = <&dpmac17>;
                    phy-mode = "rgmii-id";
                    fixed-link {
                        speed = <1000>;
                        full-duplex;
                    };
                };
                port@3 {
                    reg = <3>;
                    label = "1ge_p1";
                    phy-mode = "rgmii-id";
                    phy-handle = <&sw1_mii3_phy>;
                };
                sw1p4: port@4 {
                    reg = <4>;
                    link = <&sw2p1>;
                    phy-mode = "sgmii";
                    fixed-link {
                        speed = <1000>;
                        full-duplex;
                    };
                };
                port@5 {
                    reg = <5>;
                    label = "trx1";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port5_base_t1_phy>;
                };
                port@6 {
                    reg = <6>;
                    label = "trx2";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port6_base_t1_phy>;
                };
                port@7 {
                    reg = <7>;
                    label = "trx3";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port7_base_t1_phy>;
                };
                port@8 {
                    reg = <8>;
                    label = "trx4";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port8_base_t1_phy>;
                };
                port@9 {
                    reg = <9>;
                    label = "trx5";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port9_base_t1_phy>;
                };
                port@a {
                    reg = <10>;
                    label = "trx6";
                    phy-mode = "internal";
                    phy-handle = <&sw1_port10_base_t1_phy>;
                };
            };
        };
        /* SW2 */
        ethernet-switch@2 {
            compatible = "nxp,sja1110a";
            reg = <2>;
            dsa,member = <0 1>;
            ethernet-ports {
                #address-cells = <1>;
                #size-cells = <0>;
                sw2p1: port@1 {
                    reg = <1>;
                    link = <&sw1p4>;
                    phy-mode = "sgmii";
                    fixed-link {
                        speed = <1000>;
                        full-duplex;
                    };
                };
                port@2 {
                    reg = <2>;
                    ethernet = <&dpmac18>;
                    phy-mode = "rgmii-id";
                    fixed-link {
                        speed = <1000>;
                        full-duplex;
                    };
                };
                port@3 {
                    reg = <3>;
                    label = "1ge_p2";
                    phy-mode = "rgmii-id";
                    phy-handle = <&sw2_mii3_phy>;
                };
                port@4 {
                    reg = <4>;
                    label = "to_sw3";
                    phy-mode = "2500base-x";
                    fixed-link {
                        speed = <2500>;
                        full-duplex;
                    };
                };
                port@5 {
                    reg = <5>;
                    label = "trx7";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port5_base_t1_phy>;
                };
                port@6 {
                    reg = <6>;
                    label = "trx8";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port6_base_t1_phy>;
                };
                port@7 {
                    reg = <7>;
                    label = "trx9";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port7_base_t1_phy>;
                };
                port@8 {
                    reg = <8>;
                    label = "trx10";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port8_base_t1_phy>;
                };
                port@9 {
                    reg = <9>;
                    label = "trx11";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port9_base_t1_phy>;
                };
                port@a {
                    reg = <10>;
                    label = "trx12";
                    phy-mode = "internal";
                    phy-handle = <&sw2_port10_base_t1_phy>;
                };
            };
        };
    };
Basically it is a single DSA tree with 2 "ethernet" properties, i.e. a multi-CPU-port system. There is also a DSA link between the switches, but it is not a daisy chain topology, i.e. there is no "upstream" and "downstream" switch, the DSA link is only to be used for the bridge data plane (autonomous forwarding between switches, between the RJ-45 ports and the automotive Ethernet ports), otherwise all traffic that should reach the host should do so through the dedicated CPU port of the switch. Of course, plain forwarding in this topology is bound to create packet loops. I have thought long and hard about strategies to cut forwarding in such a way as to prevent loops but also not impede normal operation of the network on such a system, and I believe I have found a solution that does work as expected.
This relies heavily on DSA's recent ability to perform RX filtering towards the host by installing MAC addresses as static FDB entries. Since we have 2 distinct DSA masters, we have 2 distinct MAC addresses, and if the bridge is configured to have its own MAC address that makes it 3 distinct MAC addresses. The bridge core, plus the switchdev_handle_fdb_add_to_device() extension, handle each MAC address by replicating it to each port of the DSA switch tree. So the end result is that both switch 1 and switch 2 will have static FDB entries towards their respective CPU ports for the 3 MAC addresses corresponding to the DSA masters and to the bridge net device (and of course, towards any station learned on a foreign interface). So I think the basic design works, and it is basically just as fragile as any other multi-CPU-port system is bound to be in terms of reliance on static FDB entries towards the host (if hardware address learning on the CPU port is to be used, MAC addresses would randomly bounce between one CPU port and the other otherwise). In fact, I think it is even better to start DSA's support of multi-CPU-port systems with something small like the NXP Bluebox 3, because we allow some time for the code paths like dsa_switch_host_address_match(), which were specifically designed for it, to break in, and this board needs no user space configuration of CPU ports, like static assignments between user and CPU ports, or bonding between the CPU ports/DSA masters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	81d45898a5	net: dsa: sja1105: enable address learning on cascade ports Right now, address learning is disabled on DSA ports, which means that a packet received over a DSA port from a cross-chip switch will be flooded to unrelated ports. It is desirable to eliminate that, but for that we need a breakdown of the possibilities for the sja1105 driver. A DSA port can be: - a downstream-facing cascade port. This is simple because it will always receive packets from a downstream switch, and there should be no other route to reach that downstream switch in the first place, which means it should be safe to learn that MAC address towards that switch. - an upstream-facing cascade port. This receives packets either: * autonomously forwarded by an upstream switch (and therefore these packets belong to the data plane of a bridge, so address learning should be ok), or * injected from the CPU. This deserves further discussion, as normally, an upstream-facing cascade port is no different than the CPU port itself. But with "H" topologies (a DSA link towards a switch that has its own CPU port), these are more "laterally-facing" cascade ports than they are "upstream-facing". Here, there is a risk that the port might learn the host addresses on the wrong port (on the DSA port instead of on its own CPU port), but this is solved by DSA's RX filtering infrastructure, which installs the host addresses as static FDB entries on the CPU port of all switches in a "H" tree. So even if there will be an attempt from the switch to migrate the FDB entry from the CPU port to the laterally-facing cascade port, it will fail to do that, because the FDB entry that already exists is static and cannot migrate. So address learning should be safe for this configuration too. Ok, so what about other MAC addresses coming from the host, not necessarily the bridge local FDB entries? 
What about MAC addresses dynamically learned on foreign interfaces, isn't there a risk that cascade ports will learn these entries dynamically when they are supposed to be delivered towards the CPU port? Well, that is correct, and this is why we also need to enable the assisted learning feature, to snoop for these addresses and write them to hardware as static FDB entries towards the CPU, to make the switch's learning process on the cascade ports ineffective for them. With assisted learning enabled, the hardware learning on the CPU port must be disabled. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	0f9b762c09	net: dsa: sja1105: suppress TX packets from looping back in "H" topologies H topologies like this one have a problem:
            eth0                                                     eth1
              |                                                        |
           CPU port                                                CPU port
              |                        DSA link                        |
     sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
       |             |      |                            |      |             |
     user          user   user                         user   user          user
     port          port   port                         port   port          port
Basically any packet sent by the eth0 DSA master can be flooded on the interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the eth1 DSA master too. Basically we are talking to ourselves. In VLAN-unaware mode, these packets are encoded using a tag_8021q TX VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains. Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN which _can_ be decoded by the tagger running on eth1, so it will attempt to reinject that packet into the network stack (the bridge, if there is any port under eth1 that is under a bridge). In the case where the ports under eth1 are under the same cross-chip bridge as the ports under eth0, the TX packets will even be learned as RX packets. The only thing that will prevent loops with the software bridging path, and therefore disaster, is that the source port and the destination port are in the same hardware domain, and the bridge will receive packets from the driver with skb->offload_fwd_mark = true and will not forward between the two. The proper solution to this problem is to detect H topologies and enforce that all packets are received through the local switch and we do not attempt to receive packets on our CPU port from switches that have their own. This is a viable solution which works thanks to the fact that MAC addresses which should be filtered towards the host are installed by DSA as static MAC addresses towards the CPU port of each switch.
TX from a CPU port towards the DSA port continues to be allowed, this is because sja1105 supports bridge TX forwarding offload, and the skb->dev used initially for xmit does not have any direct correlation with where the station that will respond to that packet is connected. It may very well happen that when we send a ping through a br0 interface that spans all switch ports, the xmit packet will exit the system through a DSA switch interface under eth1 (say sw1p2), but the destination station is connected to a switch port under eth0, like sw0p0. So the switch under eth1 needs to communicate on TX with the switch under eth0. The response, however, will not follow the same path, but instead, this patch enforces that the response is sent by the first switch directly to its DSA master which is eth0. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	777e55e30d	net: dsa: sja1105: increase MTU to account for VLAN header on DSA ports Since all packets are transmitted as VLAN-tagged over a DSA link (this VLAN tag represents the tag_8021q header), we need to increase the MTU of these interfaces to account for the possibility that we are already transporting a user-visible VLAN header. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	c513002980	net: dsa: sja1105: manage VLANs on cascade ports Since commit `ed040abca4` ("net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic"), this driver uses a reserved value as pvid for the host port (DSA CPU port). Control packets which are sent as untagged get classified to this VLAN, and all ports are members of it (this is to be expected for control packets). Manage all cascade ports in the same way and allow control packets to egress everywhere. Also, all VLANs need to be sent as egress-tagged on all cascade ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	3fa212707b	net: dsa: sja1105: manage the forwarding domain towards DSA ports Manage DSA links towards other switches, be they host ports or cascade ports, the same as the CPU port, i.e. allow forwarding and flooding unconditionally from all user ports. We send packets as always VLAN-tagged on a DSA port, and we rely on the cross-chip notifiers from tag_8021q to install the RX VLAN of a switch port only on the proper remote ports of another switch (the ports that are in the same bridging domain). So if there is no cross-chip bridging in the system, the flooded packets will be sent on the DSA ports too, but they will be dropped by the remote switches due to either (a) a lack of the RX VLAN in the VLAN table of the ingress DSA port, or (b) a lack of valid destinations for those packets, due to a lack of the RX VLAN on the user ports of the switch. Note that switches which only transport packets in a cross-chip bridge, but have no user ports of their own as part of that bridge, such as switch 1 in this case:
                        DSA link                   DSA link
      sw0p0 sw0p1 sw0p2 -------- sw1p0 sw1p2 sw1p3 -------- sw2p0 sw2p2 sw2p3
    ip link set sw0p0 master br0
    ip link set sw2p3 master br0
will still work, because the tag_8021q cross-chip notifiers keep the RX VLANs installed on all DSA ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	30a100e60c	net: dsa: sja1105: configure the cascade ports based on topology The sja1105 switch family has a feature called "cascade ports" which can be used in topologies where multiple SJA1105/SJA1110 switches are daisy chained. Upstream switches set this bit for the DSA link towards the downstream switches. This is used when the upstream switch receives a control packet (PTP, STP) from a downstream switch, because if the source port for a control packet is marked as a cascade port, then the source port, switch ID and RX timestamp will not be taken again on the upstream switch, it is assumed that this has already been done by the downstream switch (the leaf port in the tree) and that the CPU has everything it needs to decode the information from this packet. We need to distinguish between an upstream-facing DSA link and a downstream-facing DSA link, because the upstream-facing DSA links are "host ports" for the SJA1105/SJA1110 switches, and the downstream-facing DSA links are "cascade ports". Note that SJA1105 supports a single cascade port, so only daisy chain topologies work. With SJA1110, there can be more complex topologies such as:
                       eth0
                         |
                     host port
                         |
     sw0p0    sw0p1    sw0p2    sw0p3    sw0p4
       |        |                 |        |
    cascade  cascade             user     user
      port     port              port     port
       |        |
       |        |
       |        |
       |       host
       |       port
       |        |
       |      sw1p0    sw1p1    sw1p2    sw1p3    sw1p4
       |                 |        |        |        |
       |                user     user     user     user
      host              port     port     port     port
      port
       |
     sw2p0    sw2p1    sw2p2    sw2p3    sw2p4
                |        |        |        |
               user     user     user     user
               port     port     port     port
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Vladimir Oltean	2c0b03258b	net: dsa: give preference to local CPU ports Be there an "H" switch topology, where there are 2 switches connected as follows:
            eth0                                                     eth1
              |                                                        |
           CPU port                                                CPU port
              |                        DSA link                        |
     sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
       |             |      |                            |      |             |
     user          user   user                         user   user          user
     port          port   port                         port   port          port
basically one where each switch has its own CPU port for termination, but there is also a DSA link in case packets need to be forwarded in hardware between one switch and another. DSA insists to see this as a daisy chain topology, basically registering all network interfaces as sw0p0@eth0, ... sw1p0@eth0 and disregarding eth1 as a valid DSA master. This is only half the story, since when asked using dsa_port_is_cpu(), DSA will respond that sw1p1 is a CPU port, however one which has no dp->cpu_dp pointing to it. So sw1p1 is enabled, but not used. Furthermore, be there a driver for switches which support only one upstream port. This driver iterates through its ports and checks using dsa_is_upstream_port() whether the current port is an upstream one. For switch 1, two ports pass the "is upstream port" checks:
- sw1p4 is an upstream port because it is a routing port towards the dedicated CPU port assigned using dsa_tree_setup_default_cpu()
- sw1p1 is also an upstream port because it is a CPU port, albeit one that is disabled. This is because dsa_upstream_port() returns:
    if (!cpu_dp)
        return port;
which means that if @dp does not have a ->cpu_dp pointer (which is a characteristic of CPU ports themselves as well as unused ports), then @dp is its own upstream port. So the driver for switch 1 rightfully says: I have two upstream ports, but I don't support multiple upstream ports! So let me error out, I don't know which one to choose and what to do with the other one. Generally I am against enforcing any default policy in the kernel in terms of user to CPU port assignment (like round robin or such) but this case is different.
To solve the conundrum, one would have to: - Disable sw1p1 in the device tree or mark it as "not a CPU port" in order to comply with DSA's view of this topology as a daisy chain, where the termination traffic from switch 1 must pass through switch 0. This is counter-productive because it wastes 1Gbps of termination throughput in switch 1. - Disable the DSA link between sw0p4 and sw1p4 and do software forwarding between switch 0 and 1, and basically treat the switches as part of disjoint switch trees. This is counter-productive because it wastes 1Gbps of autonomous forwarding throughput between switch 0 and 1. - Treat sw0p4 and sw1p4 as user ports instead of DSA links. This could work, but it makes cross-chip bridging impossible. In this setup we would need to have 2 separate bridges, br0 spanning the ports of switch 0, and br1 spanning the ports of switch 1, and the "DSA links treated as user ports" sw0p4 (part of br0) and sw1p4 (part of br1) are the gateway ports between one bridge and another. This is hard to manage from a user's perspective, who wants to have a unified view of the switching fabric and the ability to transparently add ports to the same bridge. VLANs would also need to be explicitly managed by the user on these gateway ports. So it seems that the only reasonable thing to do is to make DSA prefer CPU ports that are local to the switch. Meaning that by default, the user and DSA ports of switch 0 will get assigned to the CPU port from switch 0 (sw0p1) and the user and DSA ports of switch 1 will get assigned to the CPU port from switch 1. The way this solves the problem is that sw1p4 is no longer an upstream port as far as switch 1 is concerned (it no longer views sw0p1 as its dedicated CPU port). So here we are, the first multi-CPU port that DSA supports is also perhaps the most uneventful one: the individual switches don't support multiple CPUs, however the DSA switch tree as a whole does have multiple CPU ports. 
No user space assignment of user ports to CPU ports is desirable, necessary, or possible. Ports that do not have a local CPU port (say there was an extra switch hanging off of sw0p0) default to the standard implementation of getting assigned to the first CPU port of the DSA switch tree. Is that good enough? Probably not (if the downstream switch was hanging off of switch 1, we would most certainly prefer its CPU port to be sw1p1), but in order to support that use case too, we would need to traverse the dst->rtable in search of an optimum dedicated CPU port, one that has the smallest number of hops between dp->ds and dp->cpu_dp->ds. At the moment, the DSA routing table structure does not keep the number of hops between dl->dp and dl->link_dp, and while it is probably deducible, there is zero justification to write that code now. Let's hope DSA will never have to support that use case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
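The assignment policy described in this commit message (prefer a CPU port on the same switch, otherwise fall back to the first CPU port of the tree) can be sketched in user space as follows. This is only an illustrative model: `struct port`, `local_cpu_port()`, `first_cpu_port()` and `assign_cpu_ports()` are hypothetical names and do not correspond to the kernel's `struct dsa_port` or to the actual DSA code.

```c
#include <assert.h>
#include <stddef.h>

enum port_type { PORT_USER, PORT_DSA, PORT_CPU };

/* Hypothetical, simplified model of a switch port; not struct dsa_port. */
struct port {
	int switch_id;            /* switch this port belongs to */
	enum port_type type;
	const struct port *cpu;   /* CPU port assigned to this port, if any */
};

/* Return the first CPU port located on the given switch, or NULL. */
static const struct port *local_cpu_port(const struct port *ports, size_t n,
					 int switch_id)
{
	for (size_t i = 0; i < n; i++)
		if (ports[i].type == PORT_CPU && ports[i].switch_id == switch_id)
			return &ports[i];
	return NULL;
}

/* Return the first CPU port of the whole tree, or NULL. */
static const struct port *first_cpu_port(const struct port *ports, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ports[i].type == PORT_CPU)
			return &ports[i];
	return NULL;
}

/*
 * Assign a CPU port to every user and DSA port: a CPU port local to the
 * same switch wins, otherwise the tree-wide first CPU port is used.
 * CPU ports themselves get no assignment.
 */
static void assign_cpu_ports(struct port *ports, size_t n)
{
	assert(ports != NULL);

	const struct port *fallback = first_cpu_port(ports, n);

	for (size_t i = 0; i < n; i++) {
		if (ports[i].type == PORT_CPU) {
			ports[i].cpu = NULL;
			continue;
		}
		const struct port *local =
			local_cpu_port(ports, n, ports[i].switch_id);
		ports[i].cpu = local ? local : fallback;
	}
}
```

In this model, sw1p4 is assigned to the CPU port of switch 1 rather than to sw0p1, which is the property the commit message relies on to stop sw1p4 from being seen as an upstream port.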
Vladimir Oltean	0e8eb9a16e	net: dsa: rename teardown_default_cpu to teardown_cpu_ports There is nothing specific to having a default CPU port to what dsa_tree_teardown_default_cpu() does. Even with multiple CPU ports, it would do the same thing: iterate through the ports of this switch tree and reset the ->cpu_dp pointer to NULL. So rename it accordingly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:05:48 +01:00
Alex Elder	0fd75f5760	net: ipa: fix IPA v4.9 interconnects Three interconnects are defined for IPA version 4.9, but there should only be two. They should also use names that match what's used for other platforms (and specified in the Device Tree binding). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 11:01:16 +01:00
Colin Ian King	df7ba0eb25	mctp: remove duplicated assignment of pointer hdr The pointer hdr is being initialized and also re-assigned with the same value from the call to function mctp_hdr. Static analysis reports that the initialized value is unused. The second assignment is duplicated and can be removed. Addresses-Coverity: ("Unused value"). Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 10:56:01 +01:00
Sean Christopherson	d5aaad6f83	KVM: x86/mmu: Fix per-cpu counter corruption on 32-bit builds Take a signed 'long' instead of an 'unsigned long' for the number of pages to add/subtract to the total number of pages used by the MMU. This fixes a zero-extension bug on 32-bit kernels that effectively corrupts the per-cpu counter used by the shrinker. Per-cpu counters take a signed 64-bit value on both 32-bit and 64-bit kernels, whereas kvm_mod_used_mmu_pages() takes an unsigned long and thus an unsigned 32-bit value on 32-bit kernels. As a result, the value used to adjust the per-cpu counter is zero-extended (unsigned -> signed), not sign-extended (signed -> signed), and so KVM's intended -1 gets morphed to 4294967295 and effectively corrupts the counter. This was found by a staggering amount of sheer dumb luck when running kvm-unit-tests on a 32-bit KVM build. The shrinker just happened to kick in while running tests and do_shrink_slab() logged an error about trying to free a negative number of objects. The truly lucky part is that the kernel just happened to be a slightly stale build, as the shrinker no longer yells about negative objects as of commit `18bb473e50` ("mm: vmscan: shrink deferred objects proportional to priority"). vmscan: shrink_slab: mmu_shrink_scan+0x0/0x210 [kvm] negative objects to delete nr=-858993460 Fixes: `bc8a3d8925` ("kvm: mmu: Fix overflow on kvm mmu page limit calculation") Cc: stable@vger.kernel.org Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210804214609.1096003-1-seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-08-05 03:33:56 -04:00
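The difference between zero-extension and sign-extension described in this commit message can be reproduced in user space. The following is a minimal sketch, assuming `uint32_t` and `int32_t` stand in for `unsigned long` and `long` on a 32-bit kernel; the type and function names are hypothetical and this is not the KVM code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: on a 32-bit kernel, 'unsigned long' and 'long' are 32 bits. */
typedef uint32_t ulong32;
typedef int32_t long32;

static_assert(sizeof(ulong32) == 4, "models a 32-bit unsigned long");

/* Buggy shape: an unsigned 32-bit value is zero-extended to 64 bits. */
static int64_t widen_unsigned(ulong32 nr_pages)
{
	return (int64_t)nr_pages;
}

/* Fixed shape: a signed 32-bit value is sign-extended to 64 bits. */
static int64_t widen_signed(long32 nr_pages)
{
	return (int64_t)nr_pages;
}
```

Passing -1 to `widen_unsigned()` yields 4294967295, the value quoted in the commit message, while `widen_signed()` preserves -1.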
Stanislav Fomichev	372642ea83	selftests/bpf: Move netcnt test under test_progs Rewrite to skel and ASSERT macros as well while we are at it. v3: - replace -f with -A to make it work with busybox ping. -A is available on both busybox and iputils, from the man page: On networks with low RTT this mode is essentially equivalent to flood mode. v2: - don't check result of bpf_map__fd (Yonghong Song) - remove from .gitignore (Andrii Nakryiko) - move ping_command into network_helpers (Andrii Nakryiko) - remove assert() (Andrii Nakryiko) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210804205524.3748709-1-sdf@google.com	2021-08-04 16:18:48 -07:00
Matthew Cover	34ad6d9d8c	bpf, samples: Add missing mprog-disable to xdp_redirect_cpu's optstring Commit `ce4dade7f1` ("samples/bpf: xdp_redirect_cpu: Load a eBPF program on cpumap") added the following option, but missed adding it to optstring: - mprog-disable: disable loading XDP program on cpumap entries Fix it and add the missing option character. Fixes: `ce4dade7f1` ("samples/bpf: xdp_redirect_cpu: Load a eBPF program on cpumap") Signed-off-by: Matthew Cover <matthew.cover@stackpath.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210731005632.13228-1-matthew.cover@stackpath.com	2021-08-05 00:41:13 +02:00
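The class of bug fixed here can be illustrated with `getopt_long()`: a long option mapped to a short character still parses in its long form, but the short form is rejected unless the character is also listed in optstring. The sketch below is a reduced, hypothetical option table, not the sample's real one, and the choice of `'n'` as the short character is an assumption for illustration.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical, reduced option table; 'n' is an assumed short character. */
static const struct option long_options[] = {
	{ "help",          no_argument, NULL, 'h' },
	{ "mprog-disable", no_argument, NULL, 'n' },
	{ NULL,            0,           NULL, 0   }
};

/*
 * Parse the first option in argv with the given optstring and return what
 * getopt_long() reports: the option character, or '?' if it is unknown.
 */
static int first_option(const char *optstring, int argc, char *const argv[])
{
	assert(argv != NULL);

	opterr = 0; /* suppress getopt's own error message */
	optind = 1; /* restart scanning from the first argument */
	return getopt_long(argc, argv, optstring, long_options, NULL);
}
```

With an optstring of `"h"`, `-n` comes back as `'?'` while `--mprog-disable` still returns `'n'`; with `"hn"`, both forms are accepted.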
Andrii Nakryiko	6d4eb36d65	bpf: Fix bpf_prog_test_run_xdp logic after incorrect merge resolution During recent net into net-next merge ([0]) a piece of old logic ([1]) got reintroduced accidentally while resolving merge conflict between bpf's [2] and bpf-next's [3]. This check was removed in bpf-next tree to allow extra ctx_in parameter passed for XDP test runs. Reinstating the check breaks bpf_prog_test_run_xdp logic and causes a corresponding xdp_context_test_run selftest failure. Fix by removing the check and allow ctx_in for XDP test runs. [0] `5af84df962` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") [1] `947e8b595b` ("bpf: explicitly prohibit ctx_{in, out} in non-skb BPF_PROG_TEST_RUN") [2] `5e21bb4e81` ("bpf, test: fix NULL pointer dereference on invalid expected_attach_type") [3] `47316f4a30` ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN") Fixes: `5af84df962` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2021-08-04 23:55:00 +02:00
Hui Su	1c0cec64a7	scripts/tracing: fix the bug that can't parse raw_trace_func Since commit `77271ce4b2` ("tracing: Add irq, preempt-count and need resched info to default trace output"), the default trace output format has been changed to:
    <idle>-0 [009] d.h. 22420.068695: _raw_spin_lock_irqsave <-hrtimer_interrupt
    <idle>-0 [000] ..s. 22420.068695: _nohz_idle_balance <-run_rebalance_domains
    <idle>-0 [011] d.h. 22420.068695: account_process_tick <-update_process_times
The original trace output format (before v3.2.0) was:
    # tracer: nop
    #
    #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
    #              | |       |          |         |
         migration/0-6     [000]    50.025810: rcu_note_context_switch <-__schedule
         migration/0-6     [000]    50.025812: trace_rcu_utilization <-rcu_note_context_switch
         migration/0-6     [000]    50.025813: rcu_sched_qs <-rcu_note_context_switch
         migration/0-6     [000]    50.025815: rcu_preempt_qs <-rcu_note_context_switch
         migration/0-6     [000]    50.025817: trace_rcu_utilization <-rcu_note_context_switch
         migration/0-6     [000]    50.025818: debug_lockdep_rcu_enabled <-__schedule
         migration/0-6     [000]    50.025820: debug_lockdep_rcu_enabled <-__schedule
draw_functrace.py (introduced in v2.6.28) cannot parse the new trace_func format, so we need to modify draw_functrace.py to adapt to the new trace output format. Link: https://lkml.kernel.org/r/20210611022107.608787-1-suhui@zeku.com Cc: stable@vger.kernel.org Fixes: `77271ce4b2` tracing: Add irq, preempt-count and need resched info to default trace output Signed-off-by: Hui Su <suhui@zeku.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-08-04 17:49:26 -04:00
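The two line formats quoted in this commit message differ by one extra column (the irq/preempt flags such as `d.h.`). A parser that accepts both can try the newer layout first and fall back to the older one. The sketch below shows that idea with `sscanf()`; it is not the code of draw_functrace.py (which is a Python script), `struct trace_entry` and `parse_trace_line()` are hypothetical names, and it assumes task names contain no spaces.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one function-trace line. */
struct trace_entry {
	char task[64];    /* "TASK-PID", e.g. "<idle>-0" */
	int cpu;
	char flags[16];   /* irq/preempt flags; empty for the old format */
	double timestamp;
	char func[64];
	char caller[64];
};

/* Return 1 if the line matches the new or the old format, 0 otherwise. */
static int parse_trace_line(const char *line, struct trace_entry *e)
{
	assert(line != NULL && e != NULL);

	/* New format: TASK-PID [CPU] FLAGS TIMESTAMP: FUNC <-CALLER */
	if (sscanf(line, " %63s [%d] %15s %lf: %63s <-%63s",
		   e->task, &e->cpu, e->flags, &e->timestamp,
		   e->func, e->caller) == 6)
		return 1;

	/* Old format: TASK-PID [CPU] TIMESTAMP: FUNC <-CALLER */
	e->flags[0] = '\0';
	if (sscanf(line, " %63s [%d] %lf: %63s <-%63s",
		   e->task, &e->cpu, &e->timestamp,
		   e->func, e->caller) == 5)
		return 1;

	return 0;
}
```

Header lines beginning with `#` match neither pattern and are rejected, so a caller can simply skip lines for which the function returns 0.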
Nathan Chancellor	b18b851ba8	scripts/recordmcount.pl: Remove check_objcopy() and $can_use_local When building ARCH=riscv allmodconfig with llvm-objcopy, the objcopy version warning from this script appears: WARNING: could not find objcopy version or version is less than 2.17. Local function references are disabled. The check_objcopy() function in scripts/recordmcount.pl is set up to parse GNU objcopy's version string, not llvm-objcopy's, which triggers the warning. Commit `799c434154` ("kbuild: thin archives make default for all archs") made binutils 2.20 mandatory and commit `ba64beb174` ("kbuild: check the minimum assembler version in Kconfig") enforces this at configuration time so just remove check_objcopy() and $can_use_local instead, assuming --globalize-symbol is always available. llvm-objcopy has supported --globalize-symbol since LLVM 7.0.0 in 2018 and the minimum version for building the kernel with LLVM is 10.0.1 so there is no issue introduced: Link: `ee5be798da` Link: https://lkml.kernel.org/r/20210802210307.3202472-1-nathan@kernel.org Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-08-04 17:49:26 -04:00
Masami Hiramatsu	a9d10ca498	tracing: Reject string operand in the histogram expression Since the string type cannot be the target of an addition / subtraction operation, it must be rejected. Without this fix, the string type is silently converted to digits. Link: https://lkml.kernel.org/r/162742654278.290973.1523000673366456634.stgit@devnote2 Cc: stable@vger.kernel.org Fixes: `100719dcef` ("tracing: Add simple expression support to hist triggers") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-08-04 17:49:26 -04:00
Steven Rostedt (VMware)	2c05caa7ba	tracing / histogram: Give calculation hist_fields a size When working on my user space applications, I found a bug in the synthetic event code where the automated synthetic event field was not matching the event field calculation it was attached to. Looking deeper into it, it was because the calculation hist_field was not given a size. The synthetic event fields are matched to their hist_fields either by having the field have an identical string type, or if that does not match, then the size and signed values are used to match the fields. The problem arose when I tried to match a calculation where the fields were "unsigned int". My tool created a synthetic event of type "u32". But it failed to match. The string was: diff=field1-field2:onmatch(event).trace(synth,$diff) Adding debugging into the kernel, I found that the size of "diff" was 0. And since it was given "unsigned int" as a type, the histogram fallback code used size and signed. The signed matched, but the size of u32 (4) did not match zero, and the event failed to be created. This can be worse if the field you want to match is not one of the acceptable fields for a synthetic event. As event fields can have any type that is supported in Linux, this can cause an issue. For example, if a type is an enum. Then there's no way to use that with any calculations. Have the calculation field simply take on the size of what it is calculating. Link: https://lkml.kernel.org/r/20210730171951.59c7743f@oasis.local.home Cc: Tom Zanussi <zanussi@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: `100719dcef` ("tracing: Add simple expression support to hist triggers") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-08-04 17:48:41 -04:00
Sebastian Andrzej Siewior	372bbdd5bb	net: Replace deprecated CPU-hotplug functions. The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-04 13:47:50 -07:00
Sebastian Andrzej Siewior	a0d1d0f47e	virtio_net: Replace deprecated CPU-hotplug functions. The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtualization@lists.linux-foundation.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-04 13:47:33 -07:00
Linus Torvalds	251a152429	SCSI fixes on 20210804 Seven fixes, five in drivers. The two core changes are a trivial warning removal in scsi_scan.c and a change to rescan for capacity when a device makes a user induced (via a write to the state variable) offline->running transition to fix issues with device mapper. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Seven fixes, five in drivers. The two core changes are a trivial warning removal in scsi_scan.c and a change to rescan for capacity when a device makes a user induced (via a write to the state variable) offline->running transition to fix issues with device mapper" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: core: Fix capacity set to zero after offlinining device scsi: sr: Return correct event when media event code is 3 scsi: ibmvfc: Fix command state accounting and stale response detection scsi: core: Avoid printing an error if target_alloc() returns -ENXIO scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach() scsi: megaraid_mm: Fix end of loop tests for list_for_each_entry() scsi: pm80xx: Fix TMF task completion race condition	2021-08-04 12:41:30 -07:00
Linus Torvalds	0c2e31d2bd	gpio fixes for v5.14-rc5 - revert a patch introducing breakage in interrupt handling in gpio-mpc8xxx - correctly handle missing IRQs in gpio-tqmx86 by really making them optional Merge tag 'gpio-updates-for-v5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - revert a patch introducing breakage in interrupt handling in gpio-mpc8xxx - correctly handle missing IRQs in gpio-tqmx86 by really making them optional * tag 'gpio-updates-for-v5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: tqmx86: really make IRQ optional Revert "gpio: mpc8xxx: change the gpio interrupt flags."	2021-08-04 12:31:53 -07:00
Maxim Levitsky	13c2c3cfe0	KVM: selftests: fix hyperv_clock test The test was mistakenly using addr_gpa2hva on a gva and that happened to work accidentally. Commit `106a2e766e` ("KVM: selftests: Lower the min virtual address for misc page allocations") revealed this bug. Fixes: `2c7f76b4c4` ("selftests: kvm: Add basic Hyper-V clocksources tests", 2021-03-18) Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210804112057.409498-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-08-04 09:43:03 -04:00
Mingwei Zhang	bb2baeb214	KVM: SVM: improve the code readability for ASID management KVM SEV code uses bitmaps to manage ASID states. ASID 0 was always skipped because it is never used by a VM. Thus, in existing code, an ASID value and its bitmap position always have an 'offset-by-1' relationship. Both SEV and SEV-ES share the ASID space, thus KVM uses a dynamic range [min_asid, max_asid] to handle SEV and SEV-ES ASIDs separately. Existing code mixes the usage of an ASID value and its bitmap position by using the same variable called 'min_asid'. Fix the min_asid usage: ensure that its usage is consistent with its name; allocate extra size for ASID 0 to ensure that each ASID has the same value as its bitmap position. Add comments on ASID bitmap allocation to clarify the size change. Signed-off-by: Mingwei Zhang <mizhang@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Marc Orr <marcorr@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Alper Gun <alpergun@google.com> Cc: Dionna Glaze <dionnaglaze@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Vipin Sharma <vipinsh@google.com> Cc: Peter Gonda <pgonda@google.com> Cc: Joerg Roedel <joro@8bytes.org> Message-Id: <20210802180903.159381-1-mizhang@google.com> [Fix up sev_asid_free to also index by ASID, as suggested by Sean Christopherson, and use nr_asids in sev_cpu_init. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-08-04 09:43:03 -04:00
Nick Richardson	c2eecaa193	pktgen: Remove redundant clone_skb override When the netif_receive xmit_mode is set, a line is supposed to set clone_skb to a default 0 value. This line is made redundant due to a preceding line that checks if clone_skb is more than zero and returns -ENOTSUPP. Overriding clone_skb to 0 does not make any difference to the behavior because if it was positive we return error. So it can be either 0 or negative, and in both cases the behavior is the same. Remove redundant line that sets clone_skb to zero. Signed-off-by: Nick Richardson <richardsonnick@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:54:09 +01:00
Jonathan Lemon	773bda9649	ptp: ocp: Expose various resources on the timecard. The OpenCompute timecard driver has additional functionality besides a clock. Make the following resources available: - The external timestamp channels (ts0/ts1) - devlink support for flashing and health reporting - GPS and MAC serial ports - board serial number (obtained from i2c device) Also add watchdog functionality for when GNSS goes into holdover. The resources are collected under a timecard class directory: [jlemon@timecard ~]$ ls -g /sys/class/timecard/ocp1/ total 0 -r--r--r--. 1 root 4096 Aug 3 19:49 available_clock_sources -rw-r--r--. 1 root 4096 Aug 3 19:49 clock_source lrwxrwxrwx. 1 root 0 Aug 3 19:49 device -> ../../../0000:04:00.0/ -r--r--r--. 1 root 4096 Aug 3 19:49 gps_sync lrwxrwxrwx. 1 root 0 Aug 3 19:49 i2c -> ../../xiic-i2c.1024/i2c-2/ drwxr-xr-x. 2 root 0 Aug 3 19:49 power/ lrwxrwxrwx. 1 root 0 Aug 3 19:49 pps -> ../../../../../virtual/pps/pps1/ lrwxrwxrwx. 1 root 0 Aug 3 19:49 ptp -> ../../ptp/ptp2/ -r--r--r--. 1 root 4096 Aug 3 19:49 serialnum lrwxrwxrwx. 1 root 0 Aug 3 19:49 subsystem -> ../../../../../../class/timecard/ lrwxrwxrwx. 1 root 0 Aug 3 19:49 ttyGPS -> ../../tty/ttyS7/ lrwxrwxrwx. 1 root 0 Aug 3 19:49 ttyMAC -> ../../tty/ttyS8/ -rw-r--r--. 1 root 4096 Aug 3 19:39 uevent The labeling is needed at the minimum, in order to tell the serial devices apart. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:52:50 +01:00
Pavel Tikhomirov	04190bf894	sock: allow reading and changing sk_userlocks with setsockopt The SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable the automatic socket buffer adjustment done by the kernel (see tcp_fixup_rcvbuf() and tcp_sndbuf_expand()). If we've just created a new socket this adjustment is enabled on it, but if one changes the socket buffer size by setsockopt(SO_{SND,RCV}BUF) it becomes disabled. CRIU needs to call setsockopt(SO_{SND,RCV}BUF) on each socket on restore, as it first needs to increase buffer sizes for the packet queue restore and then needs to restore the original buffer sizes. So after CRIU restore all sockets become non-auto-adjustable, which can decrease network performance of restored applications significantly. CRIU needs to be able to restore sockets with enabled/disabled adjustment to the same state they were in before dump, so let's add a special setsockopt for it. Let's also export the SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so that using this interface one can re-enable automatic socket buffer adjustment on their sockets. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:52:03 +01:00
Ivan T. Ivanov	6b67d4d63e	net: usb: lan78xx: don't modify phy_device state concurrently Currently the phy_device state could be left inconsistent, as shown by the following alert message [1]. This is because phy_read_status could be called concurrently from lan78xx_delayedwork, phy_state_machine and __ethtool_get_link. Fix this by making sure that phy_device state is updated atomically. [1] lan78xx 1-1.1.1:1.0 eth0: No phy led trigger registered for speed(-1) Signed-off-by: Ivan T. Ivanov <iivanov@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:51:14 +01:00
Jakub Kicinski	396492b4c5	docs: networking: netdevsim rules There are aspects of netdevsim which are commonly misunderstood and pointed out in review. Cong suggested we document them. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:43:27 +01:00
Peilin Ye	625af9f029	tc-testing: Add control-plane selftests for sch_mq Recently we added multi-queue support to netdevsim in commit `d4861fc6be` ("netdevsim: Add multi-queue support"); add a few control-plane selftests for sch_mq using this new feature. Use nsPlugin.py to avoid network interface name collisions. Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:42:27 +01:00
Vladimir Oltean	a54182b2a5	Revert "net: build all switchdev drivers as modules when the bridge is a module" This reverts commit `b0e8181762`. Explicit driver dependency on the bridge is no longer needed since switchdev_bridge_port_{,un}offload() is no longer implemented by the bridge driver but by switchdev. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:35:07 +01:00
Vladimir Oltean	957e2235e5	net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge With the introduction of explicit offloading API in switchdev in commit `2f5dc00f7a` ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded"), we started having Ethernet switch drivers calling directly into a function exported by net/bridge/br_switchdev.c, which is a function exported by the bridge driver. This means that drivers that did not have an explicit dependency on the bridge before, like cpsw and am65-cpsw, now do - otherwise it is not possible to call a symbol exported by a driver that can be built as module unless you are a module too. There was an attempt to solve the dependency issue in the form of commit `b0e8181762` ("net: build all switchdev drivers as modules when the bridge is a module"). Grygorii Strashko, however, says about it: \| In my opinion, the problem is a bit bigger here than just fixing the \| build :( \| \| In case, of ^cpsw the switchdev mode is kinda optional and in many \| cases (especially for testing purposes, NFS) the multi-mac mode is \| still preferable mode. \| \| There were no such tight dependency between switchdev drivers and \| bridge core before and switchdev serviced as independent, notification \| based layer between them, so ^cpsw still can be "Y" and bridge can be \| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE \| will need to be set as "Y", or we will have to update drivers to \| support build with BRIDGE=n and maintain separate builds for \| networking vs non-networking testing. But is this enough? Wouldn't \| it cause 'chain reaction' required to add more and more "Y" options \| (like CONFIG_VLAN_8021Q)? \| \| PS. Just to be sure we on the same page - ARM builds will be forced \| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our \| automation testing will just fail with omap2plus_defconfig. 
In the light of this, it would be desirable for some configurations to avoid dependencies between switchdev drivers and the bridge, and have the switchdev mode as completely optional within the driver. Arnd Bergmann also tried to write a patch which better expressed the build time dependency for Ethernet switch drivers where the switchdev support is optional, like cpsw/am65-cpsw, and this made the drivers follow the bridge (compile as module if the bridge is a module) only if the optional switchdev support in the driver was enabled in the first place: https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/ but this still did not solve the fact that cpsw and am65-cpsw now must be built as modules when the bridge is a module - it just expressed correctly that optional dependency. But the new behavior is an apparent regression from Grygorii's perspective. So to support the use case where the Ethernet driver is built-in, NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we need a framework that can handle the possible absence of the bridge from the running system, i.e. runtime bloatware as opposed to build-time bloatware. Luckily we already have this framework, since switchdev has been using it extensively. Events from the bridge side are transmitted to the driver side using notifier chains - this was originally done so that unrelated drivers could snoop for events emitted by the bridge towards ports that are implemented by other drivers (think of a switch driver with LAG offload that listens for switchdev events on a bonding/team interface that it offloads). There are also events which are transmitted from the driver side to the bridge side, which again are modeled using notifiers. SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with notifying the bridge that a MAC address has been dynamically learned. So there is a precedent we can use for modeling the new framework. 
The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work that the bridge needs to do when a port becomes offloaded is blocking in its nature: replay VLANs, MDBs etc. The calling context is indeed blocking (we are under rtnl_mutex), but the existing switchdev notification chain that the bridge is subscribed to is only the atomic one. So we need to subscribe the bridge to the blocking switchdev notification chain too. This patch: - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload unchanged - moves the implementation of switchdev_bridge_port_{,un}offload from the bridge module into the switchdev module. - makes everybody that is subscribed to the switchdev blocking notifier chain "hear" offload & unoffload events - makes the bridge driver subscribe and handle those events - moves the bridge driver's handling of those events into 2 new functions called br_switchdev_port_{,un}offload. These functions contain in fact the core of the logic that was previously in switchdev_bridge_port_{,un}offload, just that now we go through an extra indirection layer to reach them. Unlike all the other switchdev notification structures, the structure used to carry the bridge port information, struct switchdev_notifier_brport_info, does not contain a "bool handled". This is because in the current usage pattern, we always know that a switchdev bridge port offloading event will be handled by the bridge, because the switchdev_bridge_port_offload() call was initiated by a NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event couldn't have happened. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:35:07 +01:00
David S. Miller	9c0532f9cc	linux-can-next-for-5.15-20210804 Merge tag 'linux-can-next-for-5.15-20210804' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2021-08-04 this is a pull request of 5 patches for net-next/master. The first patch is by me and fixes a typo in a comment in the CAN J1939 protocol. The next 2 patches are by Oleksij Rempel and update the CAN J1939 protocol to send RX status updates via the error queue mechanism. The next patch is by me and adds a missing variable initialization to the flexcan driver (the problem was introduced in the current net-next cycle). The last patch is by Aswath Govindraju and adds power-domains to the Bosch m_can DT binding documentation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 11:30:09 +01:00
Aswath Govindraju	d85165b238	dt-bindings: net: can: Document power-domains property Document power-domains property for adding the Power domain provider. Link: https://lore.kernel.org/r/20210802091822.16407-1-a-govindraju@ti.com Signed-off-by: Aswath Govindraju <a-govindraju@ti.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-08-04 12:11:57 +02:00
Marc Kleine-Budde	3362666972	can: flexcan: flexcan_clks_enable(): add missing variable initialization This patch adds the missing initialization of the "err" variable in the flexcan_clks_enable() function. Fixes: `d9cead75b1` ("can: flexcan: add mcf5441x support") Link: https://lore.kernel.org/r/20210728075428.1493568-1-mkl@pengutronix.de Reported-by: kernel test robot <lkp@intel.com> Cc: Angelo Dureghello <angelo@kernel-space.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-08-04 12:11:56 +02:00
Oleksij Rempel	5b9272e93f	can: j1939: extend UAPI to notify about RX status To be able to create applications with user friendly feedback, we need to be able to provide receive status information. A typical ETP transfer may take seconds or even hours. To give the user some clue or show a progress bar, the stack should push status updates. Same as for the TX information, the socket error queue will be used with the following new signals: - J1939_EE_INFO_RX_RTS - received and accepted request to send signal. - J1939_EE_INFO_RX_DPO - received data package offset signal - J1939_EE_INFO_RX_ABORT - RX session was aborted Instead of a completion signal, the user will get the data package. To activate these signals, the application should set SOF_TIMESTAMPING_RX_SOFTWARE in the SO_TIMESTAMPING socket option. This will avoid unpredictable application behavior for old software. Link: https://lore.kernel.org/r/20210707094854.30781-3-o.rempel@pengutronix.de Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-08-04 12:11:52 +02:00
Oleksij Rempel	cd85d3aed5	can: j1939: rename J1939_ERRQUEUE_* to J1939_ERRQUEUE_TX_* Prepare the world for the J1939_ERRQUEUE_RX_ version Link: https://lore.kernel.org/r/20210707094854.30781-2-o.rempel@pengutronix.de Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-08-04 12:11:48 +02:00
Sean Christopherson	179c6c27bf	KVM: SVM: Fix off-by-one indexing when nullifying last used SEV VMCB Use the raw ASID, not ASID-1, when nullifying the last used VMCB when freeing an SEV ASID. The consumer, pre_sev_run(), indexes the array by the raw ASID, thus KVM could get a false negative when checking for a different VMCB if KVM manages to reallocate the same ASID+VMCB combo for a new VM. Note, this cannot cause a functional issue _in the current code_, as pre_sev_run() also checks which pCPU last did VMRUN for the vCPU, and last_vmentry_cpu is initialized to -1 during vCPU creation, i.e. is guaranteed to mismatch on the first VMRUN. However, prior to commit `8a14fe4f0c` ("kvm: x86: Move last_cpu into kvm_vcpu_arch as last_vmentry_cpu"), SVM tracked pCPU on its own and zero-initialized the last_cpu variable. Thus it's theoretically possible that older versions of KVM could miss a TLB flush if the first VMRUN is on pCPU0 and the ASID and VMCB exactly match those of a prior VM. Fixes: `70cd94e60c` ("KVM: SVM: VMRUN should use associated ASID when SEV is enabled") Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-08-04 06:02:09 -04:00
Paolo Bonzini	85cd39af14	KVM: Do not leak memory for duplicate debugfs directories KVM creates a debugfs directory for each VM in order to store statistics about the virtual machine. The directory name is built from the process pid and a VM fd. While generally unique, it is possible to keep a file descriptor alive in a way that causes duplicate directories, which manifests as these messages: [ 471.846235] debugfs: Directory '20245-4' with parent 'kvm' already present! Even though this should not happen in practice, it is more or less expected in the case of KVM for testcases that call KVM_CREATE_VM and close the resulting file descriptor repeatedly and in parallel. When this happens, debugfs_create_dir() returns an error but kvm_create_vm_debugfs() goes on to allocate stat data structs which are later leaked. The slow memory leak was spotted by syzkaller, where it caused OOM reports. Since the issue only affects debugfs, do a lookup before calling debugfs_create_dir, so that the message is downgraded and rate-limited. While at it, ensure kvm->debugfs_dentry is NULL rather than an error if it is not created. This fixes kvm_destroy_vm_debugfs, which was not checking IS_ERR_OR_NULL correctly. Cc: stable@vger.kernel.org Fixes: `536a6f88c4` ("KVM: Create debugfs dir and stat files for each VM") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-08-04 06:02:03 -04:00
David S. Miller	d00551b402	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2021-08-04 1) Fix a syzbot reported memory leak in xfrm_user_rcv_msg. From Pavel Skripkin. 2) Revert "xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype". This commit tried to fix a locking bug, but only cured some of the symptoms. A proper fix is applied on top of this revert. 3) Fix a locking bug on xfrm state hash resize. A recent change on sequence counters accidentally replaced a spinlock by a mutex. Fix from Frederic Weisbecker. 4) Fix possible user-memory-access in xfrm_user_rcv_msg_compat(). From Dmitry Safonov. 5) Add initialization selftest for xfrm_spdattr_type_t. From Dmitry Safonov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 10:45:41 +01:00
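
The scripts/tracing entry above (1c0cec64a7) contrasts two function-trace line formats: the older one without the irq/preempt flags column and the current one with it. As a rough illustration of what accepting both formats could look like, here is a minimal Python sketch. The regular expression and the `parse_trace_line` helper are hypothetical and written only for this example; they are not the pattern used in draw_functrace.py or in the actual patch.

```python
import re

# Hypothetical pattern (not taken from draw_functrace.py): the flags column
# such as "d.h." is optional, so both the pre-v3.2 format and the newer
# format quoted in the commit message are accepted.
TRACE_LINE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+"       # TASK-PID, e.g. "<idle>-0"
    r"\[(?P<cpu>\d+)\]\s+"                    # CPU number, e.g. "[009]"
    r"(?:(?P<flags>[^\s:]+)\s+)?"             # optional flags, e.g. "d.h."
    r"(?P<time>\d+\.\d+):\s+"                 # timestamp followed by ':'
    r"(?P<callee>\S+)\s+<-(?P<caller>\S+)\s*$"
)


def parse_trace_line(line):
    """Return (timestamp, callee, caller) as strings, or None if no match."""
    match = TRACE_LINE.match(line)
    if match is None:
        return None  # header/comment lines such as "# tracer: nop"
    return match.group("time"), match.group("callee"), match.group("caller")


if __name__ == "__main__":
    samples = [
        "<idle>-0 [009] d.h. 22420.068695: _raw_spin_lock_irqsave <-hrtimer_interrupt",
        "migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule",
        "# tracer: nop",
    ]
    for sample in samples:
        print(parse_trace_line(sample))
```

Making the flags group optional keeps a single code path for both formats; a script that only ever sees the newer format could require the column instead.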

1031477 Commits
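
The KVM: SVM ASID entry in the log above (bb2baeb214) describes allocating one extra bitmap slot for ASID 0 so that each ASID equals its bitmap position. The idea can be sketched in user-space Python as follows. This is an illustration only, not the kernel code: the `AsidBitmap` class and its method names are invented for the example, and only `min_asid`, `max_asid` and `nr_asids` echo names mentioned in the commit message.

```python
class AsidBitmap:
    """Toy allocator where bit position == ASID value (slot 0 is never handed out)."""

    def __init__(self, max_asid):
        # One extra slot for ASID 0, so no "asid - 1" translation is needed.
        self.nr_asids = max_asid + 1
        self.used = [False] * self.nr_asids

    def alloc(self, min_asid, max_asid):
        """Return the first free ASID in [min_asid, max_asid], or None."""
        for asid in range(min_asid, max_asid + 1):
            if not self.used[asid]:
                self.used[asid] = True
                return asid
        return None

    def free(self, asid):
        # Index by the raw ASID, matching how it was allocated.
        self.used[asid] = False


if __name__ == "__main__":
    bitmap = AsidBitmap(max_asid=4)
    first = bitmap.alloc(1, 2)   # allocate from a lower sub-range
    second = bitmap.alloc(3, 4)  # allocate from an upper sub-range
    print(first, second)
    bitmap.free(first)
```

Spending one unused slot on ASID 0 trades a little memory for the guarantee that allocation, lookup and free all index with the same number, which is the consistency the commit message asks for.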