linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-17 09:14:19 +08:00

Author	SHA1	Message	Date
Florian Westphal	fc518953bc	mptcp: add and use MIB counter infrastructure Exported via same /proc file as the Linux TCP MIB counters, so "netstat -s" or "nstat" will show them automatically. The MPTCP MIB counters are allocated in a distinct pcpu area in order to avoid bloating/wasting TCP pcpu memory. Counters are allocated once the first MPTCP socket is created in a network namespace and free'd on exit. If no sockets have been allocated, all-zero mptcp counters are shown. The MIB counter list is taken from the multipath-tcp.org kernel, but only a few counters have been picked up so far. The counter list can be increased at any time later on. v2 -> v3: - remove 'inline' in foo.c files (David S. Miller) Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-29 22:14:49 -07:00
Peter Krystad	ec3edaa7ca	mptcp: Add handling of outgoing MP_JOIN requests Subflow creation may be initiated by the path manager when the primary connection is fully established and a remote address has been received via ADD_ADDR. Create an in-kernel sock and use kernel_connect() to initiate connection. Passive sockets can't acquire the mptcp socket lock at subflow creation time, so an additional list protected by a new spinlock is used to track the MPJ subflows. Such list is spliced into conn_list tail every time the msk socket lock is acquired, so that it will not interfere with data flow on the original connection. Data flow and connection failover not addressed by this commit. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-29 22:14:48 -07:00
Peter Krystad	f296234c98	mptcp: Add handling of incoming MP_JOIN requests Process the MP_JOIN option in a SYN packet with the same flow as MP_CAPABLE but when the third ACK is received add the subflow to the MPTCP socket subflow list instead of adding it to the TCP socket accept queue. The subflow is added at the end of the subflow list so it will not interfere with the existing subflows operation and no data is expected to be transmitted on it. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-29 22:14:48 -07:00
Peter Krystad	3df523ab58	mptcp: Add ADD_ADDR handling Add handling for sending and receiving the ADD_ADDR, ADD_ADDR6, and RM_ADDR suboptions. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-29 22:14:48 -07:00
Vladimir Oltean	bff33f7e2a	net: dsa: implement auto-normalization of MTU for bridge hardware datapath Many switches don't have an explicit knob for configuring the MTU (maximum transmission unit per interface). Instead, they do the length-based packet admission checks on the ingress interface, for reasons that are easy to understand (why would you accept a packet in the queuing subsystem if you know you're going to drop it anyway). So it is actually the MRU that these switches permit configuring. In Linux there only exists the IFLA_MTU netlink attribute and the associated dev_set_mtu function. The comments like to play blind and say that it's changing the "maximum transfer unit", which is to say that there isn't any directionality in the meaning of the MTU word. So that is the interpretation that this patch is giving to things: MTU == MRU. When 2 interfaces having different MTUs are bridged, the bridge driver MTU auto-adjustment logic kicks in: what br_mtu_auto_adjust() does is it adjusts the MTU of the bridge net device itself (and not that of the slave net devices) to the minimum value of all slave interfaces, in order for forwarded packets to not exceed the MTU regardless of the interface they are received and send on. The idea behind this behavior, and why the slave MTUs are not adjusted, is that normal termination from Linux over the L2 forwarding domain should happen over the bridge net device, which _is_ properly limited by the minimum MTU. And termination over individual slave devices is possible even if those are bridged. But that is not "forwarding", so there's no reason to do normalization there, since only a single interface sees that packet. The problem with those switches that can only control the MRU is with the offloaded data path, where a packet received on an interface with MRU 9000 would still be forwarded to an interface with MRU 1500. And the br_mtu_auto_adjust() function does not really help, since the MTU configured on the bridge net device is ignored. In order to enforce the de-facto MTU == MRU rule for these switches, we need to do MTU normalization, which means: in order for no packet larger than the MTU configured on this port to be sent, then we need to limit the MRU on all ports that this packet could possibly come from. AKA since we are configuring the MRU via MTU, it means that all ports within a bridge forwarding domain should have the same MTU. And that is exactly what this patch is trying to do. >From an implementation perspective, we try to follow the intent of the user, otherwise there is a risk that we might livelock them (they try to change the MTU on an already-bridged interface, but we just keep changing it back in an attempt to keep the MTU normalized). So the MTU that the bridge is normalized to is either: - The most recently changed one: ip link set dev swp0 master br0 ip link set dev swp1 master br0 ip link set dev swp0 mtu 1400 This sequence will make swp1 inherit MTU 1400 from swp0. - The one of the most recently added interface to the bridge: ip link set dev swp0 master br0 ip link set dev swp1 mtu 1400 ip link set dev swp1 master br0 The above sequence will make swp0 inherit MTU 1400 as well. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-27 16:07:25 -07:00
Vladimir Oltean	bfcb813203	net: dsa: configure the MTU for switch ports It is useful be able to configure port policers on a switch to accept frames of various sizes: - Increase the MTU for better throughput from the default of 1500 if it is known that there is no 10/100 Mbps device in the network. - Decrease the MTU to limit the latency of high-priority frames under congestion, or work around various network segments that add extra headers to packets which can't be fragmented. For DSA slave ports, this is mostly a pass-through callback, called through the regular ndo ops and at probe time (to ensure consistency across all supported switches). The CPU port is called with an MTU equal to the largest configured MTU of the slave ports. The assumption is that the user might want to sustain a bidirectional conversation with a partner over any switch port. The DSA master is configured the same as the CPU port, plus the tagger overhead. Since the MTU is by definition L2 payload (sans Ethernet header), it is up to each individual driver to figure out if it needs to do anything special for its frame tags on the CPU port (it shouldn't except in special cases). So the MTU does not contain the tagger overhead on the CPU port. However the MTU of the DSA master, minus the tagger overhead, is used as a proxy for the MTU of the CPU port, which does not have a net device. This is to avoid uselessly calling the .change_mtu function on the CPU port when nothing should change. So it is safe to assume that the DSA master and the CPU port MTUs are apart by exactly the tagger's overhead in bytes. Some changes were made around dsa_master_set_mtu(), function which was now removed, for 2 reasons: - dev_set_mtu() already calls dev_validate_mtu(), so it's redundant to do the same thing in DSA - __dev_set_mtu() returns 0 if ops->ndo_change_mtu is an absent method That is to say, there's no need for this function in DSA, we can safely call dev_set_mtu() directly, take the rtnl lock when necessary, and just propagate whatever errors get reported (since the user probably wants to be informed). Some inspiration (mainly in the MTU DSA notifier) was taken from a vaguely similar patch from Murali and Florian, who are credited as co-developers down below. Co-developed-by: Murali Krishna Policharla <murali.policharla@broadcom.com> Signed-off-by: Murali Krishna Policharla <murali.policharla@broadcom.com> Co-developed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-27 16:07:24 -07:00
Vasundhara Volam	2d9eade8f2	devlink: Add macro for "fw.mgmt.api" to info_get cb. Add definition and documentation for the new generic info "fw.mgmt.api". This macro specifies the version of the software interfaces between driver and firmware. Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-27 15:34:42 -07:00
Dmitry Bogdanov	b62c362450	net: macsec: add support for getting offloaded stats When HW offloading is enabled, offloaded stats should be used, because s/w stats are wrong and out of sync with the HW in this case. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 20:17:36 -07:00
Antoine Tenart	8fa9137180	net: macsec: allow to reference a netdev from a MACsec context This patch allows to reference a net_device from a MACsec context. This is needed to allow implementing MACsec operations in net device drivers. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 20:17:36 -07:00
Maciej Żenczykowski	c24a77edc9	ipv6: ndisc: add support for 'PREF64' dns64 prefix identifier This is trivial since we already have support for the entirely identical (from the kernel's point of view) RDNSS, DNSSL, etc. that also contain opaque data that needs to be passed down to userspace for further processing. As specified in draft-ietf-6man-ra-pref64-09 (while it is still a draft, it is purely waiting on the RFC Editor for cleanups and publishing): PREF64 option contains lifetime and a (up to) 96-bit IPv6 prefix. The 8-bit identifier of the option type as assigned by the IANA is 38. Since we lack DNS64/NAT64/CLAT support in kernel at the moment, thus this option should also be passed on to userland. See: https://tools.ietf.org/html/draft-ietf-6man-ra-pref64-09 https://www.iana.org/assignments/icmpv6-parameters/icmpv6-parameters.xhtml#icmpv6-parameters-5 Cc: Erik Kline <ek@google.com> Cc: Jen Linkova <furry@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Michael Haro <mharo@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 20:05:58 -07:00
Guillaume Nault	e4a58ef3ce	net: sched: refine extack messages in tcf_change_indev Add an error message when device wasn't found. While there, also set the bad attribute's offset in extack. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:52:31 -07:00
Jacob Keller	b9a17abfde	devlink: implement DEVLINK_CMD_REGION_NEW Implement support for the DEVLINK_CMD_REGION_NEW command for creating snapshots. This new command parallels the existing DEVLINK_CMD_REGION_DEL. In order for DEVLINK_CMD_REGION_NEW to work for a region, the new ".snapshot" operation must be implemented in the region's ops structure. The desired snapshot id must be provided. This helps avoid confusion on the purpose of DEVLINK_CMD_REGION_NEW, and keeps the API simpler. The requested id will be inserted into the xarray tracking the number of snapshots using each id. If this id is already used by another snapshot on any region, an error will be returned. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Jacob Keller	12102436ac	devlink: track snapshot id usage count using an xarray Each snapshot created for a devlink region must have an id. These ids are supposed to be unique per "event" that caused the snapshot to be created. Drivers call devlink_region_snapshot_id_get to obtain a new id to use for a new event trigger. The id values are tracked per devlink, so that the same id number can be used if a triggering event creates multiple snapshots on different regions. There is no mechanism for snapshot ids to ever be reused. Introduce an xarray to store the count of how many snapshots are using a given id, replacing the snapshot_id field previously used for picking the next id. The devlink_region_snapshot_id_get() function will use xa_alloc to insert an initial value of 1 value at an available slot between 0 and U32_MAX. The new __devlink_snapshot_id_increment() and __devlink_snapshot_id_decrement() functions will be used to track how many snapshots currently use an id. Drivers must now call devlink_snapshot_id_put() in order to release their reference of the snapshot id after adding region snapshots. By tracking the total number of snapshots using a given id, it is possible for the decrement() function to erase the id from the xarray when it is not in use. With this method, a snapshot id can become reused again once all snapshots that referred to it have been deleted via DEVLINK_CMD_REGION_DEL, and the driver has finished adding snapshots. This work also paves the way to introduce a mechanism for userspace to request a snapshot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Jacob Keller	7ef19d3b1d	devlink: report error once U32_MAX snapshot ids have been used The devlink_snapshot_id_get() function returns a snapshot id. The snapshot id is a u32, so there is no way to indicate an error code. A future change is going to possibly add additional cases where this function could fail. Refactor the function to return the snapshot id in an argument, so that it can return zero or an error value. This ensures that snapshot ids cannot be confused with error values, and aids in the future refactor of snapshot id allocation management. Because there is no current way to release previously used snapshot ids, add a simple check ensuring that an error is reported in case the snapshot_id would over flow. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Jacob Keller	a0a09f6bb2	devlink: convert snapshot destructor callback to region op It does not makes sense that two snapshots for a given region would use different destructors. Simplify snapshot creation by adding a .destructor op for regions. This operation will replace the data_destructor for the snapshot creation, and makes snapshot creation easier. Noticed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Jacob Keller	e893768179	devlink: prepare to support region operations Modify the devlink region code in preparation for adding new operations on regions. Create a devlink_region_ops structure, and move the name pointer from within the devlink_region structure into the ops structure (similar to the devlink_health_reporter_ops). This prepares the regions to enable support of additional operations in the future such as requesting snapshots, or accessing the region directly without a snapshot. In order to re-use the constant strings in the mlx4 driver their declaration must be changed to 'const char * const' to ensure the compiler realizes that both the data and the pointer cannot change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 19:39:26 -07:00
Petr Machata	1f40be6a34	net: flow_offload.h: Fix a comment at flow_action_entry.mangle This field references FLOW_ACTION_PACKET_EDIT. Such action does not exist though. Instead the field is used for FLOW_ACTION_MANGLE and _ADD. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-26 11:55:40 -07:00
David S. Miller	9fb16955fb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Overlapping header include additions in macsec.c A bug fix in 'net' overlapping with the removal of 'version' string in ena_netdev.c Overlapping test additions in selftests Makefile Overlapping PCI ID table adjustments in iwlwifi driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 18:58:11 -07:00
Pablo Neira Ayuso	2c64605b59	net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’: net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’ pkt->skb->tc_redirected = 1; ^~ net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’ pkt->skb->tc_from_ingress = 1; ^~ To avoid a direct dependency with tc actions from netfilter, wrap the redirect bits around CONFIG_NET_REDIRECT and move helpers to include/linux/skbuff.h. Turn on this toggle from the ifb driver, the only existing client of these bits in the tree. This patch adds skb_set_redirected() that sets on the redirected bit on the skbuff, it specifies if the packet was redirect from ingress and resets the timestamp (timestamp reset was originally missing in the netfilter bugfix). Fixes: `bcfabee1af` ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress") Reported-by: noreply@ellerman.id.au Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 12:24:33 -07:00
David Laight	af13b3c338	Remove DST_HOST Previous changes to the IP routing code have removed all the tests for the DS_HOST route flag. Remove the flags and all the code that sets it. Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 21:57:44 -07:00
Ido Schimmel	107f167894	devlink: Only pass packet trap group identifier in trap structure Packet trap groups are now explicitly registered by drivers and not implicitly registered when the packet traps are registered. Therefore, there is no need to encode entire group structure the trap is associated with inside the trap structure. Instead, only pass the group identifier. Refer to it as initial group identifier, as future patches will allow user space to move traps between groups. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 21:40:40 -07:00
Ido Schimmel	95ad9555b5	devlink: Add API to register packet trap groups Currently, packet trap groups are implicitly registered by drivers upon packet trap registration. When the traps are registered, each is associated with a group and the group is created by devlink, if it does not exist already. This makes it difficult for drivers to pass additional attributes for the groups. Therefore, as a preparation for future patches that require passing additional group attributes, add an API to explicitly register / unregister these groups. Next patches will convert existing drivers to use this API. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 21:40:40 -07:00
David S. Miller	adbea1a5f5	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-03-21 Implement basic support for the devlink interface in the ice driver. Additionally pave some necessary changes for adding a devlink region that exposes the NVM contents. This series first contains 5 patches for enabling and implementing full NVM read access via the ETHTOOL_GEEPROM interface. This includes some cleanup of endian-types, a new function for reading from the NVM and Shadow RAM as a flat addressable space, a function to calculate the available flash size during load, and a change to how some of the NVM version fields are stored in the ice_nvm_info structure. Following this is 3 patches for implementing devlink support. First, one patch which implements the basic framework and introduces the ice_devlink.c file. Second, a patch to implement basic .info_get support. Finally, a patch which reads the device PBA identifier and reports it as the `board.id` value in the .info_get response. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 21:21:33 -07:00
Jakub Kicinski	0dfb2d82af	net: sched: rename more stats_types Commit `53eca1f347` ("net: rename flow_action_hw_stats_types* -> flow_action_hw_stats*") renamed just the flow action types and helpers. For consistency rename variables, enums, struct members and UAPI too (note that this UAPI was not in any official release, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 20:54:23 -07:00
Jacob Keller	c90977a3c2	devlink: promote "fw.bundle_id" to a generic info version The nfp driver uses ``fw.bundle_id`` to represent a unique identifier of the entire firmware bundle. A future change is going to introduce a similar notion in the ice driver, so promote ``fw.bundle_id`` into a generic version now. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-03-21 00:57:16 -07:00
David S. Miller	0d7043f355	Another set of changes: * HE ranging (fine timing measurement) API support * hwsim gets virtio support, for use with wmediumd, to be able to simulate with multiple machines * eapol-over-nl80211 improvements to exclude preauth * IBSS reset support, to recover connections from userspace * and various others. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl50yM8ACgkQB8qZga/f l8Qe0w//Rt5vsqlkiuZtg1SHQcMHq5Y6jJ92Xfg6Yw8H3H1g40r6UWDeSsF7Dl9H YuC4VvfNH/FhnNz01FDVxjgjnUyjpAolxM6sYj3hW3ZIBahSBeovAdjTP59y1Fhi zLMP/zYLsqtlKO5qr0fK7tUmENiF/IxyALRcxqsmQ1ul5003pIpFRVPoQVDTjHZn JCzDTboQLmtyYiDYjejaeauo4Awjj4lDiyk3Fu+kXSv8weY4dl+w8/AM5zsiVNDs 4HRDpJUBIwfwWuygMNwl0EQVhnpYaNQFQ+IEe9B29qiAEb95MPHhSudU1qaT+4d7 CykikhH39/n4Wv1PBg4PrCfWInxAUiPz6K027lDOY5VtJFFbLg8GqQkn3Vy/WnHm CJuVaLdJ73k7AypvWdZUgSe9SaloymVd987IIMixmz0t0cqIE3kPYot//kXFjy6m rqvWJ3a7SIopPdkD4xXDwzASUmctCUedQ0QVnWYRQCUW5rnkZ96SHjblUA9IykYS Ia44BwB8ggZsax1KrqyGV9hgj8mh9srLmEjtRPhxVD3ncE1ewiQaBRwv56J0EKNu yRnV10dNMCtUyT3j9dRpHa8zuNpS/A6ShXe4aGgoXAz+ShjNOBj2GxXtolsDJzrd zIqZH8gcWaeV+8yWhaGx8d82gvY8PhRvJl+AMax7Ds2avOZ/pa0= =t3QF -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-net-next-2020-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Another set of changes: * HE ranging (fine timing measurement) API support * hwsim gets virtio support, for use with wmediumd, to be able to simulate with multiple machines * eapol-over-nl80211 improvements to exclude preauth * IBSS reset support, to recover connections from userspace * and various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-20 08:57:38 -07:00
Veerendranath Jakkam	7fc82af856	cfg80211: Configure PMK lifetime and reauth threshold for PMKSA entries Drivers that trigger roaming need to know the lifetime of the configured PMKSA for deciding whether to trigger the full or PMKSA cache based authentication. The configured PMKSA is invalid after the PMK lifetime has expired and must not be used after that and the STA needs to disassociate if the PMK expires. Hence the STA is expected to refresh the PMK with a full authentication before this happens (e.g., when reassociating to a new BSS the next time or by performing EAPOL reauthentication depending on the AKM) to avoid unnecessary disconnection. The PMK reauthentication threshold is the percentage of the PMK lifetime value and indicates to the driver to trigger a full authentication roam (without PMKSA caching) after the reauthentication threshold time, but before the PMK timer has expired. Authentication methods like SAE need to be able to generate a new PMKSA entry without having to force a disconnection after this threshold timeout. If no roaming occurs between the reauthentication threshold time and PMK lifetime expiration, disassociation is still forced. The new attributes for providing these values correspond to the dot11 MIB variables dot11RSNAConfigPMKLifetime and dot11RSNAConfigPMKReauthThreshold. This type of functionality is already available in cases where user space component is in control of roaming. This commit extends that same capability into cases where parts or all of this functionality is offloaded to the driver. Signed-off-by: Veerendranath Jakkam <vjakkam@codeaurora.org> Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Link: https://lore.kernel.org/r/20200312235903.18462-1-jouni@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:20 +01:00
Shaul Triebitz	7e8d6f12bb	nl80211: pass HE operation element to the driver Pass the AP's HE operation element to the driver. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20200131111300.891737-18-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:19 +01:00
Avraham Stern	efb5520d0e	nl80211/cfg80211: add support for non EDCA based ranging measurement Add support for requesting that the ranging measurement will use the trigger-based / non trigger-based flow instead of the EDCA based flow. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20200131111300.891737-2-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:19 +01:00
Qiujun Huang	07e9733886	mac80211: update documentation about tx power The structure member added at some point, but the kernel-doc was not updated. Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Link: https://lore.kernel.org/r/20200312144424.3023-1-hqjagain@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:19 +01:00
Markus Theil	7f3f96cedd	mac80211: handle no-preauth flag for control port This patch adds support for disabling pre-auth rx over the nl80211 control port for mac80211. Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20200312091055.54257-3-markus.theil@tu-ilmenau.de [fix indentation slightly, squash feature enablement] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:19 +01:00
Johannes Berg	1f7e9f46c2	cfg80211: fix documentation format Kernel-doc complains if the line isn't prefixed with an asterisk, fix that. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200320144110.2786ad5fb234.I369d103d11c71e39e3a3f97ed68a528c5b875f1e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-03-20 14:42:12 +01:00
David S. Miller	43861da75e	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2020-03-19 Here's the main bluetooth-next pull request for the 5.7 kernel. - Added wideband speech support to mgmt and the ability for HCI drivers to declare support for it. - Added initial support for L2CAP Enhanced Credit Based Mode - Fixed suspend handling for several use cases - Fixed Extended Advertising related issues - Added support for Realtek 8822CE device - Added DT bindings for QTI chip WCN3991 - Cleanups to replace zero-length arrays with flexible-array members - Several other smaller cleanups & fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-19 21:33:27 -07:00
Petr Machata	2ce124109c	net: tc_skbedit: Make the skbedit priority offloadable The skbedit action "priority" is used for adjusting SKB priority. Allow drivers to offload the action by introducing two new skbedit getters and a new flow action, and initializing appropriately in tc_setup_flow_action(). Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-19 21:09:19 -07:00
Petr Machata	fe93f0b225	net: tc_skbedit: Factor a helper out of is_tcf_skbedit_{mark, ptype}() The two functions is_tcf_skbedit_mark() and is_tcf_skbedit_ptype() have a very similar structure. A follow-up patch will add one more such function. Instead of more cut'n'pasting, extract a helper function that checks whether a TC action is an skbedit with the required flag. Convert the two existing functions into thin wrappers around the helper. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-19 21:09:19 -07:00
Ido Schimmel	3ebaf6da07	net: sched: Do not assume RTNL is held in tunnel key action helpers The cited commit removed RTNL from tc_setup_flow_action(), but the function calls two tunnel key action helpers that use rtnl_dereference() to fetch the action's parameters. This leads to "suspicious RCU usage" warnings [1][2]. Change the helpers to use rcu_dereference_protected() while requiring the action's lock to be held. This is safe because the two helpers are only called from tc_setup_flow_action() which acquires the lock. [1] [ 156.950855] ============================= [ 156.955463] WARNING: suspicious RCU usage [ 156.960085] 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 Not tainted [ 156.967116] ----------------------------- [ 156.971728] include/net/tc_act/tc_tunnel_key.h:31 suspicious rcu_dereference_protected() usage! [ 156.981583] [ 156.981583] other info that might help us debug this: [ 156.981583] [ 156.990675] [ 156.990675] rcu_scheduler_active = 2, debug_locks = 1 [ 156.998205] 1 lock held by tc/877: [ 157.002187] #0: ffff8881cbf7bea0 (&(&p->tcfa_lock)->rlock){+...}, at: tc_setup_flow_action+0xbe/0x4f78 [ 157.012866] [ 157.012866] stack backtrace: [ 157.017886] CPU: 2 PID: 877 Comm: tc Not tainted 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 [ 157.027253] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ 157.037389] Call Trace: [ 157.040170] dump_stack+0xfd/0x178 [ 157.044034] lockdep_rcu_suspicious+0x14a/0x153 [ 157.049157] tc_setup_flow_action+0x89f/0x4f78 [ 157.054227] fl_hw_replace_filter+0x375/0x640 [ 157.064348] fl_change+0x28ec/0x4f6b [ 157.088843] tc_new_tfilter+0x15e2/0x2260 [ 157.176801] rtnetlink_rcv_msg+0x8d6/0xb60 [ 157.190915] netlink_rcv_skb+0x177/0x460 [ 157.208884] rtnetlink_rcv+0x21/0x30 [ 157.212925] netlink_unicast+0x5d0/0x7f0 [ 157.227728] netlink_sendmsg+0x981/0xe90 [ 157.245416] ____sys_sendmsg+0x76d/0x8f0 [ 157.255348] ___sys_sendmsg+0x10f/0x190 [ 157.320308] __sys_sendmsg+0x115/0x1f0 [ 157.342553] __x64_sys_sendmsg+0x7d/0xc0 [ 157.346987] do_syscall_64+0xc1/0x600 [ 157.351142] entry_SYSCALL_64_after_hwframe+0x49/0xbe [2] [ 157.432346] ============================= [ 157.436937] WARNING: suspicious RCU usage [ 157.441537] 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 Not tainted [ 157.448559] ----------------------------- [ 157.453204] include/net/tc_act/tc_tunnel_key.h:43 suspicious rcu_dereference_protected() usage! [ 157.463042] [ 157.463042] other info that might help us debug this: [ 157.463042] [ 157.472112] [ 157.472112] rcu_scheduler_active = 2, debug_locks = 1 [ 157.479529] 1 lock held by tc/877: [ 157.483442] #0: ffff8881cbf7bea0 (&(&p->tcfa_lock)->rlock){+...}, at: tc_setup_flow_action+0xbe/0x4f78 [ 157.494119] [ 157.494119] stack backtrace: [ 157.499114] CPU: 2 PID: 877 Comm: tc Not tainted 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 [ 157.508485] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ 157.518628] Call Trace: [ 157.521416] dump_stack+0xfd/0x178 [ 157.525293] lockdep_rcu_suspicious+0x14a/0x153 [ 157.530425] tc_setup_flow_action+0x993/0x4f78 [ 157.535505] fl_hw_replace_filter+0x375/0x640 [ 157.545650] fl_change+0x28ec/0x4f6b [ 157.570204] tc_new_tfilter+0x15e2/0x2260 [ 157.658199] rtnetlink_rcv_msg+0x8d6/0xb60 [ 157.672315] netlink_rcv_skb+0x177/0x460 [ 157.690278] rtnetlink_rcv+0x21/0x30 [ 157.694320] netlink_unicast+0x5d0/0x7f0 [ 157.709129] netlink_sendmsg+0x981/0xe90 [ 157.726813] ____sys_sendmsg+0x76d/0x8f0 [ 157.736725] ___sys_sendmsg+0x10f/0x190 [ 157.801721] __sys_sendmsg+0x115/0x1f0 [ 157.823967] __x64_sys_sendmsg+0x7d/0xc0 [ 157.828403] do_syscall_64+0xc1/0x600 [ 157.832558] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `b15e7a6e8d` ("net: sched: don't take rtnl lock during flow_action setup") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-19 20:25:52 -07:00
David S. Miller	a58741ef1e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Use nf_flow_offload_tuple() to fetch flow stats, from Paul Blakey. 2) Add new xt_IDLETIMER hard mode, from Manoj Basapathi. Follow up patch to clean up this new mode, from Dan Carpenter. 3) Add support for geneve tunnel options, from Xin Long. 4) Make sets built-in and remove modular infrastructure for sets, from Florian Westphal. 5) Remove unused TEMPLATE_NULLS_VAL, from Li RongQing. 6) Statify nft_pipapo_get, from Chen Wandun. 7) Use C99 flexible-array member, from Gustavo A. R. Silva. 8) More descriptive variable names for bitwise, from Jeremy Sowden. 9) Four patches to add tunnel device hardware offload to the flowtable infrastructure, from wenxu. 10) pipapo set supports for 8-bit grouping, from Stefano Brivio. 11) pipapo can switch between nibble and byte grouping, also from Stefano. 12) Add AVX2 vectorized version of pipapo, from Stefano Brivio. 13) Update pipapo to be use it for single ranges, from Stefano. 14) Add stateful expression support to elements via control plane, eg. counter per element. 15) Re-visit sysctls in unprivileged namespaces, from Florian Westphal. 15) Add new egress hook, from Lukas Wunner. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-17 23:51:31 -07:00
Eric Dumazet	efe074c2cc	net_sched: add qdisc_watchdog_schedule_range_ns() Some packet schedulers might want to add a slack when programming hrtimers. This can reduce number of interrupts and increase batch sizes and thus give good xmit_more savings. This commit adds qdisc_watchdog_schedule_range_ns() helper, with an extra delta_ns parameter. Legacy qdisc_watchdog_schedule_n() becomes an inline passing a zero slack. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-17 21:16:34 -07:00
Jakub Kicinski	53eca1f347	net: rename flow_action_hw_stats_types* -> flow_action_hw_stats* flow_action_hw_stats_types_check() helper takes one of the FLOW_ACTION_HW_STATS_*_BIT values as input. If we align the arguments to the opening bracket of the helper there is no way to call this helper and stay under 80 characters. Remove the "types" part from the new flow_action helpers and enum values. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-17 21:12:39 -07:00
Era Mayflower	48ef50fa86	macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw) Netlink support of extended packet number cipher suites, allows adding and updating XPN macsec interfaces. Added support in: * Creating interfaces with GCM-AES-XPN-128 and GCM-AES-XPN-256 suites. * Setting and getting 64bit packet numbers with of SAs. * Setting (only on SA creation) and getting ssci of SAs. * Setting salt when installing a SAK. Added 2 cipher suite identifiers according to 802.1AE-2018 table 14-1: * MACSEC_CIPHER_ID_GCM_AES_XPN_128 * MACSEC_CIPHER_ID_GCM_AES_XPN_256 In addition, added 2 new netlink attribute types: * MACSEC_SA_ATTR_SSCI * MACSEC_SA_ATTR_SALT Depends on: macsec: Support XPN frame handling - IEEE 802.1AEbw. Signed-off-by: Era Mayflower <mayflowerera@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-16 01:42:31 -07:00
Era Mayflower	a21ecf0e03	macsec: Support XPN frame handling - IEEE 802.1AEbw Support extended packet number cipher suites (802.1AEbw) frames handling. This does not include the needed netlink patches. * Added xpn boolean field to `struct macsec_secy`. * Added ssci field to `struct_macsec_tx_sa` (802.1AE figure 10-5). * Added ssci field to `struct_macsec_rx_sa` (802.1AE figure 10-5). * Added salt field to `struct macsec_key` (802.1AE 10.7 NOTE 1). * Created pn_t type for easy access to lower and upper halves. * Created salt_t type for easy access to the "ssci" and "pn" parts. * Created `macsec_fill_iv_xpn` function to create IV in XPN mode. * Support in PN recovery and preliminary replay check in XPN mode. In addition, according to IEEE 802.1AEbw figure 10-5, the PN of incoming frame can be 0 when XPN cipher suite is used, so fixed the function `macsec_validate_skb` to fail on PN=0 only if XPN is off. Signed-off-by: Era Mayflower <mayflowerera@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-16 01:42:31 -07:00
Pablo Neira Ayuso	76adfafeca	netfilter: nf_tables: add nft_set_elem_update_expr() helper function This helper function runs the eval path of the stateful expression of an existing set element. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:27:49 +01:00
Pablo Neira Ayuso	795a6d6b42	netfilter: nf_tables: statify nft_expr_init() Not exposed anymore to modules, statify this function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:27:48 +01:00
Pablo Neira Ayuso	a7fc936804	netfilter: nf_tables: add nft_set_elem_expr_alloc() Add helper function to create stateful expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:27:47 +01:00
Stefano Brivio	7400b06396	nft_set_pipapo: Introduce AVX2-based lookup implementation If the AVX2 set is available, we can exploit the repetitive characteristic of this algorithm to provide a fast, vectorised version by using 256-bit wide AVX2 operations for bucket loads and bitwise intersections. In most cases, this implementation consistently outperforms rbtree set instances despite the fact they are configured to use a given, single, ranged data type out of the ones used for performance measurements by the nft_concat_range.sh kselftest. That script, injecting packets directly on the ingoing device path with pktgen, reports, averaged over five runs on a single AMD Epyc 7402 thread (3.35GHz, 768 KiB L1D$, 12 MiB L2$), the figures below. CONFIG_RETPOLINE was not set here. Note that this is not a fair comparison over hash and rbtree set types: non-ranged entries (used to have a reference for hash types) would be matched faster than this, and matching on a single field only (which is the case for rbtree) is also significantly faster. However, it's not possible at the moment to choose this set type for non-ranged entries, and the current implementation also needs a few minor adjustments in order to match on less than two fields. ---------------.-----------------------------------.------------. AMD Epyc 7402 \| baselines, Mpps \| this patch \| 1 thread \|___________________________________\|____________\| 3.35GHz \| \| \| \| \| \| 768KiB L1D$ \| netdev \| hash \| rbtree \| \| \| ---------------\| hook \| no \| single \| \| pipapo \| type entries \| drop \| ranges \| field \| pipapo \| AVX2 \| ---------------\|--------\|--------\|--------\|--------\|------------\| net,port \| \| \| \| \| \| 1000 \| 19.0 \| 10.4 \| 3.8 \| 4.0 \| 7.5 +87% \| ---------------\|--------\|--------\|--------\|--------\|------------\| port,net \| \| \| \| \| \| 100 \| 18.8 \| 10.3 \| 5.8 \| 6.3 \| 8.1 +29% \| ---------------\|--------\|--------\|--------\|--------\|------------\| net6,port \| \| \| \| \| \| 1000 \| 16.4 \| 7.6 \| 1.8 \| 2.1 \| 4.8 +128% \| ---------------\|--------\|--------\|--------\|--------\|------------\| port,proto \| \| \| \| \| \| 30000 \| 19.6 \| 11.6 \| 3.9 \| 0.5 \| 2.6 +420% \| ---------------\|--------\|--------\|--------\|--------\|------------\| net6,port,mac \| \| \| \| \| \| 10 \| 16.5 \| 5.4 \| 4.3 \| 3.4 \| 4.7 +38% \| ---------------\|--------\|--------\|--------\|--------\|------------\| net6,port,mac, \| \| \| \| \| \| proto 1000 \| 16.5 \| 5.7 \| 1.9 \| 1.4 \| 3.6 +26% \| ---------------\|--------\|--------\|--------\|--------\|------------\| net,mac \| \| \| \| \| \| 1000 \| 19.0 \| 8.4 \| 3.9 \| 2.5 \| 6.4 +156% \| ---------------'--------'--------'--------'--------'------------' A similar strategy could be easily reused to implement specialised versions for other SIMD sets, and I plan to post at least a NEON version at a later time. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:27:45 +01:00
wenxu	cfab6dbd0e	netfilter: flowtable: add tunnel match offload support This patch support both ipv4 and ipv6 tunnel_id, tunnel_src and tunnel_dst match for flowtable offload Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:26:17 +01:00
Gustavo A. R. Silva	6daf141401	netfilter: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] Lastly, fix checkpatch.pl warning WARNING: __aligned(size) is preferred over __attribute__((aligned(size))) in net/bridge/netfilter/ebtables.c This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:20:16 +01:00
Florian Westphal	24d19826fc	netfilter: nf_tables: make all set structs const They do not need to be writeable anymore. v2: remove left-over __read_mostly annotation in set_pipapo.c (Stefano) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:20:16 +01:00
Florian Westphal	e32a4dc651	netfilter: nf_tables: make sets built-in Placing nftables set support in an extra module is pointless: 1. nf_tables needs dynamic registeration interface for sake of one module 2. nft heavily relies on sets, e.g. even simple rule like "nft ... tcp dport { 80, 443 }" will not work with _SETS=n. IOW, either nftables isn't used or both nf_tables and nf_tables_set modules are needed anyway. With extra module: 307K net/netfilter/nf_tables.ko 79K net/netfilter/nf_tables_set.ko text data bss dec filename 146416 3072 545 150033 nf_tables.ko 35496 1817 0 37313 nf_tables_set.ko This patch: 373K net/netfilter/nf_tables.ko 178563 4049 545 183157 nf_tables.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-03-15 15:20:16 +01:00
Petr Machata	0a7fad2376	net: sched: RED: Introduce an ECN nodrop mode When the RED Qdisc is currently configured to enable ECN, the RED algorithm is used to decide whether a certain SKB should be marked. If that SKB is not ECN-capable, it is early-dropped. It is also possible to keep all traffic in the queue, and just mark the ECN-capable subset of it, as appropriate under the RED algorithm. Some switches support this mode, and some installations make use of it. To that end, add a new RED flag, TC_RED_NODROP. When the Qdisc is configured with this flag, non-ECT traffic is enqueued instead of being early-dropped. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00

1 2 3 4 5 ...

13599 Commits