linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-04 01:24:12 +08:00

Author	SHA1	Message	Date
Stefan Raspl	89e7d2ba61	net/ism: Add new API for client registration Add a new API that allows other drivers to concurrently access ISM devices. To do so, we introduce a new API that allows other modules to register for ISM device usage. Furthermore, we move the GID to struct ism, where it belongs conceptually, and rename and relocate struct smcd_event to struct ism_event. This is the first part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-25 09:46:48 +00:00
Stefan Raspl	1baedb13f1	s390/ism: Introduce struct ism_dmb Conceptually, a DMB is a structure that belongs to ISM devices. However, SMC currently 'owns' this structure. So future exploiters of ISM devices would be forced to include SMC headers to work - which is just weird. Therefore, we switch ISM to struct ism_dmb, introduce a new public header with the definition (will be populated with further API calls later on), and, add a thin wrapper to please SMC. Since structs smcd_dmb and ism_dmb are identical, we can simply convert between the two for now. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-25 09:46:48 +00:00
Stefan Raspl	462502ff9a	net/ism: Add missing calls to disable bus-mastering Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-25 09:46:48 +00:00
Stefan Raspl	c40bff4132	net/smc: Terminate connections prior to device removal Removing an ISM device prior to terminating its associated connections doesn't end well. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-25 09:46:48 +00:00
Parav Pandit	d067111586	virtio-net: Reduce debug name field size to 16 bytes The virtio queue index can be at most 65535. 16 bytes are enough to store the vq name with the existing string prefix. With this change, the send queue struct saves 24 bytes and the receive queue saves a whole cache line worth 64 bytes per structure due to savings in alignment bytes. Pahole results before: pahole -s drivers/net/virtio_net.o | grep -e "send_queue" -e "receive_queue" send_queue 1112 0 receive_queue 1280 1 Pahole results after: pahole -s drivers/net/virtio_net.o | grep -e "send_queue" -e "receive_queue" send_queue 1088 0 receive_queue 1216 1 Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-25 09:27:13 +00:00
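The size argument in the virtio-net commit above can be checked with a few lines of arithmetic. This is only a sketch: the `input.` and `output.` prefixes are an assumption about what "the existing string prefix" refers to, and `c_string_size` is a hypothetical helper, not a kernel function. The struct sizes are the pahole figures quoted in the commit message.

```python
# Sketch: does a 16-byte buffer fit the longest virtqueue debug name?
NAME_BUF_SIZE = 16                 # bytes, per the commit message
MAX_VQ_INDEX = 65535               # maximum virtio queue index, per the commit message
PREFIXES = ("input.", "output.")   # ASSUMPTION: prefixes used for the vq names


def c_string_size(text: str) -> int:
    """Bytes a C string needs, including the terminating NUL (hypothetical helper)."""
    return len(text.encode("ascii")) + 1


# Longest possible name is the longest prefix followed by the largest index.
longest = max(c_string_size(f"{prefix}{MAX_VQ_INDEX}") for prefix in PREFIXES)
fits = longest <= NAME_BUF_SIZE

# Per-structure savings, from the pahole sizes quoted in the commit message.
send_queue_saved = 1112 - 1088
receive_queue_saved = 1280 - 1216

print(longest, fits, send_queue_saved, receive_queue_saved)
```

Under the assumed prefixes the longest name needs 13 bytes, so it fits in 16. The two differences match the 24 bytes and 64 bytes stated in the message.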
Jakub Kicinski	4373a023e0	devlink: remove a dubious assumption in fmsg dumping Build bot detects that err may be returned uninitialized in devlink_fmsg_prepare_skb(). This is not really true because all fmsgs users should create at least one outer nest, and therefore fmsg can't be completely empty. That said the assumption is not trivial to confirm, so let's follow the bots advice, anyway. This code does not seem to have changed since its inception in commit `1db64e8733` ("devlink: Add devlink formatted message (fmsg) API") Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20230124035231.787381-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-24 20:31:35 -08:00
Vladimir Oltean	28113cfada	net: mscc: ocelot: fix incorrect verify_enabled reporting in ethtool get_mm() We don't read the verify_enabled variable from hardware in the MAC Merge layer state GET operation, instead we always leave it set to "false". The user may think something is wrong if they set verify_enabled to true, then read it back and see it's still false, even though the configuration took place. Fixes: `6505b68056` ("net: mscc: ocelot: add MAC Merge layer support for VSC9959") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230123184538.3420098-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-24 18:34:20 -08:00
James Hershaw	74b4f1739d	nfp: flower: change get/set_eeprom logic and enable for flower reps The changes in this patch are as follows: - Alter the logic of get/set_eeprom functions to use the helper function nfp_app_from_netdev() which handles differentiating between an nfp_net and a nfp_repr. This allows us to get an agnostic backpointer to the pdev. - Enable the various eeprom commands by adding the 'get_eeprom_len', 'get_eeprom', 'set_eeprom' callbacks to the nfp_port_ethtool_ops struct. This allows the eeprom commands to work on representor interfaces, similar to a previous patch which added it to the vnics. Currently these are being used to configure persistent MAC addresses for the physical ports on the nfp. Signed-off-by: James Hershaw <james.hershaw@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230123134135.293278-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-24 18:19:12 -08:00
Guillaume Nault	90317bcdbd	ipv6: Make ip6_route_output_flags_noref() static. This function is only used in net/ipv6/route.c and has no reason to be visible outside of it. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/50706db7f675e40b3594d62011d9363dce32b92e.1674495822.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-24 18:12:52 -08:00
Jakub Kicinski	ec8f7d495b	netlink: fix spelling mistake in dump size assert Commit `2c7bc10d0f` ("netlink: add macro for checking dump ctx size") misspelled the name of the assert as asset, missing an R. Reported-by: Ido Schimmel <idosch@idosch.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230123222224.732338-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-24 16:29:11 -08:00
Paolo Abeni	c554520f2c	Merge branch 'netlink-protocol-specs' Jakub Kicinski says: ==================== Netlink protocol specs I think the Netlink proto specs are far along enough to merge. Filling in all attribute types and quirks will be an ongoing effort but we have enough to cover FOU so it's somewhat complete. I fully intend to continue polishing the code but at the same time I'd like to start helping others base their work on the specs (e.g. DPLL) and need to start working on some new families myself. That's the progress / motivation for merging. The RFC [1] has more of a high level blurb, plus I created a lot of documentation, I'm not going to repeat it here. There was also the talk at LPC [2]. [1] https://lore.kernel.org/all/20220811022304.583300-1-kuba@kernel.org/ [2] https://youtu.be/9QkXIQXkaQk?t=2562 v2: https://lore.kernel.org/all/20220930023418.1346263-1-kuba@kernel.org/ v3: https://lore.kernel.org/all/20230119003613.111778-1-kuba@kernel.org/1 v4: - spec improvements (patch 2) - Python cleanup (patch 3) - rename auto-gen files and use the right comment style ==================== Link: https://lore.kernel.org/r/20230120175041.342573-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 11:02:03 +01:00
Jakub Kicinski	e4b48ed460	tools: ynl: add a completely generic client Add a CLI sample which can take in arbitrary request in JSON format, convert it to Netlink and do the inverse for output. It's meant as a development tool primarily and perhaps for selftests which need to tickle netlink in a special way. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	1d562c32e4	net: fou: use policy and operation tables generated from the spec Generate and plug in the spec-based tables. A little bit of renaming is needed in the FOU code. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	08d323234d	net: fou: rename the source for linking We'll need to link two objects together to form the fou module. This means the source can't be called fou, the build system expects fou.o to be the combined object. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	3a330496ba	net: fou: regenerate the uAPI from the spec Regenerate the FOU uAPI header from the YAML spec. The flags now come before attributes which use them, and the comments for type disappear (coders should look at the spec instead). Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	4eb77b4ecd	netlink: add a proto specification for FOU FOU has a reasonably modern Genetlink family. Add a spec. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	be5bea1cc0	net: add basic C code generators for Netlink Code generators to turn Netlink specs into C code. I'm definitely not proud of it. The main generator is in Python, there's a bash script to regen all code-gen'ed files in tree after making spec changes. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	e616c07ca5	netlink: add schemas for YAML specs Add schemas for Netlink spec files. As described in the docs we have 4 "protocols" or compatibility levels, and each one comes with its own schema, but the more general / legacy schemas are superset of more modern ones: genetlink is the smallest followed by genetlink-c and genetlink-legacy. There is no schema for raw netlink, yet, I haven't found the time.. I don't know enough jsonschema to do inheritance or something but the repetition is not too bad. I hope. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Jakub Kicinski	9d6a65079c	docs: add more netlink docs (incl. spec docs) Add documentation about the upcoming Netlink protocol specs. Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:58:11 +01:00
Paolo Abeni	d961bee454	Merge branch 'net-sched-use-the-backlog-for-nested-mirred-ingress' Davide Caratti says: ==================== net/sched: use the backlog for nested mirred ingress TC mirred has a protection against excessive stack growth, but that protection doesn't really guarantee the absence of recursion, nor does it guard against loops. Patch 1/2 rewords "recursion" to "nesting" to make this more clear. We can leverage this existing mechanism to prevent TCP / SCTP from doing soft lock-up in some specific scenarios that use mirred egress->ingress: patch 2 changes mirred so that the networking backlog is used for nested mirred ingress actions. ==================== Link: https://lore.kernel.org/r/cover.1674233458.git.dcaratti@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:30:56 +01:00
Davide Caratti	ca22da2fbd	act_mirred: use the backlog for nested calls to mirred ingress William reports kernel soft-lockups on some OVS topologies when TC mirred egress->ingress action is hit by local TCP traffic [1]. The same can also be reproduced with SCTP (thanks Xin for verifying), when client and server reach themselves through mirred egress to ingress, and one of the two peers sends a "heartbeat" packet (from within a timer). Enqueueing to backlog proved to fix this soft lockup; however, as Cong noticed [2], we should preserve - when possible - the current mirred behavior that counts as "overlimits" any eventual packet drop subsequent to the mirred forwarding action [3]. A compromise solution might use the backlog only when tcf_mirred_act() has a nest level greater than one: change tcf_mirred_forward() accordingly. Also, add a kselftest that can reproduce the lockup and verifies TC mirred ability to account for further packet drops after TC mirred egress->ingress (when the nest level is 1). [1] https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com/ [2] https://lore.kernel.org/netdev/Y0w%2FWWY60gqrtGLp@pop-os.localdomain/ [3] such behavior is not guaranteed: for example, if RPS or skb RX timestamping is enabled on the mirred target device, the kernel can defer receiving the skb and return NET_RX_SUCCESS inside tcf_mirred_forward(). Reported-by: William Zhao <wizhao@redhat.com> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:30:54 +01:00
Davide Caratti	78dcdffe04	net/sched: act_mirred: better wording on protection against excessive stack growth with commit `e2ca070f89` ("net: sched: protect against stack overflow in TC act_mirred"), act_mirred protected itself against excessive stack growth using per_cpu counter of nested calls to tcf_mirred_act(), and capping it to MIRRED_RECURSION_LIMIT. However, such protection does not detect recursion/loops in case the packet is enqueued to the backlog (for example, when the mirred target device has RPS or skb timestamping enabled). Change the wording from "recursion" to "nesting" to make it more clear to readers. CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:30:54 +01:00
Paolo Abeni	5cf6c22b5b	Merge branch 'fix-cpts-release-action-in-am65-cpts-driver' Siddharth Vadapalli says: ==================== Fix CPTS release action in am65-cpts driver Delete unreachable code in am65_cpsw_init_cpts() function, which was Reported-by: Leon Romanovsky <leon@kernel.org> at: https://lore.kernel.org/r/Y8aHwSnVK9+sAb24@unreal Remove the devm action associated with am65_cpts_release() and invoke the function directly on the cleanup and exit paths. v4: https://lore.kernel.org/r/20230120044201.357950-1-s-vadapalli@ti.com/ v3: https://lore.kernel.org/r/20230118095439.114222-1-s-vadapalli@ti.com/ v2: https://lore.kernel.org/r/20230116044517.310461-1-s-vadapalli@ti.com/ v1: https://lore.kernel.org/r/20230113104816.132815-1-s-vadapalli@ti.com/ ==================== Link: https://lore.kernel.org/r/20230120070731.383729-1-s-vadapalli@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:08:54 +01:00
Siddharth Vadapalli	4ad8766cd3	net: ethernet: ti: am65-cpsw/cpts: Fix CPTS release action The am65_cpts_release() function is registered as a devm_action in the am65_cpts_create() function in am65-cpts driver. When the am65-cpsw driver invokes am65_cpts_create(), am65_cpts_release() is added in the set of devm actions associated with the am65-cpsw driver's device. In the event of probe failure or probe deferral, the platform_drv_probe() function invokes dev_pm_domain_detach() which powers off the CPSW and the CPSW's CPTS hardware, both of which share the same power domain. Since the am65_cpts_disable() function invoked by the am65_cpts_release() function attempts to reset the CPTS hardware by writing to its registers, the CPTS hardware is assumed to be powered on at this point. However, the hardware is powered off before the devm actions are executed. Fix this by getting rid of the devm action for am65_cpts_release() and invoking it directly on the cleanup and exit paths. Fixes: `f6bd59526c` ("net: ethernet: ti: introduce am654 common platform time sync driver") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:08:50 +01:00
Siddharth Vadapalli	0a974b1fff	net: ethernet: ti: am65-cpsw: Delete unreachable error handling code The am65_cpts_create() function returns -EOPNOTSUPP only when the config "CONFIG_TI_K3_AM65_CPTS" is disabled. Also, in the am65_cpsw_init_cpts() function, am65_cpts_create() can only be invoked if the config "CONFIG_TI_K3_AM65_CPTS" is enabled. Thus, the error handling code for the case in which the return value of am65_cpts_create() is -EOPNOTSUPP, is unreachable. Hence delete it. Reported-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-01-24 10:08:50 +01:00
Rakesh Sankaranarayanan	d7bf56e0c5	net: phy: microchip: run phy initialization during each link update PHY initialization is supposed to run on every mode change. "lan87xx_config_aneg()" verifies every mode change using the "phy_modify_changed()" function. Earlier code had phy_modify_changed() followed by genphy_soft_reset. But soft_reset resets all the pre-configured register values to their default state, losing all the initialization done. For this reason gen_phy_reset was removed. But the PHY needs to go through the init sequence each time the mode changes. Update lan87xx_config_aneg() to invoke phy_init once a successful mode update is detected. The PHY init sequence added in lan87xx_phy_init() has slave init commands executed every time. Update the init sequence to run slave init only if phydev is in slave mode. The test setup contains a LAN9370 EVB connected to a SAMA5D3 (running DSA), and the issue can be reproduced by connecting a link to any of the available ports after SAMA5D3 boot-up. With this issue, the port will fail to update its link state. But once the SAMA5D3 is reset with the LAN9370 link in the connected state, the link state will be reported as UP on boot-up. But again, after some time, if the link moves to the DOWN state, it will not get reported. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230120104733.724701-1-rakesh.sankaranarayanan@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:34:19 -08:00
Jakub Kicinski	90e05ef3d1	Merge branch 'net-dsa-microchip-add-support-for-credit-based-shaper' Arun Ramadoss says: ==================== net: dsa: microchip: add support for credit based shaper The LAN937x switch family, KSZ9477, KSZ9567, KSZ9563 and KSZ8563 support the credit based shaper. There are a few differences between the LAN937x and KSZ switches: - the number of queues is 8 for LAN937x and 4 for the others. - the credit increment register is 24 bits wide for LAN937x and 16 bits wide for the others. This patch series adds the credit based shaper with a common implementation for LAN937x and KSZ switches. ==================== Link: https://lore.kernel.org/r/20230120052135.32120-1-arun.ramadoss@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:12:37 -08:00
Arun Ramadoss	71d7920fb2	net: dsa: microchip: add support for credit based shaper KSZ9477, KSZ9567, KSZ9563, KSZ8563 and LAN937x support the credit based shaper. To identify the chips supporting cbs, a tc_cbs_supported flag is introduced in ksz_chip_data. The KSZ series has 16-bit credit increment registers whereas LAN937x has a 24-bit register. The value to be programmed in the credit increment is determined using the successive multiplication method to convert a decimal fraction to a hexadecimal fraction. For example: if idleslope is 10000 and sendslope is -90000, then bandwidth is 10000 - (-90000) = 100000. The 10% bandwidth of 100Mbps means 10/100 = 0.1(decimal). This value has to be converted to hexadecimal. 1) 0.1 * 16 = 1.6 --> fraction 0.6 Carry = 1 (MSB) 2) 0.6 * 16 = 9.6 --> fraction 0.6 Carry = 9 3) 0.6 * 16 = 9.6 --> fraction 0.6 Carry = 9 4) 0.6 * 16 = 9.6 --> fraction 0.6 Carry = 9 5) 0.6 * 16 = 9.6 --> fraction 0.6 Carry = 9 6) 0.6 * 16 = 9.6 --> fraction 0.6 Carry = 9 (LSB) Now 0.1(decimal) becomes 0.199999(Hex). If it is LAN937x, the 24-bit value 0x199999 will be programmed to the Credit Inc register. For the others, the 16-bit value 0x1999 will be programmed. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:12:35 -08:00
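The successive multiplication steps listed in the commit above can be reproduced with a short sketch. It is not the driver's code: `credit_increment` and its parameters are hypothetical names chosen for illustration, and exact rational arithmetic (`fractions.Fraction`) is used here so that floating-point rounding cannot disturb the digits.

```python
from fractions import Fraction


def credit_increment(idleslope: int, sendslope: int, hex_digits: int) -> int:
    """Convert idleslope / (idleslope - sendslope) to a hexadecimal fraction.

    Hypothetical helper: returns the first `hex_digits` digits after the
    hexadecimal point, packed into one integer (MSB digit first).
    """
    bandwidth = idleslope - sendslope
    fraction = Fraction(idleslope, bandwidth)
    if not 0 <= fraction < 1:
        raise ValueError("expected a fraction in the range [0, 1)")

    value = 0
    for _ in range(hex_digits):
        fraction *= 16              # e.g. 0.1 * 16 = 1.6
        carry = int(fraction)       # integer part becomes the next hex digit
        value = (value << 4) | carry
        fraction -= carry           # keep only the fractional part, e.g. 0.6
    return value


# Example from the commit message: idleslope 10000, sendslope -90000.
print(hex(credit_increment(10000, -90000, 6)))  # 24-bit register (LAN937x)
print(hex(credit_increment(10000, -90000, 4)))  # 16-bit register (KSZ)
```

With six digits the result is 0x199999 and with four digits it is 0x1999, matching the values given in the message.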
Arun Ramadoss	e30f33a5f5	net: dsa: microchip: enable port queues for tc mqprio The LAN937x family of switches has 8 queues per port whereas the KSZ switches have 4 queues per port. By default, only one queue per port is enabled. The queues are configurable in 2, 4 or 8. This patch adds 8 queues for LAN937x and 4 for other switches. In the tag_ksz.c file, the priority of the packet is queried using the skb buffer and the corresponding value is updated in the tag. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:12:35 -08:00
Jesper Dangaard Brouer	3176eb8268	net: avoid irqsave in skb_defer_free_flush The spin_lock irqsave/restore API variant in skb_defer_free_flush can be replaced with the faster spin_lock irq variant, which doesn't need to read and restore the CPU flags. Using the unconditional irq "disable/enable" API variant is safe, because the skb_defer_free_flush() function is only called during NAPI-RX processing in net_rx_action(), where it is known the IRQs are enabled. Expected gain is 14 cycles from avoiding reading and restoring CPU flags in a spin_lock_irqsave/restore operation, measured via a microbenchmark kernel module[1] on CPU E5-1650 v4 @ 3.60GHz. Microbenchmark overhead of spin_lock+unlock: - spin_lock_unlock_irq cost: 34 cycles(tsc) 9.486 ns - spin_lock_unlock_irqsave cost: 48 cycles(tsc) 13.567 ns We don't expect to see a measurable packet performance gain, as skb_defer_free_flush() is called infrequently once per NIC device NAPI bulk cycle and conditionally only if SKBs have been deferred by other CPUs via skb_attempt_defer_free(). [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/167421646327.1321776.7390743166998776914.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:08:06 -08:00
Jon Maxwell	695a376b59	ipv6: Document that max_size sysctl is deprecated v4: fix deprecated typo. Document that max_size is deprecated due to: commit `af6d10345c` ("ipv6: remove max_size check inline with ipv4") Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Link: https://lore.kernel.org/r/20230120232331.1273881-1-jmaxwell37@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 22:07:21 -08:00
Jesper Dangaard Brouer	f72ff8b81e	net: fix kfree_skb_list use of skb_mark_not_on_list A bug was introduced by commit `eedade12f4` ("net: kfree_skb_list use kmem_cache_free_bulk"). It unconditionally unlinked the SKB list via invoking skb_mark_not_on_list(). In this patch we choose to remove the skb_mark_not_on_list() call as it isn't necessary. It would be possible and correct to call skb_mark_not_on_list() only when __kfree_skb_reason() returns true, meaning the SKB is ready to be free'ed, as it calls/check skb_unref(). This fix is needed as kfree_skb_list() is also invoked on skb_shared_info frag_list (skb_drop_fraglist() calling kfree_skb_list()). A frag_list can have SKBs with elevated refcnt due to cloning via skb_clone_fraglist(), which takes a reference on all SKBs in the list. This implies the invariant that all SKBs in the list must have the same refcnt, when using kfree_skb_list(). Reported-by: syzbot+c8a2e66e37eee553c4fd@syzkaller.appspotmail.com Reported-and-tested-by: syzbot+c8a2e66e37eee553c4fd@syzkaller.appspotmail.com Fixes: `eedade12f4` ("net: kfree_skb_list use kmem_cache_free_bulk") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/167421088417.1125894.9761158218878962159.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:39:04 -08:00
Dan Carpenter	3bee9b573a	net: microchip: sparx5: Fix uninitialized variable in vcap_path_exist() The "eport" variable needs to be initialized to NULL for this code to work. Fixes: `814e769320` ("net: microchip: vcap api: Add a storage state to a VCAP rule") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Link: https://lore.kernel.org/r/Y8qbYAb+YSXo1DgR@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:34:59 -08:00
Heiner Kallweit	8a8b70b3f2	net: mdio: warn once if addr parameter is invalid in mdiobus_get_phy() If mdiobus_get_phy() is called with an invalid addr parameter, then the caller has a bug. Print a call trace to help identifying the caller. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/daec3f08-6192-ba79-f74b-5beb436cab6c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:34:10 -08:00
Jakub Kicinski	62be69397e	Merge tag 'wireless-next-2023-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.3 First set of patches for v6.3. The most important change here is that the old Wireless Extension user space interface is not supported on Wi-Fi 7 devices at all. We also added a warning if anyone with modern drivers (i.e. cfg80211 and mac80211 drivers) tries to use Wireless Extensions; everyone should switch to using the nl80211 interface instead. Static WEP support is removed, there wasn't any driver using that anyway so there's no user impact. Otherwise it's smaller features and fixes as usual. Note: As mt76 had tricky conflicts due to the fixes in wireless tree, we decided to merge wireless into wireless-next to solve them easily. There should not be any merge problems anymore. Major changes: cfg80211 - remove never used static WEP support - warn if Wireless Extension interface is used with cfg80211/mac80211 drivers - stop supporting Wireless Extensions with Wi-Fi 7 devices - support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting rfkill - add GPIO DT support bitfield - add FIELD_PREP_CONST() mt76 - per-PHY LED support rtw89 - support new Bluetooth coexistence version rtl8xxxu - support RTL8188EU * tag 'wireless-next-2023-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (123 commits) wifi: wireless: deny wireless extensions on MLO-capable devices wifi: wireless: warn on most wireless extension usage wifi: mac80211: drop extra 'e' from ieeee80211... name wifi: cfg80211: Deduplicate certificate loading bitfield: add FIELD_PREP_CONST() wifi: mac80211: add kernel-doc for EHT structure mac80211: support minimal EHT rate reporting on RX wifi: mac80211: Add HE MU-MIMO related flags in ieee80211_bss_conf wifi: mac80211: Add VHT MU-MIMO related flags in ieee80211_bss_conf wifi: cfg80211: Use MLD address to indicate MLD STA disconnection wifi: cfg80211: Support 32 bytes KCK key in GTK rekey offload wifi: cfg80211: Fix extended KCK key length check in nl80211_set_rekey_data() wifi: cfg80211: remove support for static WEP wifi: rtl8xxxu: Dump the efuse only for untested devices wifi: rtl8xxxu: Print the ROM version too wifi: rtw88: Use non-atomic sta iterator in rtw_ra_mask_info_update() wifi: rtw88: Use rtw_iterate_vifs() for rtw_vif_watch_dog_iter() wifi: rtw88: Move register access from rtw_bf_assoc() outside the RCU wifi: rtl8xxxu: Use a longer retry limit of 48 wifi: rtl8xxxu: Report the RSSI to the firmware ... ==================== Link: https://lore.kernel.org/r/20230123103338.330CBC433EF@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:27:31 -08:00
Krzysztof Kozlowski	306f208259	dt-bindings: net: asix,ax88796c: allow SPI peripheral properties The AX88796C device node on SPI bus can use SPI peripheral properties in certain configurations: exynos3250-artik5-eval.dtb: ethernet@0: 'controller-data' does not match any of the regexes: 'pinctrl-[0-9]+' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230120144329.305655-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:26:37 -08:00
Eric Dumazet	057fb03160	selftests: net: tcp_mmap: populate pages in send path In commit `72653ae530` ("selftests: net: tcp_mmap: Use huge pages in send path") I made a change to use hugepages for the buffer used by the client (tx path). Today, I understood that the cause for poor zerocopy performance was that after a mmap() for a 512KB memory zone, the kernel uses a single zeropage, mapped 128 times. This was really the reason for poor tx path performance in zero copy mode, because this zero page refcount is under high pressure, especially when TCP ACK packets are processed on another cpu. We need either to force a COW on all the memory range, or use MAP_POPULATE so that a zero page is not abused. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230120181136.3764521-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-23 21:24:29 -08:00
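The two options named in the tcp_mmap commit above (MAP_POPULATE, or forcing a copy-on-write over the whole range) can be sketched from user space. This is an illustration in Python rather than the selftest's C code, and it assumes a Unix-like system. `populated_buffer` is a hypothetical helper. `mmap.MAP_POPULATE` is only exposed by newer Python versions on Linux, so the sketch falls back to touching every page when it is missing. The 4096-byte page size in the page count is an assumption matching the "128 times" figure.

```python
import mmap

LENGTH = 512 * 1024          # 512KB memory zone, as in the commit message
PAGES_4K = LENGTH // 4096    # ASSUMPTION: 4 KiB pages, giving the 128 mappings


def populated_buffer(length: int) -> mmap.mmap:
    """Return a private anonymous mapping whose pages are backed up front.

    Hypothetical helper: uses MAP_POPULATE when this Python exposes it,
    otherwise writes one byte per page to force a copy-on-write fault.
    """
    populate = getattr(mmap, "MAP_POPULATE", 0)
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | populate
    buf = mmap.mmap(-1, length, flags=flags)
    if not populate:
        # Fallback: a write to each page replaces the shared zero page
        # with a private page.
        for offset in range(0, length, mmap.PAGESIZE):
            buf[offset] = 0
    return buf


buf = populated_buffer(LENGTH)
print(PAGES_4K, len(buf))
```

Either path leaves the contents zero-filled. The intended difference is only in which physical pages back the range.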
Heiner Kallweit	32e54254ba	net: mdio: mux-meson-g12a: use devm_clk_get_enabled to simplify the code Use devm_clk_get_enabled() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 14:46:51 +00:00
Andy Shevchenko	d408ec0b5d	net: mdiobus: Convert to use fwnode_device_is_compatible() Replace open coded fwnode_device_is_compatible() in the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 14:37:13 +00:00
David S. Miller	dc0b98a175	ethtool: Add and use ethnl_update_bool. Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:57:39 +00:00
David S. Miller	7a98143118	Merge branch 'enetc-mac-merge-prep' Vladimir Oltean says: ==================== ENETC MAC Merge cleanup This is a preparatory patch set for MAC Merge layer support in enetc via ethtool. It does the following: - consolidates a software lockstep register write procedure for the pMAC - detects per-port frame preemption capability and only writes pMAC registers if a pMAC exists - stops enabling the pMAC by default Additionally, I noticed some build warnings in the driver which are new in this kernel version, so patch 1/6 fixes those. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:09 +00:00
Vladimir Oltean	086cc08035	net: enetc: stop auto-configuring the port pMAC The pMAC (ENETC_PFPMR_PMACE) is probably unconditionally enabled in the enetc driver to allow RX of preemptible packets and not see them as error frames. I don't know why TX preemption (ENETC_MMCSR_ME) is enabled though. With no way to say which traffic classes are preemptible (all are express by default), no preemptible frames would be transmitted anyway. Lastly, it may have been believed that the register write lock-step mode (now deleted) needed the pMAC to be enabled at all times. I don't know if that's true. However, I've checked that driver writes to PM1 registers do propagate through to the ENETC IP even when the pMAC is disabled. With such incomplete support for frame preemption, it's best to just remove whatever exists right now and come with something more coherent later. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:09 +00:00
Vladimir Oltean	12717decb5	net: enetc: implement software lockstep for port MAC registers Currently the enetc driver duplicates its writes to the PM0 registers also to PM1, but it doesn't do this consistently - for example we write to ENETC_PM0_MAXFRM but not to ENETC_PM1_MAXFRM. Create enetc_port_mac_wr() which writes both the PM0 and PM1 register with the same value (if frame preemption is supported on this port). Also create enetc_port_mac_rd() which reads from PM0 - the assumption being that PM1 contains just the same value. This will be necessary when we enable the MAC Merge layer properly, and the pMAC becomes operational. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:09 +00:00
Vladimir Oltean	219355f1b0	net: enetc: stop configuring pMAC in lockstep with eMAC The MWLM bit (MAC write lock-step mode) allows register writes to the pMAC to be auto-performed whenever the corresponding eMAC register is written by the driver. This allows their configuration to remain in sync. The driver has set this bit since the initial commit, but it doesn't do anything, since the hardware feature doesn't work (and the bit has been removed from more recent versions of the documentation). The driver does attempt, more or less, to keep those MAC registers in sync by writing the same value once to e.g. ENETC_PM0_CMD_CFG (eMAC) and once to ENETC_PM1_CMD_CFG (pMAC). Because the lockstep feature doesn't work, that's what it will stick to. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:09 +00:00
Vladimir Oltean	9c949e0b2f	net: enetc: add definition for offset between eMAC and pMAC regs This is a preliminary patch which replaces the hardcoded 0x1000 present in other PM1 (port MAC 1, aka pMAC) register definitions, which is an offset to the PM0 (port MAC 0, aka eMAC) equivalent register. This definition will be used in more places by future code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:09 +00:00
Vladimir Oltean	94557a9a73	net: enetc: detect frame preemption hardware capability Similar to other TSN features, query the Station Interface capability register to see whether preemption is supported on this port or not. On LS1028A, preemption is available on ports 0 and 2, but not on 1 and 3. This will allow us in the future to write the pMAC registers only on the ENETC ports where a pMAC actually exists. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:08 +00:00
Vladimir Oltean	e3972399bb	net: enetc: build common object files into a separate module The build system is complaining about the following: enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 13:13:08 +00:00
David S. Miller	f3c6e12893	Merge branch 'ethtool-mac-merge' Vladimir Oltean says: ==================== ethtool support for IEEE 802.3 MAC Merge layer Change log ---------- v3->v4: - add missing opening bracket in ocelot_port_mm_irq() - moved cfg.verify_time range checking so that it actually takes place for the updated rather than old value v3 at: https://patchwork.kernel.org/project/netdevbpf/cover/20230117085947.2176464-1-vladimir.oltean@nxp.com/ v2->v3: - made get_mm return int instead of void - deleted ETHTOOL_A_MM_SUPPORTED - renamed ETHTOOL_A_MM_ADD_FRAG_SIZE to ETHTOOL_A_MM_TX_MIN_FRAG_SIZE - introduced ETHTOOL_A_MM_RX_MIN_FRAG_SIZE - cleaned up documentation - rebased on top of PLCA changes - renamed ETHTOOL_STATS_SRC_* to ETHTOOL_MAC_STATS_SRC_* v2 at: https://patchwork.kernel.org/project/netdevbpf/cover/20230111161706.1465242-1-vladimir.oltean@nxp.com/ v1->v2: I've decided to focus just on the MAC Merge layer for now, which is why I am able to submit this patch set as non-RFC. v1 (RFC) at: https://patchwork.kernel.org/project/netdevbpf/cover/20220816222920.1952936-1-vladimir.oltean@nxp.com/ What is being introduced ------------------------ TL;DR: a MAC Merge layer as defined by IEEE 802.3-2018, clause 99 (interspersing of express traffic). This is controlled through ethtool netlink (ETHTOOL_MSG_MM_GET, ETHTOOL_MSG_MM_SET). The raw ethtool commands are posted here: https://patchwork.kernel.org/project/netdevbpf/cover/20230111153638.1454687-1-vladimir.oltean@nxp.com/ The MAC Merge layer has its own statistics counters (ethtool --include-statistics --show-mm swp0) as well as two member MACs, the statistics of which can be queried individually, through a new ethtool netlink attribute, corresponding to: $ ethtool -I --show-pause eno2 --src aggregate $ ethtool -S eno2 --groups eth-mac eth-phy eth-ctrl rmon -- --src pmac The core properties of the MAC Merge layer are described in great detail in patches 02/12 and 03/12. 
They can be viewed in "make htmldocs" format. Devices for which the API is supported -------------------------------------- I decided to start with the Ethernet switch on NXP LS1028A (Felix) because of the smaller patch set. I also have support for the ENETC controller pending. I would like to get confirmation that the UAPI being proposed here will not restrict any use cases known by other hardware vendors. Why is support for preemptible traffic classes not here? -------------------------------------------------------- There is legitimate concern whether the 802.1Q portion of the standard (which traffic classes go to the eMAC and which to the pMAC) should be modeled in Linux using tc or using another UAPI. I think that is stalling the entire series, but should be discussed separately instead. Removing FP adminStatus support makes me confident enough to submit this patch set without an RFC tag (meaning: I wouldn't mind if it was merged as is). What is submitted here is sufficient for an LLDP daemon to do its job. I've patched openlldp to advertise and configure frame preemption: https://github.com/vladimiroltean/openlldp/tree/frame-preemption-v3 In case someone wants to try it out, here are some commands I've used. 
# Configure the interfaces to receive and transmit LLDP Data Units lldptool -L -i eno0 adminStatus=rxtx lldptool -L -i swp0 adminStatus=rxtx # Enable the transmission of certain TLVs on switch's interface lldptool -T -i eno0 -V addEthCap enableTx=yes lldptool -T -i swp0 -V addEthCap enableTx=yes # Query LLDP statistics on switch's interface lldptool -S -i swp0 # Query the received neighbor TLVs lldptool -i swp0 -t -n -V addEthCap Additional Ethernet Capabilities TLV Preemption capability supported Preemption capability enabled Preemption capability active Additional fragment size: 60 octets So using this patch set, lldpad will be able to advertise and configure frame preemption, but still, no data packet will be sent as preemptible over the link, because there is no UAPI to control which traffic classes are sent as preemptible and which as express. Preemptable or preemptible? --------------------------- IEEE 802.3 uses "preemptable" throughout. IEEE 802.1Q uses "preemptible" throughout. Because the definition of "preemptible" falls under 802.1Q's jurisdiction and 802.3 just references it, I went with the 802.1Q naming even where supporting an 802.3 feature. Also, checkpatch agrees with this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 12:44:18 +00:00
Vladimir Oltean	6505b68056	net: mscc: ocelot: add MAC Merge layer support for VSC9959 Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers (ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and pMAC statistics are already in the Ocelot switch lib even if just Felix supports them, I'm adding support for the whole MAC Merge layer in the common Ocelot library too. There is an interrupt (shared with the PTP interrupt) which signals changes to the MM verification state. This is done because the preemptible traffic classes should be committed to hardware only once the verification procedure has declared the link partner of being capable of receiving preemptible frames. We implement ethtool getters and setters for the MAC Merge layer state. The "TX enabled" and "verify status" are taken from the IRQ handler, using a mutex to ensure serialized access. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 12:44:18 +00:00
Vladimir Oltean	ab3f97a961	net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959 The Felix VSC9959 switch supports frame preemption and has a MAC Merge layer. In addition to the structured stats that exist for the eMAC, export the counters associated with its pMAC (pause, RMON, MAC, PHY, control) plus the high-level MAC Merge layer stats. The unstructured ethtool counters, as well as the rtnl_link_stats64 were left to report only the eMAC counters. Because statistics processing is quite self-contained in ocelot_stats.c now, I've opted for introducing an ocelot->mm_supported bool, based on which the common switch lib does everything, rather than pushing the TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-23 12:44:18 +00:00
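
The software lockstep described in commit 12717decb5 (write each port MAC register to PM0 and, where a pMAC exists, also to PM1 at the 0x1000 offset named in 9c949e0b2f) can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the driver's code: `struct fake_port`, `port_wr()`, `port_rd()`, `port_mac_wr()`, `port_mac_rd()` and `PMAC_OFFSET` are hypothetical names, and a plain array stands in for the memory-mapped registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset from a PM0 (eMAC) register to its PM1 (pMAC) twin, as stated in
 * commit 9c949e0b2f. The macro name is hypothetical.
 */
#define PMAC_OFFSET 0x1000u

/* Hypothetical stand-in for one port: a flat register file covering both
 * MACs, plus the frame preemption capability flag from commit 94557a9a73.
 */
struct fake_port {
	uint32_t regs[(2 * PMAC_OFFSET) / sizeof(uint32_t)];
	bool has_pmac;
};

/* Raw register write; a real driver would use an MMIO accessor instead. */
static void port_wr(struct fake_port *p, uint32_t off, uint32_t val)
{
	p->regs[off / sizeof(uint32_t)] = val;
}

/* Raw register read. */
static uint32_t port_rd(const struct fake_port *p, uint32_t off)
{
	return p->regs[off / sizeof(uint32_t)];
}

/* Write the eMAC register and mirror the value to the pMAC only when the
 * port actually has one.
 */
static void port_mac_wr(struct fake_port *p, uint32_t pm0_off, uint32_t val)
{
	assert(pm0_off < PMAC_OFFSET); /* must be a PM0 register offset */
	port_wr(p, pm0_off, val);
	if (p->has_pmac)
		port_wr(p, pm0_off + PMAC_OFFSET, val);
}

/* Read from PM0 only, on the assumption that PM1 holds the same value. */
static uint32_t port_mac_rd(const struct fake_port *p, uint32_t pm0_off)
{
	assert(pm0_off < PMAC_OFFSET);
	return port_rd(p, pm0_off);
}
```

Routing every MAC register write through one helper is what removes the inconsistency the commit message mentions, where one register was mirrored to PM1 and another (ENETC_PM0_MAXFRM) was not.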

1155066 Commits
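
Commit 057fb03160 above names two ways to avoid the shared zero page in a freshly mapped send buffer: force a copy-on-write over the whole range, or pass MAP_POPULATE to mmap(). The second option might look roughly like the sketch below. It is not the selftest's actual code: `alloc_tx_buffer()` and `free_tx_buffer()` are hypothetical helpers, and the sketch assumes a Linux system where `MAP_POPULATE` is available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map an anonymous, private, writable buffer. MAP_POPULATE asks the kernel
 * to prefault the range at mmap() time, so later reads are not served by a
 * single zero page mapped over and over (128 times for 512KB of 4KB pages).
 * Returns NULL on failure. Hypothetical helper, not from the selftest.
 */
static void *alloc_tx_buffer(size_t len)
{
	void *buf;

	assert(len > 0);
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
	return buf == MAP_FAILED ? NULL : buf;
}

/* Release a buffer obtained from alloc_tx_buffer(). */
static void free_tx_buffer(void *buf, size_t len)
{
	if (buf)
		munmap(buf, len);
}
```

A caller would request the 512KB zone mentioned in the commit message with `alloc_tx_buffer(512 * 1024)` and release it with `free_tx_buffer()` once sending is done. Prefaulting costs time at setup, which is the trade-off for keeping the zero page's refcount out of the hot path.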