linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-18 09:44:18 +08:00

Author	SHA1	Message	Date
Leon Romanovsky	82465bec3e	devlink: Delete reload enable/disable interface Commit `a0c76345e3` ("devlink: disallow reload operation during device cleanup") added devlink_reload_{enable,disable}() APIs to prevent reload operation from racing with device probe/dismantle. After recent changes to move devlink_register() to the end of device probe and devlink_unregister() to the beginning of device dismantle, these races can no longer happen. Reload operations will be denied if the devlink instance is unregistered and devlink_unregister() will block until all in-flight operations are done. Therefore, remove these devlink_reload_{enable,disable}() APIs. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	bd032e35c5	devlink: Allow control devlink ops behavior through feature mask Introduce new devlink call to set feature mask to control devlink behavior during device initialization phase after devlink_alloc() is already called. This allows us to set reload ops based on device property which is not known at the beginning of driver initialization. For the sake of simplicity, this API lacks any type of locking and needs to be called before devlink_register() to make sure that no parallel access to the ops is possible at this stage. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	71c1b52593	netdevsim: Move devlink registration to be last devlink command This change prevents from users to access device before devlink is fully configured. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-27 16:31:59 +01:00
Leon Romanovsky	db4278c55f	devlink: Make devlink_register to be void devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-22 14:15:12 +01:00
Leon Romanovsky	6db9350a9d	devlink: Delete not-used devlink APIs Devlink core exported generously the functions calls that were used by netdevsim tests or not used at all. Delete such APIs with one exception - devlink_alloc_ns(). That function should be spared from deleting because it is a special form of devlink_alloc() needed for the netdevsim. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-17 14:19:39 +01:00
Jakub Kicinski	2e367522ce	netdevsim: add ability to change channel count For testing visibility of mq/mqprio default children. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-09-15 15:46:02 +01:00
Linus Torvalds	c6c3c5704b	Driver core update for 5.15-rc1 Here is the big set of driver core patches for 5.15-rc1. These do change a number of different things across different subsystems, and because of that, there were 2 stable tags created that might have already come into your tree from different pulls that did the following - changed the bus remove callback to return void - sysfs iomem_get_mapping rework The latter one will cause a tiny merge issue with your tree, as there was a last-minute fix for this in 5.14 in your tree, but the fixup should be "obvious". If you want me to provide a fixed merge for this, please let me know. Other than those two things, there's only a few small things in here: - kernfs performance improvements for huge numbers of sysfs users at once - tiny api cleanups - other minor changes All of these have been in linux-next for a while with no reported problems, other than the before-mentioned merge issue. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYS+FLQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ylXuACfWECnysDtXNe66DdETCFs1a1RToYAoMokWeU5 s8VFP1NY2BjmxJbkebLL =8kVu -----END PGP SIGNATURE----- Merge tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core patches for 5.15-rc1. These do change a number of different things across different subsystems, and because of that, there were 2 stable tags created that might have already come into your tree from different pulls that did the following - changed the bus remove callback to return void - sysfs iomem_get_mapping rework Other than those two things, there's only a few small things in here: - kernfs performance improvements for huge numbers of sysfs users at once - tiny api cleanups - other minor changes All of these have been in linux-next for a while with no reported problems, other than the before-mentioned merge issue" * tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits) MAINTAINERS: Add dri-devel for component.[hc] driver core: platform: Remove platform_device_add_properties() ARM: tegra: paz00: Handle device properties with software node API bitmap: extend comment to bitmap_print_bitmask/list_to_buf drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI topology: use bin_attribute to break the size limitation of cpumap ABI lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list sysfs: Rename struct bin_attribute member to f_mapping sysfs: Invoke iomem_get_mapping() from the sysfs open callback debugfs: Return error during {full/open}_proxy_open() on rmmod zorro: Drop useless (and hardly used) .driver member in struct zorro_dev zorro: Simplify remove callback sh: superhyway: Simplify check in remove callback nubus: Simplify check in remove callback nubus: Make struct nubus_driver::remove return void kernfs: dont call d_splice_alias() under kernfs node lock kernfs: use i_lock to protect concurrent inode updates kernfs: switch kernfs to use an rwsem kernfs: use VFS negative dentry caching ...	2021-09-01 08:44:42 -07:00
Yufeng Mo	f3ccfda193	ethtool: extend coalesce setting uAPI with CQE mode In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-24 07:38:29 -07:00
Leon Romanovsky	919d13a7e4	devlink: Set device as early as possible All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-09 10:21:40 +01:00
Leon Romanovsky	5c0418ed16	netdevsim: Protect both reload_down and reload_up paths Don't progress with adding and deleting ports as long as devlink reload is running. Fixes: `23809a726c` ("netdevsim: Forbid devlink reload when adding or deleting ports") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 10:36:36 +01:00
Leon Romanovsky	23809a726c	netdevsim: Forbid devlink reload when adding or deleting ports In order to remove complexity in devlink core related to devlink_reload_enable/disable, let's rewrite new_port/del_port logic to rely on internal to netdevsim lcok. We should protect only reload_down flow because it destroys nsim_dev, which is needed for nsim_dev_port_add/nsim_dev_port_del to hold port_list_lock. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 13:31:24 +01:00
Colin Ian King	f36c82ac1b	netdevsim: make array res_ids static const, makes object smaller Don't populate the array res_ids on the stack but instead it static const. Makes the object code smaller by 14 bytes. Before: text data bss dec hex filename 50833 8314 256 59403 e80b ./drivers/net/netdevsim/fib.o After: text data bss dec hex filename 50755 8378 256 59389 e7fd ./drivers/net/netdevsim/fib.o (gcc version 10.2.0) Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210801065328.138906-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-02 09:12:24 -07:00
Leon Romanovsky	2671345504	devlink: Allocate devlink directly in requested net namespace There is no need in extra call indirection and check from impossible flow where someone tries to set namespace without prior call to devlink_alloc(). Instead of this extra logic and additional EXPORT_SYMBOL, use specialized devlink allocation function that receives net namespace as an argument. Such specialized API allows clear view when devlink initialized in wrong net namespace and/or kernel users don't try to change devlink namespace under the hood. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-07-30 13:16:38 -07:00
Uwe Kleine-König	fc7a6209d5	bus: Make remove callback return void The driver core ignores the return value of this callback because there is only little it can do when a device disappears. This is the final bit of a long lasting cleanup quest where several buses were converted to also return void from their remove callback. Additionally some resource leaks were fixed that were caused by drivers returning an error code in the expectation that the driver won't go away. With struct bus_type::remove returning void it's prevented that newly implemented buses return an ignored error code and so don't anticipate wrong expectations for driver authors. Reviewed-by: Tom Rix <trix@redhat.com> (For fpga) Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio) Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts) Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb) Acked-by: Pali Rohár <pali@kernel.org> Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media) Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform) Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-By: Vinod Koul <vkoul@kernel.org> Acked-by: Juergen Gross <jgross@suse.com> (For xen) Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd) Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb) Acked-by: Johan Hovold <johan@kernel.org> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus) Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio) Acked-by: Maximilian Luz <luzmaximilian@gmail.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec) Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack) Acked-by: Geoff Levand <geoff@infradead.org> (For ps3) Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt) Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th) Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia) Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI) Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr) Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid) Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM) Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa) Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire) Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid) Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox) Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss) Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC) Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-21 11:53:42 +02:00
Peilin Ye	d4861fc6be	netdevsim: Add multi-queue support Currently netdevsim only supports a single queue per port, which is insufficient for testing multi-queue TC schedulers e.g. sch_mq. Extend the current sysfs interface so that users can create ports with multiple queues: $ echo "[ID] [PORT_COUNT] [NUM_QUEUES]" > /sys/bus/netdevsim/new_device As an example, echoing "2 4 8" creates 4 ports, with 8 queues per port. Note, this is compatible with the current interface, with default number of queues set to 1. For example, echoing "2 4" creates 4 ports with 1 queue per port; echoing "2" simply creates 1 port with 1 queue. Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-16 11:17:56 -07:00
Taehee Yoo	09adf7566d	net: netdevsim: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops There are two pointers in struct xfrm_state_offload, dev, real_dev. These are used in callback functions of struct xfrmdev_ops. The *dev points whether bonding interface or real interface. If bonding ipsec offload is used, it points bonding interface If not, it points real interface. And real_dev always points real interface. So, netdevsim should always use real_dev instead of dev. Of course, real_dev always not be null. Test commands: ip netns add A ip netns exec A bash modprobe netdevsim echo "1 1" > /sys/bus/netdevsim/new_device ip link add bond0 type bond mode active-backup ip link set eth0 master bond0 ip link set eth0 up ip link set bond0 up ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \ transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \ 0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \ dst 14.0.0.70/24 proto tcp offload dev bond0 dir in Splat looks like: BUG: spinlock bad magic on CPU#5, kworker/5:1/53 lock: 0xffff8881068c2cc8, .magic: 11121314, .owner: <none>/-1, .owner_cpu: -235736076 CPU: 5 PID: 53 Comm: kworker/5:1 Not tainted 5.13.0-rc3+ #1168 Workqueue: events linkwatch_event Call Trace: dump_stack+0xa4/0xe5 do_raw_spin_lock+0x20b/0x270 ? rwlock_bug.part.1+0x90/0x90 _raw_spin_lock_nested+0x5f/0x70 bond_get_stats+0xe4/0x4c0 [bonding] ? rcu_read_lock_sched_held+0xc0/0xc0 ? bond_neigh_init+0x2c0/0x2c0 [bonding] ? dev_get_alias+0xe2/0x190 ? dev_get_port_parent_id+0x14a/0x360 ? rtnl_unregister+0x190/0x190 ? dev_get_phys_port_name+0xa0/0xa0 ? memset+0x1f/0x40 ? memcpy+0x38/0x60 ? rtnl_phys_switch_id_fill+0x91/0x100 dev_get_stats+0x8c/0x270 rtnl_fill_stats+0x44/0xbe0 ? nla_put+0xbe/0x140 rtnl_fill_ifinfo+0x1054/0x3ad0 [ ... ] Fixes: `272c2330ad` ("xfrm: bail early on slave pass over skb") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-06 10:36:59 -07:00
Oleksandr Mazur	275b51c27c	drivers: net: netdevsim: fix devlink_trap selftests failing devlink_trap tests for the netdevsim fail due to misspelled debugfs file name. Change this name, as well as name of callback function, to match the naming as in the devlink itself - 'trap_drop_counter'. Test-results: selftests: drivers/net/netdevsim: devlink_trap.sh TEST: Initialization [ OK ] TEST: Trap action [ OK ] TEST: Trap metadata [ OK ] TEST: Non-existing trap [ OK ] TEST: Non-existing trap action [ OK ] TEST: Trap statistics [ OK ] TEST: Trap group action [ OK ] TEST: Non-existing trap group [ OK ] TEST: Trap group statistics [ OK ] TEST: Trap policer [ OK ] TEST: Trap policer binding [ OK ] TEST: Port delete [ OK ] TEST: Device delete [ OK ] Fixes: `a7b3527a43` ("drivers: net: netdevsim: add devlink trap_drop_counter_get implementation") Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-18 11:28:27 -07:00
Oleksandr Mazur	a7b3527a43	drivers: net: netdevsim: add devlink trap_drop_counter_get implementation Whenever query statistics is issued for trap with DROP action, devlink subsystem would also fill-in statistics 'dropped' field. In case if device driver did't register callback for hard drop statistics querying, 'dropped' field will be omitted and not filled. Add trap_drop_counter_get callback implementation to the netdevsim. Add new test cases for netdevsim, to test both the callback functionality, as well as drop statistics alteration check. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-14 13:04:25 -07:00
Dan Carpenter	4e744cb812	netdevsim: delete unnecessary debugfs checking In normal situations where the driver doesn't dereference "nsim_node->ddir" or "nsim_node->rate_parent" itself then we are not supposed to check the return from debugfs functions. In the case of debugfs_create_dir() the check was wrong as well because it doesn't return NULL, it returns error pointers. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-09 14:16:39 -07:00
Colin Ian King	ebbf5fcb94	netdevsim: Fix unsigned being compared to less than zero The comparison of len < 0 is always false because len is a size_t. Fix this by making len a ssize_t instead. Addresses-Coverity: ("Unsigned compared against 0") Fixes: `d395381909` ("netdevsim: Add max_vfs to bus_dev") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-03 15:33:17 -07:00
Dmytro Linkin	f3d101b485	netdevsim: Allow setting parent node of rate objects Implement new devlink ops that allow setting rate node as a parent for devlink port (leaf) or another devlink node through devlink API. Expose parent names to netdevsim debugfs in read only mode. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:37 -07:00
Dmytro Linkin	885226f568	netdevsim: Implement support for devlink rate nodes Implement new devlink ops that allow creation, deletion and setting of shared/max tx rate of devlink rate nodes through devlink API. Expose rate node and it's tx rates to netdevsim debugfs. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:37 -07:00
Dmytro Linkin	605c4f8f19	netdevsim: Implement devlink rate leafs tx rate support Implement new devlink ops that allow shared and max tx rate control for devlink port rate objects (leafs) through devlink API. Expose rate values of VF ports to netdevsim debugfs. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:37 -07:00
Dmytro Linkin	885dfe121b	netdevsim: Register devlink rate leaf objects per VF Register devlink rate leaf objects per VF. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Dmytro Linkin	160dc373ee	netdevsim: Implement legacy/switchdev mode for VFs Implement callbacks to set/get eswitch mode value. Add helpers to check current mode. Instantiate VFs' net devices and devlink ports on switchdev enabling and remove them on legacy enabling. Changing number of VFs while in switchdev mode triggers VFs creation/deletion. Also disable NDO API callback to set VF rate, since it's legacy API. Switchdev API to set VF rate will be implemented in one of the next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Dmytro Linkin	92ba1f29e6	netdevsim: Implement VFs Allow creation of netdevsim ports for VFs along with allocations of corresponding net devices and devlink ports. Add enums and helpers to distinguish PFs' ports from VFs' ports. Ports creation/deletion debugfs API intended to be used with physical ports only. VFs instantiation will be done in one of the next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Dmytro Linkin	814b9ce65e	netdevsim: Implement port types and indexing Define type of ports, which netdevsim driver currently operates with as PF. Define new port type - VF, which will be implemented in following patches. Add helper functions to distinguish them. Add helper function to get VF index from port index. Add port indexing logic where PFs' indexes starts from 0, VFs' - from NSIM_DEV_VF_PORT_INDEX_BASE. All ports uses same index pool, which means that PF port may be created with index from VFs' indexes range. Maximum number of VFs, which the driver can allocate, is limited by UINT_MAX - BASE. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Dmytro Linkin	32ac15d8fd	netdevsim: Disable VFs on nsim_dev_reload_destroy() call Move VFs disabling from device release() to nsim_dev_reload_destroy() to make VFs disabling and ports removal simultaneous. This is a requirement for VFs ports implemented in next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Dmytro Linkin	d395381909	netdevsim: Add max_vfs to bus_dev Currently there is no limit to the number of VFs netdevsim can enable. In a real systems this value exist and used by the driver. Fore example, some features might need to consider this value when allocating memory. Expose max_vfs variable to debugfs as configurable resource. If are VFs configured (num_vfs != 0) then changing of max_vfs not allowed. Co-developed-by: Yuval Avnery <yuvalav@nvidia.com> Signed-off-by: Yuval Avnery <yuvalav@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-02 14:08:36 -07:00
Ido Schimmel	a9b5d871ab	netdevsim: Only use sampling truncation length when valid When the sampling truncation length is invalid (zero), pass the length of the packet. Without the fix, no payload is reported to user space when the truncation length is zero. Fixes: `a8700c3dd0` ("netdevsim: Add dummy psample implementation") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-22 13:23:56 -07:00
Qiheng Lin	be107538c5	netdevsim: remove unneeded semicolon Eliminate the following coccicheck warning: drivers/net/netdevsim/fib.c:569:2-3: Unneeded semicolon Signed-off-by: Qiheng Lin <linqiheng@huawei.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-06 16:27:33 -07:00
Jakub Kicinski	0d7f76dc11	netdevsim: add FEC settings support Add support for ethtool FEC and some ethtool error injection. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:15:23 -07:00
Wei Yongjun	20fd4f421c	netdevsim: switch to memdup_user_nul() Use memdup_user_nul() helper instead of open-coding to simplify the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 16:26:58 -07:00
Ido Schimmel	a8700c3dd0	netdevsim: Add dummy psample implementation Allow netdevsim to report "sampled" packets to the psample module by periodically generating packets from a work queue. The behavior can be enabled / disabled (default) and the various meta data attributes can be controlled via debugfs knobs. This implementation enables both testing of the psample module with all the optional attributes as well as development of user space applications on top of psample such as hsflowd and a Wireshark dissector for psample generic netlink packets. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-14 15:00:43 -07:00
Ido Schimmel	c6385c0b67	netdevsim: Allow reporting activity on nexthop buckets A key component of the resilient hashing algorithm is the hash buckets' activity. If a bucket is active, it will not be populated with a new nexthop in order not to break existing flows. Therefore, in order to easily and thoroughly test the algorithm, we need to be in full control over the reported activity. Add a debugfs interface that allows user space to have netdevsim report a nexthop bucket within a resilient nexthop group as active. For example: # echo 10 23 > /sys/kernel/debug/netdevsim/netdevsim10/fib/nexthop_bucket_activity Will mark bucket 23 in nexthop group 10 as active. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	d8eaa4faca	netdevsim: Add support for resilient nexthop groups Allow resilient nexthop groups to be programmed and account their occupancy according to their number of buckets. The nexthop group itself as well as its buckets are marked with hardware flags (i.e., 'RTNH_F_TRAP'). Replacement of a single nexthop bucket can fail using the following debugfs knob: # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace N # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace Y Replacement of a resilient nexthop group can fail using the following debugfs knob: # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace N # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace Y This enables testing of various error paths. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Ido Schimmel	40ff83711f	netdevsim: Create a helper for setting nexthop hardware flags Instead of calling nexthop_set_hw_flags(), call a helper. It will be used to also set nexthop bucket flags in a subsequent patch. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Petr Machata	86927c9c4d	netdevsim: fib: Introduce a lock to guard nexthop hashtable Currently netdevsim relies on RTNL to maintain exclusivity in accessing the nexthop hash table. However, bucket notification may be called without RTNL having been held. Instead, introduce a custom lock to guard the table. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-12 17:44:10 -08:00
Jiapeng Chong	c53d21af67	netdevsim: fib: Remove redundant code Fix the following coccicheck warnings: ./drivers/net/netdevsim/fib.c:874:5-8: Unneeded variable: "err". Return "0" on line 889. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 14:32:48 -08:00
Hillf Danton	863a42b289	netdevsim: init u64 stats for 32bit hardware Init the u64 stats in order to avoid the lockdep prints on the 32bit hardware like INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 4695 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<826fc5b8>] (dump_backtrace) from [<826fc82c>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) [<826fc814>] (show_stack) from [<8270d1f8>] (__dump_stack lib/dump_stack.c:79 [inline]) [<826fc814>] (show_stack) from [<8270d1f8>] (dump_stack+0xa8/0xc8 lib/dump_stack.c:120) [<8270d150>] (dump_stack) from [<802bf9c0>] (assign_lock_key kernel/locking/lockdep.c:935 [inline]) [<8270d150>] (dump_stack) from [<802bf9c0>] (register_lock_class+0xabc/0xb68 kernel/locking/lockdep.c:1247) [<802bef04>] (register_lock_class) from [<802baa2c>] (__lock_acquire+0x84/0x32d4 kernel/locking/lockdep.c:4711) [<802ba9a8>] (__lock_acquire) from [<802be840>] (lock_acquire.part.0+0xf0/0x554 kernel/locking/lockdep.c:5442) [<802be750>] (lock_acquire.part.0) from [<802bed10>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5415) [<802beca4>] (lock_acquire) from [<81560548>] (seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (__u64_stats_fetch_begin include/linux/u64_stats_sync.h:164 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (u64_stats_fetch_begin include/linux/u64_stats_sync.h:175 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (nsim_get_stats64+0xdc/0xf0 drivers/net/netdevsim/netdev.c:70) [<8156046c>] (nsim_get_stats64) from [<81e2efa0>] (dev_get_stats+0x44/0xd0 net/core/dev.c:10405) [<81e2ef5c>] (dev_get_stats) from [<81e53204>] (rtnl_fill_stats+0x38/0x120 net/core/rtnetlink.c:1211) [<81e531cc>] (rtnl_fill_stats) from [<81e59d58>] (rtnl_fill_ifinfo+0x6d4/0x148c net/core/rtnetlink.c:1783) [<81e59684>] (rtnl_fill_ifinfo) from [<81e5ceb4>] (rtmsg_ifinfo_build_skb+0x9c/0x108 net/core/rtnetlink.c:3798) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3830 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3821 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo+0x44/0x70 net/core/rtnetlink.c:3839) [<81e5d068>] (rtmsg_ifinfo) from [<81e45c2c>] (register_netdevice+0x664/0x68c net/core/dev.c:10103) [<81e455c8>] (register_netdevice) from [<815608bc>] (nsim_create+0xf8/0x124 drivers/net/netdevsim/netdev.c:317) [<815607c4>] (nsim_create) from [<81561184>] (__nsim_dev_port_add+0x108/0x188 drivers/net/netdevsim/dev.c:941) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_port_add_all drivers/net/netdevsim/dev.c:990 [inline]) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_probe+0x5cc/0x750 drivers/net/netdevsim/dev.c:1119) [<81561b0c>] (nsim_dev_probe) from [<815661dc>] (nsim_bus_probe+0x10/0x14 drivers/net/netdevsim/bus.c:287) [<815661cc>] (nsim_bus_probe) from [<811724c0>] (really_probe+0x100/0x50c drivers/base/dd.c:554) [<811723c0>] (really_probe) from [<811729c4>] (driver_probe_device+0xf8/0x1c8 drivers/base/dd.c:740) [<811728cc>] (driver_probe_device) from [<81172fe4>] (__device_attach_driver+0x8c/0xf0 drivers/base/dd.c:846) [<81172f58>] (__device_attach_driver) from [<8116fee0>] (bus_for_each_drv+0x88/0xd8 drivers/base/bus.c:431) [<8116fe58>] (bus_for_each_drv) from [<81172c6c>] (__device_attach+0xdc/0x1d0 drivers/base/dd.c:914) [<81172b90>] (__device_attach) from [<8117305c>] (device_initial_probe+0x14/0x18 drivers/base/dd.c:961) [<81173048>] (device_initial_probe) from [<81171358>] (bus_probe_device+0x90/0x98 drivers/base/bus.c:491) [<811712c8>] (bus_probe_device) from [<8116e77c>] (device_add+0x320/0x824 drivers/base/core.c:3109) [<8116e45c>] (device_add) from [<8116ec9c>] (device_register+0x1c/0x20 drivers/base/core.c:3182) [<8116ec80>] (device_register) from [<81566710>] (nsim_bus_dev_new drivers/net/netdevsim/bus.c:336 [inline]) [<8116ec80>] (device_register) from [<81566710>] (new_device_store+0x178/0x208 drivers/net/netdevsim/bus.c:215) [<81566598>] (new_device_store) from [<8116fcb4>] (bus_attr_store+0x2c/0x38 drivers/base/bus.c:122) [<8116fc88>] (bus_attr_store) from [<805b4b8c>] (sysfs_kf_write+0x48/0x54 fs/sysfs/file.c:139) [<805b4b44>] (sysfs_kf_write) from [<805b3c90>] (kernfs_fop_write_iter+0x128/0x1ec fs/kernfs/file.c:296) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (call_write_iter include/linux/fs.h:1901 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (new_sync_write fs/read_write.c:518 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (vfs_write+0x3dc/0x57c fs/read_write.c:605) [<804d1f20>] (vfs_write) from [<804d2604>] (ksys_write+0x68/0xec fs/read_write.c:658) [<804d259c>] (ksys_write) from [<804d2698>] (__do_sys_write fs/read_write.c:670 [inline]) [<804d259c>] (ksys_write) from [<804d2698>] (sys_write+0x10/0x14 fs/read_write.c:667) [<804d2688>] (sys_write) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64) Fixes: `83c9e13aa3` ("netdevsim: add software driver for testing offloads") Reported-by: syzbot+e74a6857f2d0efe3ad81@syzkaller.appspotmail.com Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:36:26 -08:00
Amit Cohen	134c753242	netdevsim: fib: Add debugfs to debug route offload failure Add "fail_route_offload" flag to disallow offloading routes. It is needed to test "offload failed" notifications. Create the flag as part of nsim_fib_create() under fib directory and set it to false by default. When FIB_EVENT_ENTRY_{REPLACE, APPEND} are triggered and "fail_route_offload" value is true, set the appropriate hardware flag to make the kernel emit RTM_NEWROUTE notification with RTM_F_OFFLOAD_FAILED flag. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-08 16:47:03 -08:00
Ido Schimmel	f57ab5b75f	netdevsim: dev: Initialize FIB module after debugfs Initialize the dummy FIB offload module after debugfs, so that the FIB module could create its own directory there. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-08 16:47:03 -08:00
Amit Cohen	484a4dfb75	netdevsim: fib: Do not warn if route was not found for several events The next patch will add the ability to fail route offload controlled by debugfs variable called "fail_route_offload". If we vetoed the addition, we might get a delete or append notification for a route we do not have. Therefore, do not warn if route was not found. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-08 16:47:03 -08:00
Amit Cohen	0c5fcf9e24	IPv6: Add "offload failed" indication to routes After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. To avoid such cases, previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, this behavior is controlled by sysctl. With the above mentioned behavior, it is possible to know from user-space if the route was offloaded, but if the offload fails there is no indication to user-space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come. This patch adds an "offload_failed" indication to IPv6 routes, so that users will have better visibility into the offload process. 'struct fib6_info' is extended with new field that indicates if route offload failed. Note that the new field is added using unused bit and therefore there is no need to increase struct size. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-08 16:47:03 -08:00
Amit Cohen	36c5100e85	IPv4: Add "offload failed" indication to routes After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware. The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware. To avoid such cases, previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, this behavior is controlled by sysctl. With the above mentioned behavior, it is possible to know from user-space if the route was offloaded, but if the offload fails there is no indication to user-space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come. This patch adds an "offload_failed" indication to IPv4 routes, so that users will have better visibility into the offload process. 'struct fib_alias', and 'struct fib_rt_info' are extended with new field that indicates if route offload failed. Note that the new field is added using unused bit and therefore there is no need to increase structs size. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-08 16:47:03 -08:00
Amit Cohen	efc42879ec	net: Do not call fib6_info_hw_flags_set() when IPv6 is disabled With the next patch mlxsw and netdevsim will fail in compilation if CONFIG_IPV6 is disabled. Do not call fib6_info_hw_flags_set() when IPv6 is disabled. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	fbaca8f895	net: Pass 'net' struct as first argument to fib6_info_hw_flags_set() The next patch will emit notification when hardware flags are changed, in case that fib_notify_on_flag_change sysctl is set to 1. To know sysctl values, net struct is needed. This change is consistent with the IPv4 version, which gets 'net' struct as its first argument. Currently, the only callers of this function are mlxsw and netdevsim. Patch the callers to pass net. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:59 -08:00
Amit Cohen	0ae3eb7b46	netdevsim: fib: Perform the route programming in a non-atomic context Currently, netdevsim implements dummy FIB offload and marks notified routes with RTM_F_TRAP flag. netdevsim does not defer route notifications to a work queue because it does not need to program any hardware. Given that netdevsim's purpose is to both give an example implementation and allow developers to test their code, align netdevsim to a "real" hardware device driver like mlxsw and have it also perform the route "programming" in a non-atomic context. It will be used to test route flags notifications which will be added in the next patches. The following changes are needed when route handling is performed in WQ: - Handle the accounting in the main context, to be able to return an error for adding route when all the routes are used. For FIB_EVENT_ENTRY_REPLACE increase the counter before scheduling the delayed work, and in case that this event replaces an existing route, decrease the counter as part of the delayed work. - For IPv6, cannot use fen6_info->rt->fib6_siblings list because it might be changed during handling the delayed work. Save an array with the nexthops as part of fib6_event struct, and take a reference for each nexthop to prevent them from being freed while event is queued. - Change GFP_ATOMIC allocations to GFP_KERNEL. - Use single work item that is handling a list of ordered routes. Handling routes must be processed in the order they were submitted to avoid logical errors that could lead to unexpected failures. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Amit Cohen	9e635a21ca	netdevsim: fib: Convert the current occupancy to an atomic variable When route is added/deleted, the appropriate counter is increased/decreased to maintain number of routes. User can limit the number of routes and then according to the appropriate counter, adding more routes than the limitation is forbidden. Currently, there is one lock which protects hashtable, list and accounting. Handling the counters will be performed from both atomic context and non-atomic context, while the hashtable and the list will be used only from non-atomic context and therefore will be protected by a separate lock. Protect accounting by using an atomic variable, so lock is not needed. v2: * Use atomic64_sub() in nsim_nexthop_account()'s error path Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-02 17:45:58 -08:00
Ido Schimmel	09ad6becf5	nexthop: Use enum to encode notification type Currently there are only two types of in-kernel nexthop notification. The two are distinguished by the 'is_grp' boolean field in 'struct nh_notifier_info'. As more notification types are introduced for more next-hop group types, a boolean is not an easily extensible interface. Instead, convert it to an enum. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:52 -08:00

1 2 3 4 5

223 Commits