linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-10 22:54:11 +08:00

Author	SHA1	Message	Date
Linus Torvalds	016decc0d8	ACPI fixes for 5.11-rc6 - Modify the ACPI thermal driver to avoid evaluating _TMP directly in its Notify () handler callback and running too many thermal checks for one thermal zone at the same time so as to address a work item accumulation issue observed on some systems that fail to shut down as a result of it (Rafael Wysocki). - Modify the ACPI uevent file creation code to avoid putting multiple "MODALIAS=" entries in one uevent file in sysfs which breaks systemd-udevd (Kai-Heng Feng). Merge tag 'acpi-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix the handling of notifications in the ACPI thermal driver and address a device enumeration issue leading to the presence of multiple 'MODALIAS=' entries in one uevent file in sysfs in some cases. 
Specifics: - Modify the ACPI thermal driver to avoid evaluating _TMP directly in its Notify () handler callback and running too many thermal checks for one thermal zone at the same time so as to address a work item accumulation issue observed on some systems that fail to shut down as a result of it (Rafael Wysocki) - Modify the ACPI uevent file creation code to avoid putting multiple 'MODALIAS=' entries in one uevent file in sysfs which breaks systemd-udevd (Kai-Heng Feng)" * tag 'acpi-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: thermal: Do not call acpi_thermal_check() directly ACPI: sysfs: Prefer "compatible" modalias	2021-01-29 13:23:21 -08:00
Linus Torvalds	6305d15e01	drm fixes for 5.11-rc6 nouveau: - fix svm init conditions - fix nv50 modesetting regression - fix cursor plane modifiers - fix > 64x64 cursor regression vc4: - Fix LBM size calculation - Fix high resolutions for hvs5 i915: - Fix ICL MG PHY vswing - Fix subplatform handling - Fix selftest memleak - Clear CACHE_MODE prior to clearing residuals - Always flush the active worker before returning from the wait - Always try to reserve GGTT address 0x0 amdgpu: - Fix a fan control regression on some boards - Fix clang warning Merge tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Weekly fixes for graphics, nothing too major, nouveau has a few regression fixes for various fallout from header changes previously, vc4 has two fixes, two amdgpu, and a smattering of i915 fixes. All seems on course for a quieter rc7, fingers crossed. 
nouveau: - fix svm init conditions - fix nv50 modesetting regression - fix cursor plane modifiers - fix > 64x64 cursor regression vc4: - Fix LBM size calculation - Fix high resolutions for hvs5 i915: - Fix ICL MG PHY vswing - Fix subplatform handling - Fix selftest memleak - Clear CACHE_MODE prior to clearing residuals - Always flush the active worker before returning from the wait - Always try to reserve GGTT address 0x0 amdgpu: - Fix a fan control regression on some boards - Fix clang warning" * tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm: drm/nouveau/kms/gk104-gp1xx: Fix > 64x64 cursors drm/nouveau/kms/nv50-: Report max cursor size to userspace drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices drm/nouveau/dispnv50: Restore pushing of all data. amdgpu: fix clang build warning Revert "drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)" drm/i915/gt: Always try to reserve GGTT address 0x0 drm/i915: Always flush the active worker before returning from the wait drm/i915/selftest: Fix potential memory leak drm/i915: Check for all subplatform bits drm/i915: Fix ICL MG PHY vswing handling drm/i915/gt: Clear CACHE_MODE prior to clearing residuals drm/vc4: Correct POS1_SCL for hvs5 drm/vc4: Correct lbm size and calculation drm/nouveau/nvif: fix method count when pushing an array	2021-01-29 13:18:23 -08:00
Linus Torvalds	a9cbbb80e3	tty: avoid using vfs_iocb_iter_write() for redirected console writes It turns out that the vfs_iocb_iter_{read,write}() functions are entirely broken, and don't actually use the passed-in file pointer for IO - only for the preparatory work (permission checking and for the write_iter function lookup). That worked fine for overlayfs, which always builds the new iocb with the same file pointer that it passes in, but in the general case it ends up doing nonsensical things (and could cause an iterator call that doesn't even match the passed-in file pointer). This subtly broke the tty conversion to write_iter in commit `9bb48c82ac` ("tty: implement write_iter"), because the console redirection didn't actually end up redirecting anything, since the passed-in file pointer was basically ignored, and the actual write was done with the original non-redirected console tty after all. The main visible effect of this is that the console messages were no longer logged to /var/log/boot.log during graphical boot. Fix the issue by simply not using the vfs write "helper" function at all, and just redirecting the write entirely internally to the tty layer. Do the target writability permission checks when actually registering the target tty with TIOCCONS instead of at write time. Fixes: `9bb48c82ac` ("tty: implement write_iter") Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-01-29 13:12:17 -08:00
Rafael J. Wysocki	b584b7e963	Merge branch 'acpi-sysfs' * acpi-sysfs: ACPI: sysfs: Prefer "compatible" modalias	2021-01-29 16:28:48 +01:00
Damien Le Moal	cd92cdb9c8	null_blk: cleanup zoned mode initialization To avoid potential compilation problems, replace the badly written MB_TO_SECTS() macro (missing parentheses around the argument use) with the inline function mb_to_sects(). While at it, simplify the calculation of the total number of zones of the device using the round_up() macro. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-01-29 07:49:22 -07:00
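The macro problem described in the commit above can be illustrated with a minimal userspace sketch. The macro and function bodies below are assumptions made for illustration (the commit message does not quote the original definitions), and `MB_TO_SECTS_UNSAFE` and `demo_mb_to_sects` are hypothetical names. The point is only that an unparenthesized macro argument can be split apart by operator precedence, whereas an inline function always receives the argument as one value.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9            /* 512-byte sectors */
#define SZ_1M        0x100000ULL  /* bytes per MiB */

/* Hypothetical macro with the argument left unparenthesized. */
#define MB_TO_SECTS_UNSAFE(mb) (((uint64_t)mb * SZ_1M) >> SECTOR_SHIFT)

/* Inline function: the argument is evaluated first, as a whole expression. */
static inline uint64_t mb_to_sects(unsigned long mb)
{
	return ((uint64_t)mb * SZ_1M) >> SECTOR_SHIFT;
}

/* Shows the difference for an argument that contains "?:". */
static void demo_mb_to_sects(int large)
{
	/* Expands to (((uint64_t)large ? 2 : 1 * SZ_1M) >> SECTOR_SHIFT). */
	uint64_t wrong = MB_TO_SECTS_UNSAFE(large ? 2 : 1);
	uint64_t right = mb_to_sects(large ? 2 : 1);

	if (large) {
		assert(wrong == 0);	/* 2 >> 9, the multiplication was skipped */
		assert(right == 4096);	/* 2 MiB in 512-byte sectors */
	}
}
```

With a plain argument such as `3` both forms agree; the difference only appears once the argument is itself an expression with a low-precedence operator.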
Miguel Ojeda	1074f8ec28	clang-format: Update with the latest for_each macro list Re-run the shell fragment that generated the original list. Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2021-01-29 15:00:23 +01:00
Marc Kleine-Budde	cf8ee6de25	can: mcp251xfd: mcp251xfd_probe(): use dev_err_probe() to simplify error handling dev_err_probe() can reduce code size, unify error handling, and record the reason for a deferred probe, so use it to simplify the code. Link: https://lore.kernel.org/r/20210128104644.2982125-9-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:58 +01:00
Marc Kleine-Budde	dfe99ba29e	can: mcp251xfd: mcp251xfd_chip_clock_enable(): simplify return This patch simplifies the return of the mcp251xfd_chip_clock_enable() function by directly returning the error. Link: https://lore.kernel.org/r/20210128104644.2982125-8-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:58 +01:00
Marc Kleine-Budde	49ffacbc4c	can: mcp251xfd: add missing _MASK postfix to MCP251XFD_OBJ_FLAGS_DLC As MCP251XFD_OBJ_FLAGS_DLC is a mask, add the missing _MASK postfix, that all other masks in the driver have. Link: https://lore.kernel.org/r/20210128104644.2982125-7-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:57 +01:00
Marc Kleine-Budde	f93486a79a	can: mcp251xfd: unify error messages and comments This patch unifies the error messages: - have a "." at the end of each message - write controller with a small "c", if not the first word of an error message. Link: https://lore.kernel.org/r/20210128104644.2982125-6-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:57 +01:00
Marc Kleine-Budde	9f1fbc1c9c	can: mcp251xfd: mcp251xfd_probe(): add imx6 to errata table This patch adds an imx6 as known good to the errata table. Link: https://lore.kernel.org/r/20210128104644.2982125-5-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:57 +01:00
Marc Kleine-Budde	01b2a0e5a0	can: mcp251xfd: mcp251xfd_probe(): remove known bad combinations from errata table The published errata specify the maximum allowed SPI frequency to be 85% of (FSYSCLK/2). So there's no need to track known bad clock settings in the driver. As the setup of known good values is a bit tricky, keep them. Link: https://lore.kernel.org/r/20210128104644.2982125-4-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:57 +01:00
Marc Kleine-Budde	b98e68e91c	can: mcp251xfd: mcp251xfd_probe(): sort errata table alphabetically, fix indentation This patch sorts the errata table alphabetically and fixes the indentation. Link: https://lore.kernel.org/r/20210128104644.2982125-3-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:56 +01:00
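The SPI clock limit quoted in the commit above, 85% of (FSYSCLK/2), is simple enough to work through in code. The following is a sketch of the arithmetic only, not the driver's implementation; `max_spi_hz` and `spi_hz_allowed` are hypothetical helper names, and the 40 MHz system clock in the note below is just an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound from the rule quoted above: 85% of (FSYSCLK / 2), in Hz. */
static uint32_t max_spi_hz(uint32_t fsysclk_hz)
{
	/* Widen before multiplying so that large clocks cannot overflow. */
	return (uint32_t)((uint64_t)(fsysclk_hz / 2) * 85 / 100);
}

/* Returns nonzero if the requested SPI clock respects the bound. */
static int spi_hz_allowed(uint32_t fsysclk_hz, uint32_t spi_hz)
{
	assert(fsysclk_hz > 0);
	return spi_hz <= max_spi_hz(fsysclk_hz);
}
```

For an assumed 40 MHz system clock this gives 40 MHz / 2 = 20 MHz, and 85% of that is 17 MHz, so a 20 MHz SPI clock would be rejected while 17 MHz would be accepted.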
Marc Kleine-Budde	28eb119c04	can: mcp251xfd: mcp251xfd_probe(): fix errata reference This patch fixes the reference to the errata for both the mcp2517fd and the mcp2518fd. Fixes: `f5b84dedf7` ("can: mcp25xxfd: mcp25xxfd_probe(): add SPI clk limit related errata information") Link: https://lore.kernel.org/r/20210128104644.2982125-2-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-01-29 09:31:56 +01:00
Bjorn Helgaas	46eb3c108f	octeontx2-af: Fix 'physical' typos Fix misspellings of "physical". Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20210127181359.3008316-1-helgaas@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 21:24:47 -08:00
dingsenjie	1d3f9bb1be	linux/qed: fix spelling typo in qed_chain.h allocted -> allocated Signed-off-by: dingsenjie <dingsenjie@yulong.com> Link: https://lore.kernel.org/r/20210127022801.8028-1-dingsenjie@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 21:24:40 -08:00
Jakub Kicinski	06cc6e5dc6	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-01-29 1) Fix two copy_{from,to}_user() warn_on_once splats for BPF cgroup getsockopt infra when user space is trying to race against optlen, from Loris Reiff. 2) Fix a missing fput() in BPF inode storage map update helper, from Pan Bian. 3) Fix a build error on unresolved symbols on disabled networking / keys LSM hooks, from Mikko Ylinen. 4) Fix preload BPF prog build when the output directory from make points to a relative path, from Quentin Monnet. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, preload: Fix build when $(O) points to a relative path bpf: Drop disabled LSM hooks from the sleepable set bpf, inode_storage: Put file handler if no storage was found bpf, cgroup: Fix problematic bounds check bpf, cgroup: Fix optlen WARN_ON_ONCE toctou ==================== Link: https://lore.kernel.org/r/20210129001556.6648-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 21:07:45 -08:00
Jakub Kicinski	67d25ce891	Merge branch 'nexthop-preparations-for-resilient-next-hop-groups' Petr Machata says: ==================== nexthop: Preparations for resilient next-hop groups At this moment, there is only one type of next-hop group: an mpath group. Mpath groups implement the hash-threshold algorithm, described in RFC 2992[1]. To select a next hop, the hash-threshold algorithm first assigns a range of hashes to each next hop in the group, and then selects the next hop by comparing the SKB hash with the individual ranges. When a next hop is removed from the group, the ranges are recomputed, which leads to reassignment of parts of hash space from one next hop to another. RFC 2992 illustrates it thus:
+-------+-------+-------+-------+-------+
|   1   |   2   |   3   |   4   |   5   |
+-------+-+-----+---+---+-----+-+-------+
|    1    |    2    |    4    |    5    |
+---------+---------+---------+---------+
Before and after deletion of next hop 3 under the hash-threshold algorithm. Note how next hop 2 gave up part of the hash space in favor of next hop 1, and 4 in favor of 5. While there will usually be some overlap between the previous and the new distribution, some traffic flows change the next hop that they resolve to. If a multipath group is used for load-balancing between multiple servers, this hash space reassignment causes an issue that packets from a single flow suddenly end up arriving at a server that does not expect them, which may lead to TCP reset. If a multipath group is used for load-balancing among available paths to the same server, the issue is that different latencies and reordering along the way cause the packets to arrive in the wrong order. Resilient hashing is a technique to address the above problem. A resilient next-hop group has another layer of indirection between the group itself and its constituent next hops: a hash table. 
The selection algorithm uses a straightforward modulo operation to choose a hash bucket, and then reads the next hop that this bucket contains, and forwards traffic there. This indirection brings an important feature. In the hash-threshold algorithm, the range of hashes associated with a next hop must be contiguous. With a hash table, mapping between the hash table buckets and the individual next hops is arbitrary. Therefore, when a next hop is deleted, the buckets that held it are simply reassigned to other next hops:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|1|1|1|2|2|2|2|3|3|3|3|4|4|4|4|5|5|5|5|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                 v v v v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|1|1|1|2|2|2|2|1|2|4|5|4|4|4|4|5|5|5|5|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Before and after deletion of next hop 3 under the resilient hashing algorithm. When weights of next hops in a group are altered, it may be possible to choose a subset of buckets that are currently not used for forwarding traffic, and use those to satisfy the new next-hop distribution demands, keeping the "busy" buckets intact. This way, established flows are ideally kept being forwarded to the same endpoints through the same paths as before the next-hop group change. This patchset prepares the next-hop code for eventual introduction of resilient hashing groups. - Patches #1-#4 carry otherwise disjoint changes that just remove certain assumptions in the next-hop code. - Patches #5-#6 extend the in-kernel next-hop notifiers to support more next-hop group types. - Patches #7-#12 refactor RTNL message handlers. Resilient next-hop groups will introduce a new logical object, a hash table bucket. It turns out that handling bucket-related messages is similar to how next-hop messages are handled. These patches extract the commonalities into reusable components. 
The plan is to contribute approximately the following patchsets: 1) Nexthop policy refactoring (already pushed) 2) Preparations for resilient next hop groups (this patchset) 3) Implementation of resilient next hop group 4) Netdevsim offload plus a suite of selftests 5) Preparations for mlxsw offload of resilient next-hop groups 6) mlxsw offload including selftests Interested parties can look at the current state of the code at [2] and [3]. [1] https://tools.ietf.org/html/rfc2992 [2] https://github.com/idosch/linux/commits/submit/res_integ_v1 [3] https://github.com/idosch/iproute2/commits/submit/res_v1 ==================== Link: https://lore.kernel.org/r/cover.1611836479.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:59 -08:00
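The difference between the two selection schemes described in the cover letter above can be sketched in a few lines of userspace C. This is an illustration under simplifying assumptions (equal weights, a 20-bucket table as in the diagram, round-robin reassignment), not the kernel's code; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hash-threshold with equal weights: next hop i of n owns one contiguous
 * slice of the 32-bit hash space. Returns the ID of the selected next hop.
 */
static int threshold_select(uint32_t hash, const int *ids, size_t n)
{
	uint64_t slice;
	size_t idx;

	assert(n > 0);
	slice = (UINT64_C(1) << 32) / n;
	idx = (size_t)(hash / slice);
	if (idx >= n)		/* rounding leftovers go to the last hop */
		idx = n - 1;
	return ids[idx];
}

/* Resilient hashing: the bucket index is the hash modulo the table size. */
static int resilient_select(uint32_t hash, const int *buckets, size_t nbuckets)
{
	assert(nbuckets > 0);
	return buckets[hash % nbuckets];
}

/*
 * Reassign only the buckets that held the removed next hop, handing them
 * round-robin to the remaining ones. All other buckets stay untouched.
 */
static void resilient_remove(int *buckets, size_t nbuckets, int removed,
			     const int *remaining, size_t nremaining)
{
	size_t i, next = 0;

	assert(nremaining > 0);
	for (i = 0; i < nbuckets; i++) {
		if (buckets[i] == removed)
			buckets[i] = remaining[next++ % nremaining];
	}
}
```

Under these assumptions, a flow with hash 900000000 resolves to next hop 2 in a five-hop threshold group and to next hop 1 once hop 3 is deleted, even though hop 2 is still present. In the bucket table, only the four buckets that pointed at hop 3 change owner, so a flow that mapped to a bucket of hop 2 keeps its next hop.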
Petr Machata	0bccf8ed8a	nexthop: Extract a helper for validation of get/del RTNL requests Validation of messages for get / del of a next hop is the same as validation of messages for get of a resilient next hop group bucket will be. The difference is that policy for resilient next hop group buckets is a superset of that used for next-hop get. It is therefore possible to reuse the code that validates the nhmsg fields, extracts the next-hop ID, and validates that. To that end, extract from nh_valid_get_del_req() a helper __nh_valid_get_del_req() that does just that. Make the nlh argument const so that the function can be called from the dump context, which only has a const nlh. Propagate the constness to nh_valid_get_del_req(). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:54 -08:00
Petr Machata	e948217d25	nexthop: Add a callback parameter to rtm_dump_walk_nexthops() In order to allow different handling for next-hop tree dumper and for bucket dumper, parameterize the next-hop tree walker with a callback. Add rtm_dump_nexthop_cb() with just the bits relevant for next-hop tree dumping. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:53 -08:00
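The idea in the commit above, a tree walker parameterized with a callback so that different dumpers can share the iteration, can be sketched generically. The code below is a userspace illustration over a plain array rather than the kernel's next-hop tree; `walk_nexthops`, `nh_walk_cb`, `out_buf` and `fill_cb` are hypothetical names, and the resume index stands in for the dump state kept between invocations.

```c
#include <assert.h>
#include <stddef.h>

/* Callback invoked for each next hop; a nonzero return stops the walk. */
typedef int (*nh_walk_cb)(int nh_id, void *data);

/* Walk the IDs starting at *resume_idx and remember where to continue. */
static int walk_nexthops(const int *ids, size_t n, size_t *resume_idx,
			 nh_walk_cb cb, void *data)
{
	size_t i;
	int err;

	assert(cb != NULL);
	for (i = *resume_idx; i < n; i++) {
		err = cb(ids[i], data);
		if (err) {
			*resume_idx = i;	/* retry this entry next time */
			return err;
		}
	}
	*resume_idx = n;
	return 0;
}

/* Example consumer: an output buffer with room for two entries. */
struct out_buf {
	int ids[2];
	size_t len;
};

static int fill_cb(int nh_id, void *data)
{
	struct out_buf *out = data;

	if (out->len == sizeof(out->ids) / sizeof(out->ids[0]))
		return -1;	/* buffer full: stop and resume later */
	out->ids[out->len++] = nh_id;
	return 0;
}
```

A second dumper would supply its own callback and private data while reusing `walk_nexthops()` unchanged, which is the kind of reuse the commit message aims for.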
Petr Machata	cbee18071e	nexthop: Extract a helper for walking the next-hop tree Extract from rtm_dump_nexthop() a helper to walk the next hop tree. A separate function for this will be reusable from the bucket dumper. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:53 -08:00
Petr Machata	a6fbbaa64c	nexthop: Strongly-type context of rtm_dump_nexthop() The dump operations need to keep state from one invocation to another. A scratch area is dedicated for this purpose in the passed-in argument, cb, namely via two aliased arrays, struct netlink_callback.args and .ctx. Dumping of buckets will end up having to iterate over next hops as well, and it would be nice to be able to reuse the iteration logic with the NH dumper. The fact that the logic currently relies on a fixed index to the .args array, and the indices would have to be coordinated between the two dumpers, makes this somewhat awkward. To make the access patterns clearer, introduce a helper struct with a NH index, and instead of using the .args array directly, use it through this structure. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:53 -08:00
Petr Machata	b9ebea1276	nexthop: Extract a common helper for parsing dump attributes Requests to dump nexthops have many attributes in common with those that requests to dump buckets of resilient NH groups will have. However, they have different policies. To allow reuse of this code, extract a policy-agnostic wrapper out of nh_valid_dump_req(), and convert this function into a thin wrapper around it. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:53 -08:00
Petr Machata	56450ec6b7	nexthop: Extract dump filtering parameters into a single structure Requests to dump nexthops have many attributes in common with those that requests to dump buckets of resilient NH groups will have. In order to make reuse of this code simpler, convert the code to use a single structure with filtering configuration instead of passing around the parameters one by one. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:52 -08:00
Petr Machata	da230501f2	nexthop: Dispatch notifier init()/fini() by group type After there are several next-hop group types, initialization and finalization of notifier type needs to reflect the actual type. Transform nh_notifier_grp_info_init() and _fini() to make extending them easier. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:52 -08:00
Ido Schimmel	09ad6becf5	nexthop: Use enum to encode notification type Currently there are only two types of in-kernel nexthop notification. The two are distinguished by the 'is_grp' boolean field in 'struct nh_notifier_info'. As more notification types are introduced for more next-hop group types, a boolean is not an easily extensible interface. Instead, convert it to an enum. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:52 -08:00
Petr Machata	720ccd9a72	nexthop: Assert the invariant that a NH group is of only one type Most of the code that deals with nexthop groups relies on the fact that the group is of exactly one well-known type. Currently there is only one type, "mpath", but as more next-hop group types come, it becomes desirable to have a central place where the setting is validated. Introduce such a place in nexthop_create_group(), such that the check is done before the code that relies on that invariant is invoked. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:51 -08:00
Petr Machata	b9bae61be4	nexthop: Introduce to struct nh_grp_entry a per-type union The values that a next-hop group needs to keep track of depend on the group type. Introduce a union to separate fields specific to the mpath groups from fields specific to other group types. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:51 -08:00
Petr Machata	79bc55e3fe	nexthop: Dispatch nexthop_select_path() by group type The logic for selecting path depends on the next-hop group type. Adapt the nexthop_select_path() to dispatch according to the group type. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:51 -08:00
David Ahern	5d1f0f09b5	nexthop: Rename nexthop_free_mpath nexthop_free_mpath really should be nexthop_free_group. Rename it. Signed-off-by: David Ahern <dsahern@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:49:51 -08:00
Jakub Kicinski	4915a40437	Merge branch 'net-iucv-updates-2021-01-28' Julian Wiedmann says: ==================== net/iucv: updates 2021-01-28 This reworks & simplifies the TX notification path in af_iucv, so that we can send out SG skbs over TRANS_HIPER sockets. Also remove a noisy WARN_ONCE() in the RX path. ==================== Link: https://lore.kernel.org/r/20210128114108.39409-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:25 -08:00
Julian Wiedmann	2c3b4456c8	net/af_iucv: build SG skbs for TRANS_HIPER sockets The TX path no longer falls apart when some of its SG skbs are later linearized by lower layers of the stack. So enable the use of SG skbs in iucv_sock_sendmsg() again. This effectively reverts commit `dc5367bcc5` ("net/af_iucv: don't use paged skbs for TX on HiperSockets"). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:22 -08:00
Julian Wiedmann	80bc97aa0a	net/af_iucv: don't track individual TX skbs for TRANS_HIPER sockets Stop maintaining the skb_send_q list for TRANS_HIPER sockets. Not only is it extra overhead, but keeping around a list of skb clones means that we later also have to match the ->sk_txnotify() calls against these clones and free them accordingly. The current matching logic (comparing the skbs' shinfo location) is frustratingly fragile, and breaks if the skb's head is mangled in any sort of way while passing from dev_queue_xmit() to the device's HW queue. Also adjust the interface for ->sk_txnotify(), to make clear that we don't actually care about any skb internals. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:21 -08:00
Julian Wiedmann	ef6af7bdb9	net/af_iucv: count packets in the xmit path The TX code keeps track of all skbs that are in-flight but haven't actually been sent out yet. For native IUCV sockets that's not a huge deal, but with TRANS_HIPER sockets it would be much better if we didn't need to maintain a list of skb clones. Note that we actually only care about the _count_ of skbs in this stage of the TX pipeline. So as prep work for removing the skb tracking on TRANS_HIPER sockets, keep track of the skb count in a separate variable and pair any list {enqueue, unlink} with a count {increment, decrement}. Then replace all occurrences where we currently look at the skb list's fill level. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:21 -08:00
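The prep step described in the commit above, pairing every list enqueue and unlink with a counter increment and decrement so that callers can consult the count instead of the list, might look roughly like this in isolation. The sketch is single-threaded userspace C with hypothetical types and names (`tx_queue`, `pkt`, `tx_idle`); kernel code would additionally need atomic operations or locking, which are left out here.

```c
#include <assert.h>
#include <stddef.h>

struct pkt {
	struct pkt *next;
};

struct tx_queue {
	struct pkt *head;
	unsigned int pending;	/* number of packets currently in flight */
};

static void tx_enqueue(struct tx_queue *q, struct pkt *p)
{
	p->next = q->head;
	q->head = p;
	q->pending++;		/* paired with the enqueue */
}

static void tx_unlink(struct tx_queue *q, struct pkt *p)
{
	struct pkt **pp;

	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			assert(q->pending > 0);
			q->pending--;	/* paired with the unlink */
			return;
		}
	}
}

/* Callers ask the counter instead of inspecting the list's fill level. */
static int tx_idle(const struct tx_queue *q)
{
	return q->pending == 0;
}
```

Once every reader goes through the counter, the list itself can be dropped for the socket type that does not need it, with the counter kept as the only state.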
Julian Wiedmann	c464444fa2	net/af_iucv: don't lookup the socket on TX notification Whoever called iucv_sk(sk)->sk_txnotify() must already know that they're dealing with an af_iucv socket. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:21 -08:00
Alexander Egorenkov	27e9c1de52	net/af_iucv: remove WARN_ONCE on malformed RX packets syzbot reported the following finding: AF_IUCV failed to receive skb, len=0 WARNING: CPU: 0 PID: 522 at net/iucv/af_iucv.c:2039 afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039 CPU: 0 PID: 522 Comm: syz-executor091 Not tainted 5.10.0-rc1-syzkaller-07082-g55027a88ec9f #0 Hardware name: IBM 3906 M04 701 (KVM/Linux) Call Trace: [<00000000b87ea538>] afiucv_hs_rcv+0x178/0x190 net/iucv/af_iucv.c:2039 ([<00000000b87ea534>] afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039) [<00000000b796533e>] __netif_receive_skb_one_core+0x13e/0x188 net/core/dev.c:5315 [<00000000b79653ce>] __netif_receive_skb+0x46/0x1c0 net/core/dev.c:5429 [<00000000b79655fe>] netif_receive_skb_internal+0xb6/0x220 net/core/dev.c:5534 [<00000000b796ac3a>] netif_receive_skb+0x42/0x318 net/core/dev.c:5593 [<00000000b6fd45f4>] tun_rx_batched.isra.0+0x6fc/0x860 drivers/net/tun.c:1485 [<00000000b6fddc4e>] tun_get_user+0x1c26/0x27f0 drivers/net/tun.c:1939 [<00000000b6fe0f00>] tun_chr_write_iter+0x158/0x248 drivers/net/tun.c:1968 [<00000000b4f22bfa>] call_write_iter include/linux/fs.h:1887 [inline] [<00000000b4f22bfa>] new_sync_write+0x442/0x648 fs/read_write.c:518 [<00000000b4f238fe>] vfs_write.part.0+0x36e/0x5d8 fs/read_write.c:605 [<00000000b4f2984e>] vfs_write+0x10e/0x148 fs/read_write.c:615 [<00000000b4f29d0e>] ksys_write+0x166/0x290 fs/read_write.c:658 [<00000000b8dc4ab4>] system_call+0xe0/0x28c arch/s390/kernel/entry.S:415 Last Breaking-Event-Address: [<00000000b8dc64d4>] __s390_indirect_jump_r14+0x0/0xc Malformed RX packets shouldn't generate any warnings because debugging info already flows to dropmon via the kfree_skb(). Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:21 -08:00
Jakub Kicinski	14a6daf3a4	Merge branch 's390-qeth-updates-2021-01-28' Julian Wiedmann says: ==================== s390/qeth: updates 2021-01-28 Nothing special, mostly fine-tuning and follow-on cleanups for earlier fixes. ==================== Link: https://lore.kernel.org/r/20210128112551.18780-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:36:01 -08:00
Julian Wiedmann	d6e5150315	s390/qeth: don't fake a TX completion interrupt after TX error When do_qdio() returns with an unexpected error, qeth_flush_buffers() kicks off a recovery action. In such a case there's no point in starting TX completion processing, the device gets torn down anyway. So take a closer look at do_qdio()'s return value, and skip the TX completion processing accordingly. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:35:58 -08:00
Julian Wiedmann	a667fee181	s390/qeth: make cast type selection for af_iucv skbs robust As part of the TX queue selection for af_iucv skbs, qeth_l3_get_cast_type_rcu() ends up calling qeth_get_ether_cast_type(), which is rather fragile, since such skbs don't have a proper ETH header and we rely on it being zeroed out in the right places. Add a separate case for ETH_P_AF_IUCV instead that does the right thing. When later building the HW header for such skbs, don't hard-code the cast type but follow the same path as for other protocol types. Here the cast type should naturally come from the skb's queue mapping. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:35:58 -08:00
Julian Wiedmann	c61dff3c1e	s390/qeth: pass proto to qeth_l3_get_cast_type() qeth_l3_hard_start_xmit() already determined the skb's proto. Avoid doing so a second time when it calls qeth_l3_get_cast_type(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:35:58 -08:00
Julian Wiedmann	17f3a8b5f5	s390/qeth: remove qeth_get_ip_version() Replace our home-grown helper with the more robust vlan_get_protocol(). This is pretty much a 1:1 replacement, we just need to pass around a proper ETH_P_* everywhere and convert the old value range. For readability also convert the protocol checks in qeth_l3_hard_start_xmit() to a switch statement. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:35:58 -08:00
Julian Wiedmann	ea12f1b3c8	s390/qeth: clean up load/remove code for disciplines We have two usage patterns: 1. get & ->setup() a new discipline, or 2. ->remove() & put the currently loaded one. Add corresponding helpers that hide the internals & error handling. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:35:57 -08:00
Jakub Kicinski	699e4bc8c3	Merge branch 'net-ipa-hardware-pipeline-cleanup-fixes' Alex Elder says: ==================== net: ipa: hardware pipeline cleanup fixes There is a procedure currently referred to as a "tag process" that is performed to clear the IPA hardware pipeline--either at the time of a modem crash, or when suspending modem GSI channels. One thing done in this procedure is issuing a command that sends a data packet originating from the AP->command TX endpoint, destined for the AP<-LAN RX (default) endpoint. And although we currently wait for the send to complete, we do not wait for the packet to be received. But the pipeline can't be assumed clear until we have actually received this packet. This series addresses this by detecting when the pipeline-clearing packet has been received, and using a completion to allow a waiter to know when that has happened. This uses the IPA status capability (which sends an extra status buffer for certain packets). It also uses the ability to supply a "tag" with a packet, which will be delivered with the packet's status buffer. We tag the data packet that's sent to clear the pipeline, and use the receipt of a status buffer associated with a tagged packet to determine when that packet has arrived. "Tag status" just describes one aspect of this procedure, so some symbols are renamed to be more like "pipeline clear" so they better describe the larger purpose. Finally, two functions used in this code don't use their arguments, so those arguments are removed. ==================== Link: https://lore.kernel.org/r/20210126185703.29087-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:30 -08:00
Alex Elder	070740d389	net: ipa: don't pass size to ipa_cmd_transfer_add() The only time we transfer data (rather than issuing a command) out of the AP->command TX endpoint is when we're clearing the hardware pipeline. All that's needed is a "small" data buffer, and its contents aren't even important. For convenience, we just transfer a command structure in this case (it's already mapped for DMA). The TRE is added to a transaction using ipa_cmd_ip_tag_status_add(), but we ignore the size value provided to that function. So just get rid of the size argument. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:26 -08:00
Alex Elder	792b75b147	net: ipa: don't pass tag value to ipa_cmd_ip_tag_status_add() We only send a tagged packet from the AP->command TX endpoint when we're clearing the hardware pipeline. And when we receive the tagged packet we don't care what the actual tag value is. Stop passing a tag value to ipa_cmd_ip_tag_status_add(), and just encode 0 as the tag sent. Fix the function that encodes the tag so it uses the proper byte ordering. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:26 -08:00
Alex Elder	51c48ce264	net: ipa: signal when tag transfer completes There are times, such as when the modem crashes, when we issue commands to clear the IPA hardware pipeline. These commands include a data transfer command that delivers a small packet directly to the default (AP<-LAN RX) endpoint. The places that do this wait for the transactions that contain these commands to complete, but the pipeline can't be assumed clear until the sent packet has been received. The small transfer will be delivered with a status structure, and that status will indicate its tag is valid. This is the only place we send a tagged packet, so we use the tag to determine when the pipeline clear packet has arrived. Add a completion to the IPA structure to be used to signal the receipt of a pipeline clear packet. Create a new function ipa_cmd_pipeline_clear_wait() that will wait for that completion. Reinitialize the completion whenever pipeline clear commands are added to a transaction. Extend ipa_endpoint_status_tag() to check whether a packet whose status contains a valid tag was sent from the AP->command TX endpoint, and if so, signal the new IPA completion. Have all callers of ipa_cmd_pipeline_clear_add() wait for the pipeline clear indication after the transaction that clears the pipeline has completed. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:26 -08:00
Alex Elder	f6aba7b519	net: ipa: drop packet if status has valid tag Introduce ipa_endpoint_status_tag(), which returns true if received status indicates its tag field is valid. The endpoint parameter is not yet used. Call this from ipa_status_drop_packet(), and drop the packet if the status indicates the tag was valid. Pass the endpoint pointer to ipa_status_drop_packet(), and rename it ipa_endpoint_status_drop(). The endpoint will be used in the next patch. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:26 -08:00
Alex Elder	162fbc6f45	net: ipa: minor update to handling of packet with status Rearrange some comments and assignments made when handling a packet that is received with status, aiming to improve understandability. Use DIV_ROUND_CLOSEST() to get a better per-packet true size estimate. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:25 -08:00
Alex Elder	aa56e3e5cd	net: ipa: rename "tag status" symbols There is a set of functions and symbols related to performing "tag_process" immediate commands to clear the IPA pipeline. The name is related to one of the commands issued when doing this, but it doesn't really convey the overall purpose of taking this action. The purpose is to take some steps to "clear out" the hardware pipeline, and to wait until that process completes, to ensure the IPA hardware is in a well-defined state. Rename these symbols to use "pipeline_clear" in their names instead. Add some comments to explain a bit more about what's going on. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:23:25 -08:00
Jesper Dangaard Brouer	28af22c6c8	net: adjust net_device layout for cacheline usage The current layout of net_device is not optimal for cacheline usage. The member adj_list.lower linked list is split between cacheline 2 and 3. The ifindex is placed together with stats (struct net_device_stats), although most modern drivers don't update this stats member. The members netdev_ops, mtu and hard_header_len are placed on three different cachelines. These members are accessed for XDP redirect into devmap, which was noticeable with the perf tool. When not using the map redirect variant (like TC-BPF does), then ifindex is also used, which is placed on a separate fourth cacheline. These members are also accessed during forwarding with the regular network stack. The members priv_flags and flags are on the fast-path for the network stack transmit path in __dev_queue_xmit (currently located together with the mtu cacheline). This patch creates a read mostly cacheline, with the purpose of keeping the above mentioned members on the same cacheline. Some netdev_features_t members also become part of this cacheline, which is on purpose, as function netif_skb_features() is on the fast-path via validate_xmit_skb(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/161168277983.410784.12401225493601624417.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 20:19:06 -08:00
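
Note on the net_device layout commit above: the idea of a "read mostly cacheline" can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel's actual layout: struct demo_dev, struct demo_ops, cacheline_of() and the 64-byte CACHELINE constant are hypothetical names chosen here, and only a handful of the members named in the commit message are mirrored.

```c
#include <assert.h>   /* static_assert (C11) and assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof, size_t */

/* Assumed cacheline size; real hardware may differ. */
#define CACHELINE 64

struct demo_ops; /* hypothetical opaque ops table */

/* Hypothetical device structure, not struct net_device. */
struct demo_dev {
    /* Read-mostly members used on the fast path, grouped so that
     * they all fall within one cacheline. */
    alignas(CACHELINE) unsigned int flags;
    unsigned int priv_flags;
    const struct demo_ops *ops;
    int ifindex;
    unsigned int mtu;
    unsigned short hard_header_len;

    /* Frequently written counters start on their own cacheline so
     * that updates do not dirty the read-mostly line. */
    alignas(CACHELINE) unsigned long tx_packets;
    unsigned long rx_packets;
};

/* Index of the cacheline (relative to the struct start) that holds
 * the byte at the given offset. */
static inline size_t cacheline_of(size_t offset)
{
    return offset / CACHELINE;
}

/* Compile-time checks: first and last read-mostly members share
 * cacheline 0, and the counters begin on a later line. */
static_assert(offsetof(struct demo_dev, flags) / CACHELINE == 0,
              "flags must sit on the first cacheline");
static_assert(offsetof(struct demo_dev, hard_header_len) / CACHELINE == 0,
              "hard_header_len must share the read-mostly cacheline");
static_assert(offsetof(struct demo_dev, tx_packets) % CACHELINE == 0,
              "counters must start on a cacheline boundary");
```

The compile-time assertions play roughly the role that inspecting the structure with a layout tool would play: if a later change pushes a fast-path member onto another line, the build fails instead of the regression showing up only in profiling.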

984636 Commits