korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-29 15:14:18 +08:00

Author	SHA1	Message	Date
Antoine Tenart	da80aa52d0	net: phy: move the mscc driver to its own directory The MSCC PHY driver is growing, with lots of space consuming features (firmware support, full initialization, MACsec...). It's becoming hard to read and navigate in its source code. This patch moves the MSCC driver to its own directory, without modifying anything, as a preparation for splitting up its features into dedicated files. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:06:45 -07:00
David S. Miller	3d572b2308	Merge branch 'RED-Introduce-an-ECN-tail-dropping-mode' Petr Machata says: ==================== RED: Introduce an ECN tail-dropping mode When the RED qdisc is currently configured to enable ECN, the RED algorithm is used to decide whether a certain SKB should be marked. If that SKB is not ECN-capable, it is early-dropped. It is also possible to keep all traffic in the queue, and just mark the ECN-capable subset of it, as appropriate under the RED algorithm. Some switches support this mode, and some installations make use of it. There is currently no way to put the RED qdiscs to this mode. Therefore this patchset adds a new RED flag, TC_RED_TAILDROP. When the qdisc is configured with this flag, non-ECT traffic is enqueued (and tail-dropped when the queue size is exhausted) instead of being early-dropped. Unfortunately, adding a new RED flag is not as simple as it sounds. RED flags are passed in tc_red_qopt.flags. However RED neglects to validate the flag field, and just copies it over wholesale to its internal structure, and later dumps it back. A broken userspace can therefore configure a RED qdisc with arbitrary unsupported flags, and later expect to see the flags on qdisc dump. The current ABI thus allows storage of 5 bits of custom data along with the qdisc instance. GRED, SFQ and CHOKE qdiscs are in the same situation. (GRED validates VQ flags, but not the flags for the main queue.) E.g. if SFQ ever needs to support TC_RED_ADAPTATIVE, it needs another way of doing it, and at the same time it needs to retain the possibility to store 6 bits of uninterpreted data. For RED, this problem is resolved in patch #2, which adds a new attribute, and a way to separate flags from userbits that can be reused by other qdiscs. The flag itself and related behavioral changes are added in patch To test the new feature, patch #1 first introduces a TDC testsuite that covers the existing RED flags. Patch #5 later extends it with taildrop coverage. 
Patch #6 contains a forwarding selftest for the offloaded datapath. To test the SW datapath, I took the mlxsw selftest and adapted it in mostly obvious ways. The test is stable enough to verify that RED, ECN and ECN taildrop actually work. However, I have no confidence in its portability to other people's machines or mildly different configurations. I therefore do not find it suitable for upstreaming. GRED and CHOKE can use the same method as RED if they ever need to support extra flags. SFQ uses the length of TCA_OPTIONS to dispatch on binary control structure version, and would therefore need a different approach. v2: - Patch #1 - Require nsPlugin in each RED test - Match end-of-line to catch cases of more flags reported than requested - Patch #2: - Replaced with another patch. - Patch #3: - Fix red_use_taildrop() condition in red_enqueue switch for probabilistic case. - Patch #5: - Require nsPlugin in each RED test - Match end-of-line to catch cases of more flags reported than requested - Add a test for creation of non-ECN taildrop, which should fail ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:47 -07:00
Petr Machata	63f3c1d06f	selftests: mlxsw: RED: Test RED ECN nodrop offload Extend RED testsuite to cover the new nodrop mode of RED-ECN. This test is really similar to ECN test, diverging only in the last step, where UDP traffic should go to backlog instead of being dropped. Thus extract a common helper, ecn_test_common(), make do_ecn_test() into a relatively simple wrapper, and add another one, do_ecn_nodrop_test(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Petr Machata	058e56ac9e	selftests: qdiscs: RED: Add nodrop tests Add tests for the new "nodrop" flag. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Petr Machata	8040c96b4f	mlxsw: spectrum_qdisc: Offload RED ECN nodrop mode RED ECN nodrop mode means that non-ECT traffic should not be early-dropped, but enqueued normally instead. In Spectrum systems, this is achieved by disabling CWTPM.ew (enable WRED) for a given traffic class. So far CWTPM.ew was unconditionally enabled. Instead disable it when the RED qdisc is in nodrop mode. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Petr Machata	0a7fad2376	net: sched: RED: Introduce an ECN nodrop mode When the RED Qdisc is currently configured to enable ECN, the RED algorithm is used to decide whether a certain SKB should be marked. If that SKB is not ECN-capable, it is early-dropped. It is also possible to keep all traffic in the queue, and just mark the ECN-capable subset of it, as appropriate under the RED algorithm. Some switches support this mode, and some installations make use of it. To that end, add a new RED flag, TC_RED_NODROP. When the Qdisc is configured with this flag, non-ECT traffic is enqueued instead of being early-dropped. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Petr Machata	14bc175d9c	net: sched: Allow extending set of supported RED flags The qdiscs RED, GRED, SFQ and CHOKE use different subsets of the same pool of global RED flags. These are passed in tc_red_qopt.flags. However none of these qdiscs validate the flag field, and just copy it over wholesale to internal structures, and later dump it back. (An exception is GRED, which does validate for VQs -- however not for the main setup.) A broken userspace can therefore configure a qdisc with arbitrary unsupported flags, and later expect to see the flags on qdisc dump. The current ABI therefore allows storage of several bits of custom data to qdisc instances of the types mentioned above. How many bits depends on which flags are meaningful for the qdisc in question. E.g. SFQ recognizes flags ECN and HARDDROP, and the rest is not interpreted. If SFQ ever needs to support ADAPTATIVE, it needs another way of doing it, and at the same time it needs to retain the possibility to store 6 bits of uninterpreted data. Likewise RED, which adds a new flag later in this patchset. To that end, this patch adds a new function, red_get_flags(), to split the passed flags of RED-like qdiscs to flags and user bits, and red_validate_flags() to validate the resulting configuration. It further adds a new attribute, TCA_RED_FLAGS, to pass arbitrary flags. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Petr Machata	10ef49bdcc	selftests: qdiscs: Add TDC test for RED Add a handful of tests for creating RED with different flags. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 21:03:46 -07:00
Edward Cree	085793f038	sfc: support configuring vf spoofchk on EF10 VFs Corresponds to the MAC_SPOOFING_TX privilege in the hardware. Some firmware versions on some cards don't support the feature, so check the TX_MAC_SECURITY capability and fail EOPNOTSUPP if trying to enable spoofchk on a NIC that doesn't support it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:57:25 -07:00
David S. Miller	fa83820e5c	Merge branch 'net-phy-XLGMII-define-and-usage-in-PHYLINK' Jose Abreu says: ==================== net: phy: XLGMII define and usage in PHYLINK Adds XLGMII defines and usage in PHYLINK. Patch 1/2, adds the define for it, whilst 2/2 adds the usage of it in PHYLINK. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:55:12 -07:00
Jose Abreu	1671c42d48	net: phylink: Add XLGMII support Add XLGMII interface and the list of XLGMII speeds to PHYLINK. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:55:12 -07:00
Jose Abreu	58b05e58d1	net: phy: Add XLGMII interface define Add a define for XLGMII interface. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:55:12 -07:00
Colin Ian King	f1dc7460eb	net: ena: ethtool: clean up minor indentation issue There is a statement that is indented incorrectly, remove a space. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:50:27 -07:00
Vladimir Oltean	ec8582d134	net: dsa: sja1105: move MAC configuration to .phylink_mac_link_up The switches supported so far by the driver only have non-SerDes ports, so they should be configured in the PHYLINK callback that provides the resolved PHY link parameters. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:49:48 -07:00
Shahjada Abul Husain	724d021566	cxgb4: update T5/T6 adapter register ranges Add more T5/T6 registers to be collected in register dump: 1. MPS register range 0x9810 to 0x9864 and 0xd000 to 0xd004. 2. NCSI register range 0x1a114 to 0x1a130 and 0x1a138 to 0x1a1c4. Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-14 20:44:46 -07:00
David S. Miller	94229d4523	mlx5-updates-2020-03-13 Misc update to mlx5 core and E-Switch driver: 1) Blue-Field, Update VF vports config when num of VFs changed From Bodong, Various misc cleanups and refactoring for vport enabling/disabling routines to allow them to be called dynamically and not only on E-Switch load. This will allow ECPF (ConnectX BlueField Smartnic) support for dynamic num vf changes and dynamic vport creation and configuration as introduced in "Update VF vports config when num of VFs changed" patch. 2) From Parav and Mark, trivial clean-ups. 3) Software steering support for flow table id as destination and a clean-up patch to remove unnecessary function stubs, from Alex. Merge tag 'mlx5-updates-2020-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-03-13 Misc update to mlx5 core and E-Switch driver: 1) Blue-Field, Update VF vports config when num of VFs changed From Bodong, Various misc cleanups and refactoring for vport enabling/disabling routines to allow them to be called dynamically and not only on E-Switch load. This will allow ECPF (ConnectX BlueField Smartnic) support for dynamic num vf changes and dynamic vport creation and configuration as introduced in "Update VF vports config when num of VFs changed" patch. 2) From Parav and Mark, trivial clean-ups. 3) Software steering support for flow table id as destination and a clean-up patch to remove unnecessary function stubs, from Alex. 
==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-13 21:04:30 -07:00
David S. Miller	44ef976ab3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2020-03-13 The following pull-request contains BPF updates for your net-next tree. We've added 86 non-merge commits during the last 12 day(s) which contain a total of 107 files changed, 5771 insertions(+), 1700 deletions(-). The main changes are: 1) Add modify_return attach type which allows to attach to a function via BPF trampoline and is run after the fentry and before the fexit programs and can pass a return code to the original caller, from KP Singh. 2) Generalize BPF's kallsyms handling and add BPF trampoline and dispatcher objects to be visible in /proc/kallsyms so they can be annotated in stack traces, from Jiri Olsa. 3) Extend BPF sockmap to allow for UDP next to existing TCP support in order to enable this for BPF based socket dispatch, from Lorenz Bauer. 4) Introduce a new bpftool 'prog profile' command which attaches to existing BPF programs via fentry and fexit hooks and reads out hardware counters during that period, from Song Liu. Example usage: bpftool prog profile id 337 duration 3 cycles instructions llc_misses 4228 run_cnt 3403698 cycles (84.08%) 3525294 instructions # 1.04 insn per cycle (84.05%) 13 llc_misses # 3.69 LLC misses per million isns (83.50%) 5) Batch of improvements to libbpf, bpftool and BPF selftests. Also addition of a new bpf_link abstraction to keep in particular BPF tracing programs attached even when the application owning them exits, from Andrii Nakryiko. 6) New bpf_get_ns_current_pid_tgid() helper for tracing to perform PID filtering and which returns the PID as seen by the init namespace, from Carlos Neira. 7) Refactor of RISC-V JIT code to move out common pieces and addition of a new RV32G BPF JIT compiler, from Luke Nelson. 8) Add gso_size context member to __sk_buff in order to be able to know whether a given skb is GSO or not, from Willem de Bruijn. 
9) Add a new bpf_xdp_output() helper which reuses XDP's existing perf RB output implementation but can be called from tracepoint programs, from Eelco Chaudron. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-13 20:52:03 -07:00
Alex Vesker	bc1a02884a	net/mlx5: DR, Remove unneeded functions declaration Remove the dummy function declarations; the dummy functions are not needed since fs_dr is the only one to call mlx5dr and both fs_dr and dr files depend on the same config flag (MLX5_SW_STEERING). Fixes: `70605ea545` ("net/mlx5: DR, Expose APIs for direct rule managing") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:28 -07:00
Alex Vesker	de346f401a	net/mlx5: DR, Add support for flow table id destination action This action allows jumping to a flow table based on the table id. Goto flow table id is required for supporting user space SW. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:26 -07:00
Parav Pandit	0e6fa491e8	net/mlx5: Avoid deriving mlx5_core_dev second time All callers need to work on mlx5_core_dev and it is already derived before calling mlx5_devlink_eswitch_check(). Hence, accept mlx5_core_dev in mlx5_devlink_eswitch_check(). Given that it works on mlx5_core_dev, change the helper function name to drop the devlink prefix. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:24 -07:00
Parav Pandit	d6c8022dfb	net/mlx5: E-switch, Annotate esw state_lock mutex destroy Invoke mutex_destroy() to catch any esw state_lock errors. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:22 -07:00
Parav Pandit	2bb72e7e2a	net/mlx5: E-switch, Annotate termtbl_mutex mutex destroy Annotate mutex destroy to keep it symmetric to init sequence. It should be destroyed after its users (representor netdevices) are destroyed in the flow below. esw_offloads_disable() esw_offloads_unload_rep() Hence, initialize the mutex before creating the representors which use it. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:19 -07:00
Mark Bloch	5c2aa8ae3a	net/mlx5: Accept flow rules without match Allow passing NULL spec when creating a flow rule. Such rules will act as "catch all" flow rules. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:17 -07:00
Bodong Wang	4110fc59ea	net/mlx5: E-Switch, Refactor unload all reps per rep type Following introduction of per vport configuration of vport and rep, unload all reps per rep type is still needed as IB reps can be unloaded individually. However, a few internal functions exist purely for this purpose, merge them to a single function. This patch doesn't change any existing functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:15 -07:00
Bodong Wang	23bb50cf73	net/mlx5: E-Switch, Update VF vports config when num of VFs changed Currently, ECPF eswitch manager does one-time only configuration for VF vports when device switches to offloads mode. However, when num of VFs changed from host side, driver doesn't update VF vports configurations. Hence, perform VFs vport configuration update whenever num_vfs change event occurs. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:12 -07:00
Bodong Wang	c2d7712ca3	net/mlx5: E-Switch, Introduce per vport configuration for eswitch modes Both legacy and offload modes require vport setup, only offload mode requires rep setup. Before this patch, vport and rep operations are separately applied to all relevant vports in different stages. Change to use per vport configuration, so that vport and rep operations are modularized per vport. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:10 -07:00
Bodong Wang	d7c92cb56f	net/mlx5: E-switch, Make vport setup/cleanup sequence symmetric Vport enable and disable sequence is incorrect. It should be: enable() esw_vport_setup_acl, esw_vport_setup, esw_vport_enable_qos. disable() esw_vport_disable_qos, esw_vport_cleanup, esw_vport_cleanup_acl. Instead of having two setup functions for port and acl, merge acl setup to port setup function. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:08 -07:00
Bodong Wang	878a73318a	net/mlx5: E-Switch, Prepare for vport enable/disable refactor Rename esw_apply_vport_config() to esw_vport_setup(), and add new helper function esw_vport_cleanup() to make them symmetric. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:05 -07:00
Bodong Wang	a9814d7fde	net/mlx5: E-Switch, Remove redundant warning when QoS enable failed esw_vport_enable_qos can return error in cases below: 1. QoS is already enabled. Warning is useless in this case. 2. Create scheduling element cmd failed. There is already a warning. Remove the redundant warning if esw_vport_enable_qos returns err. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:03 -07:00
Bodong Wang	14c844cbf3	net/mlx5: E-Switch, Hold mutex when querying drop counter in legacy mode Consider scenario below, CPU 1 is at risk to query already destroyed drop counters. Need to apply the same state mutex when disabling vport. +-------------------------------+-------------------------------------+ | CPU 0 | CPU 1 | +-------------------------------+-------------------------------------+ | mlx5_device_disable_sriov | mlx5e_get_vf_stats | | mlx5_eswitch_disable | mlx5_eswitch_get_vport_stats | | esw_disable_vport | mlx5_eswitch_query_vport_drop_stats | | mlx5_fc_destroy(drop_counter) | mlx5_fc_query(drop_counter) | +-------------------------------+-------------------------------------+ Fixes: `b8a0dbe3a9` ("net/mlx5e: E-switch, Add steering drop counters") Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:26:01 -07:00
Bodong Wang	86f9453c5f	net/mlx5: E-Switch, Remove redundant check of eswitch manager cap esw_vport_create_legacy_acl_tables bails out immediately for eswitch manager, hence remove all the subsequent checks of the esw manager cap. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-13 16:25:59 -07:00
Daniel Borkmann	832165d225	Merge branch 'bpf-core-fixes' Andrii Nakryiko says: ==================== This patch set fixes a bug in CO-RE relocation candidate finding logic, which currently allows matching against forward declarations, functions, and other named types, even though it makes no sense to even attempt. As part of verifying the fix, add test using vmlinux.h with preserve_access_index attribute and utilizing struct pt_regs heavily to trace nanosleep syscall using 5 different types of tracing BPF programs. This test also demonstrated problems using struct pt_regs in syscall tracepoints and required a new set of macros, which were added in patch #3 into bpf_tracing.h. Patch #1 fixes annoying issue with selftest failure messages being out of sync. v1->v2: - drop unused handle__probed() function (Martin). ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2020-03-13 23:31:14 +01:00
Andrii Nakryiko	acbd06206b	selftests/bpf: Add vmlinux.h selftest exercising tracing of syscalls Add vmlinux.h generation to selftest/bpf's Makefile. Use it from newly added test_vmlinux to trace nanosleep syscall using 5 different types of programs: - tracepoint; - raw tracepoint; - raw tracepoint w/ direct memory reads (tp_btf); - kprobe; - fentry. These programs are realistic variants of real-life tracing programs, exercising vmlinux.h's usage with tracing applications. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-5-andriin@fb.com	2020-03-13 23:30:53 +01:00
Andrii Nakryiko	b8ebce86ff	libbpf: Provide CO-RE variants of PT_REGS macros Syscall raw tracepoints have struct pt_regs pointer as tracepoint's first argument. After that, reading any of pt_regs fields requires bpf_probe_read(), even for tp_btf programs. Due to that, PT_REGS_PARMx macros are not usable as is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to read necessary fields. This provides relocatable architecture-agnostic pt_regs field accesses. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com	2020-03-13 23:30:53 +01:00
Andrii Nakryiko	d121e1d34b	libbpf: Ignore incompatible types with matching name during CO-RE relocation When finding target type candidates, ignore forward declarations, functions, and other named types of incompatible kind. Not doing this can cause false errors. See [0] for one such case (due to struct pt_regs forward declaration). [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645 Fixes: `ddc7c30426` ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: Wenbo Zhang <ethercflow@gmail.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com	2020-03-13 23:30:53 +01:00
Andrii Nakryiko	3e2671fb9a	selftests/bpf: Ensure consistent test failure output printf() doesn't seem to honor using overwritten stdout/stderr (as part of stdio hijacking), so ensure all "standard" invocations of printf() do fprintf(stdout, ...) instead. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-2-andriin@fb.com	2020-03-13 23:30:53 +01:00
Jakub Sitnicki	30b4cb36b1	selftests/bpf: Fix spurious failures in accept due to EAGAIN Andrii Nakryiko reports that sockmap_listen test suite is frequently failing due to accept() calls erroring out with EAGAIN: ./test_progs:connect_accept_thread:733: accept: Resource temporarily unavailable connect_accept_thread:FAIL:733 This is because we are using a non-blocking listening TCP socket to accept() connections without polling on the socket. While at first switching to blocking mode seems like the right thing to do, this could lead to test process blocking indefinitely in face of a network issue, like loopback interface being down, as Andrii pointed out. Hence, stick to non-blocking mode for TCP listening sockets but with polling for incoming connection for a limited time before giving up. Apply this approach to all socket I/O calls in the test suite that we expect to block indefinitely, that is accept() for TCP and recv() for UDP. Fixes: `44d28be2b8` ("selftests/bpf: Tests for sockmap/sockhash holding listening sockets") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200313161049.677700-1-jakub@cloudflare.com	2020-03-13 21:37:06 +01:00
Tobias Klauser	bcd66b10b5	tools/bpf: Move linux/types.h for selftests and bpftool Commit `fe4eb069ed` ("bpftool: Use linux/types.h from source tree for profiler build") added a build dependency on tools/testing/selftests/bpf to tools/bpf/bpftool. This is suboptimal with respect to a possible stand-alone build of bpftool. Fix this by moving tools/testing/selftests/bpf/include/uapi/linux/types.h to tools/include/uapi/linux/types.h. This requires an adjustment in the include search path order for the tests in tools/testing/selftests/bpf so that tools/include/linux/types.h is selected when building host binaries and tools/include/uapi/linux/types.h is selected when building bpf binaries. Verified by compiling bpftool and the bpf selftests on x86_64 with this change. Fixes: `fe4eb069ed` ("bpftool: Use linux/types.h from source tree for profiler build") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200313113105.6918-1-tklauser@distanz.ch	2020-03-13 20:56:34 +01:00
Jules Irenge	dcce11d545	bpf: Add missing annotations for __bpf_prog_enter() and __bpf_prog_exit() Sparse reports a warning at __bpf_prog_enter() and __bpf_prog_exit() warning: context imbalance in __bpf_prog_enter() - wrong count at exit warning: context imbalance in __bpf_prog_exit() - unexpected unlock The root cause is the missing annotation at __bpf_prog_enter() and __bpf_prog_exit() Add the missing __acquires(RCU) annotation Add the missing __releases(RCU) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200311010908.42366-2-jbi.octave@gmail.com	2020-03-13 20:55:07 +01:00
Carlos Neira	5996a587a4	bpf_helpers_doc.py: Fix warning when compiling bpftool When compiling bpftool the following warning is found: "declaration of 'struct bpf_pidns_info' will not be visible outside of this function." This patch adds struct bpf_pidns_info to type_fwds array to fix this. Fixes: `b4490c5c4e` ("bpf: Added new helper bpf_get_ns_current_pid_tgid") Signed-off-by: Carlos Neira <cneirabustos@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313154650.13366-1-cneirabustos@gmail.com	2020-03-13 20:53:40 +01:00
Andrii Nakryiko	4e1fd25d19	selftests/bpf: Fix usleep() implementation nanosleep syscall expects pointer to struct timespec, not nanoseconds directly. Current implementation fulfills its purpose of invoking nanosleep syscall, but doesn't really provide sleeping capabilities, which can cause flakiness for tests relying on usleep() to wait for something. Fixes: ec12a57b822c ("selftests/bpf: Guarantee that useep() calls nanosleep() syscall") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200313061837.3685572-1-andriin@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Alexei Starovoitov	1afbcd9466	Merge branch 'generalize-bpf-ksym' Jiri Olsa says: ==================== this patchset adds trampoline and dispatcher objects to be visible in /proc/kallsyms. $ sudo cat /proc/kallsyms | tail -20 ... ffffffffa050f000 t bpf_prog_5a2b06eab81b8f51 [bpf] ffffffffa0511000 t bpf_prog_6deef7357e7b4530 [bpf] ffffffffa0542000 t bpf_trampoline_13832 [bpf] ffffffffa0548000 t bpf_prog_96f1b5bf4e4cc6dc_mutex_lock [bpf] ffffffffa0572000 t bpf_prog_d1c63e29ad82c4ab_bpf_prog1 [bpf] ffffffffa0585000 t bpf_prog_e314084d332a5338__dissect [bpf] ffffffffa0587000 t bpf_prog_59785a79eac7e5d2_mutex_unlock [bpf] ffffffffa0589000 t bpf_prog_d0db6e0cac050163_mutex_lock [bpf] ffffffffa058d000 t bpf_prog_d8f047721e4d8321_bpf_prog2 [bpf] ffffffffa05df000 t bpf_trampoline_25637 [bpf] ffffffffa05e3000 t bpf_prog_d8f047721e4d8321_bpf_prog2 [bpf] ffffffffa05e5000 t bpf_prog_3b185187f1855c4c [bpf] ffffffffa05e7000 t bpf_prog_d8f047721e4d8321_bpf_prog2 [bpf] ffffffffa05eb000 t bpf_prog_93cebb259dd5c4b2_do_sys_open [bpf] ffffffffa0677000 t bpf_dispatcher_xdp [bpf] v5 changes: - keeping just 1 bpf_tree for all the objects and adding flag to recognize bpf_objects when searching for exception tables [Alexei] - no need for is_bpf_image_address call in kernel_text_address [Alexei] - removed the bpf_image tree, because it's no longer needed v4 changes: - add trampoline and dispatcher to kallsyms once it's allocated [Alexei] - omit the symbols sorting for kallsyms [Alexei] - small title change in one patch [Song] - some function renames: bpf_get_prog_name to bpf_prog_ksym_set_name bpf_get_prog_addr_region to bpf_prog_ksym_set_addr - added acks to changelogs - I checked and there'll be conflict on perftool side with upcoming changes from Adrian Hunter (text poke events), so I think it's better if Arnaldo takes the perf changes via perf tree and we will solve all conflicts there v3 changes: - use container_of directly in bpf_get_ksym_start [Daniel] - add more changelog 
explanations for ksym addresses [Daniel] v2 changes: - omit extra condition in __bpf_ksym_add for sorting code (Andrii) - rename bpf_kallsyms_tree_ops to bpf_ksym_tree (Andrii) - expose only executable code in kallsyms (Andrii) - use full trampoline key as its kallsyms id (Andrii) - explained the BPF_TRAMP_REPLACE case (Andrii) - small format changes in bpf_trampoline_link_prog/bpf_trampoline_unlink_prog (Andrii) - propagate error value in bpf_dispatcher_update and update kallsym if it's successful (Andrii) - get rid of __always_inline for bpf_ksym_tree callbacks (Andrii) - added KSYMBOL notification for bpf_image add/removal - added perf tools changes to properly display trampoline/dispatcher ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Jiri Olsa	7ac88eba18	bpf: Remove bpf_image tree Now that we have all the objects (bpf_prog, bpf_trampoline, bpf_dispatcher) linked in bpf_tree, there's no need to have separate bpf_image tree for images. Reverting the bpf_image tree together with struct bpf_image, because it's no longer needed. Also removing bpf_image_alloc function and adding the original bpf_jit_alloc_exec_page interface instead. The kernel_text_address function can now rely only on is_bpf_text_address, because it checks the bpf_tree that contains all the objects. Keeping bpf_image_ksym_add and bpf_image_ksym_del because they are useful wrappers with perf's ksymbol interface calls. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-13-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Jiri Olsa	517b75e44c	bpf: Add dispatchers to kallsyms Adding dispatchers to kallsyms. It's displayed as bpf_dispatcher_<NAME> where NAME is the name of dispatcher. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-12-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Jiri Olsa	a108f7dcfa	bpf: Add trampolines to kallsyms Adding trampolines to kallsyms. It's displayed as bpf_trampoline_<ID> [bpf] where ID is the BTF id of the trampoline function. Adding bpf_image_ksym_add/del functions that setup the start/end values and call KSYMBOL perf events handlers. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-11-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Jiri Olsa	dba122fb5e	bpf: Add bpf_ksym_add/del functions Separating /proc/kallsyms add/del code and adding bpf_ksym_add/del functions for that. Moving bpf_prog_ksym_node_add/del functions to __bpf_ksym_add/del and changing their argument to 'struct bpf_ksym' object. This way we can call them for other bpf objects types like trampoline and dispatcher. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-10-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
Jiri Olsa	cbd76f8d5a	bpf: Add prog flag to struct bpf_ksym object Adding 'prog' bool flag to 'struct bpf_ksym' to mark that this object belongs to bpf_prog object. This change allows having bpf_prog objects together with other types (trampolines and dispatchers) in the single bpf_tree. It's used when searching for bpf_prog exception tables by the bpf_prog_ksym_find function, where we need to get the bpf_prog pointer. From now on we can safely add bpf_ksym support for trampoline or dispatcher objects, because we can differentiate them from bpf_prog objects. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-9-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:52 -07:00
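The idea in the commit above, one shared collection of symbols with a flag marking which entries belong to programs, can be illustrated in plain userspace C. This is only a sketch under stated assumptions: the `toy_ksym` struct and both lookup functions are hypothetical names, and a linear scan over an array stands in for the kernel's bpf_tree, which is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol entry: an address range, a name,
 * and a flag saying whether the entry belongs to a program object. */
struct toy_ksym {
	unsigned long start;
	unsigned long end;	/* exclusive */
	const char *name;
	bool prog;
};

/* Generic lookup: return whichever entry covers addr, of any type. */
static const struct toy_ksym *
toy_ksym_find(const struct toy_ksym *syms, size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= syms[i].start && addr < syms[i].end)
			return &syms[i];
	return NULL;
}

/* Program-only lookup: same search, but entries without the prog flag
 * (trampolines, dispatchers) are treated as "not found". */
static const struct toy_ksym *
toy_prog_ksym_find(const struct toy_ksym *syms, size_t n, unsigned long addr)
{
	const struct toy_ksym *ksym = toy_ksym_find(syms, n, addr);

	return (ksym && ksym->prog) ? ksym : NULL;
}
```

With the flag in place, a caller that needs a program-specific structure can use the filtered lookup, while address-to-name resolution can use the generic one over the same collection.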
Andrii Nakryiko	9886866836	bpf: Abstract away entire bpf_link clean up procedure Instead of requiring users to do three steps for cleaning up bpf_link, its anon_inode file, and unused fd, abstract that away into bpf_link_cleanup() helper. bpf_link_defunct() is removed, as it shouldn't be needed as an individual operation anymore. v1->v2: - keep bpf_link_cleanup() static for now (Daniel). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313002128.2028680-1-andriin@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:51 -07:00
Jiri Olsa	eda0c92902	bpf: Add bpf_ksym_find function Adding bpf_ksym_find function that is used by the bpf address lookup functions __bpf_address_lookup and is_bpf_text_address, while keeping bpf_prog_kallsyms_find to be used only for lookup of bpf_prog objects (will happen in following changes). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200312195610.346362-8-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:51 -07:00
Andrii Nakryiko	4cd729fa02	selftests/bpf: Make tcp_rtt test more robust to failures Switch to non-blocking accept and wait for server thread to exit before proceeding. I noticed that sometimes tcp_rtt server thread failure would "spill over" into other tests (that would run after tcp_rtt), probably just because server thread exits much later and tcp_rtt doesn't wait for it. v1->v2: - add usleep() while waiting on initial non-blocking accept() (Stanislav); Fixes: `8a03222f50` ("selftests/bpf: test_progs: fix client/server race in tcp_rtt") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20200311222749.458015-1-andriin@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-03-13 12:49:51 -07:00
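The tcp_rtt change above pairs a non-blocking accept() with a short usleep() between attempts. A rough sketch of that pattern is shown here; the function name `accept_nonblocking_retry`, the retry limit, and the use of ETIMEDOUT to signal exhaustion are assumptions made for illustration and are not taken from the selftest. The listening socket is assumed to have O_NONBLOCK already set.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: poll a non-blocking listening socket for a
 * connection, sleeping delay_us microseconds between attempts.
 * Returns the accepted fd, or -1 with errno set. */
static int accept_nonblocking_retry(int listen_fd, int max_tries,
				    useconds_t delay_us)
{
	for (int i = 0; i < max_tries; i++) {
		int fd = accept(listen_fd, NULL, NULL);

		if (fd >= 0)
			return fd;
		/* Anything other than "no connection yet" is a real error. */
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return -1;
		usleep(delay_us);
	}
	errno = ETIMEDOUT;	/* assumption: report exhaustion this way */
	return -1;
}
```

Bounding the number of attempts keeps a failed client from stalling the server thread indefinitely, which is the kind of spill-over into later tests the commit message describes.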


903399 Commits