Commit Graph

1138350 Commits

Author SHA1 Message Date
Saeed Mahameed
e07c4924a7 net/mlx5e: ethtool: get_link_ext_stats for PHY down events
Implement ethtool_op get_link_ext_stats for PHY down events

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
2022-11-12 02:20:20 -08:00
Oz Shlomo
05bb74c29d net/mlx5e: CT, optimize pre_ct table lookup
The pre_ct table realizes in hardware the act_ct cache logic, bypassing
the CT table if the ct state was already set by a previous ct lookup.
As such, the pre_ct table will always miss for chain 0 filters.

Optimize the pre_ct table lookup for rules installed on chain 0.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:20 -08:00
Tariq Toukan
3413615330 net/mlx5e: kTLS, Use a single async context object per a callback bulk
A single async context object is sufficient to wait for the completions
of many callbacks.  Switch to using one instance per a bulk of commands.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:20 -08:00
Tariq Toukan
4d78a2ebbd net/mlx5e: kTLS, Remove unnecessary per-callback completion
Waiting on a completion object for each callback before cleaning up their
async contexts is not necessary, as this is already implied in the
mlx5_cmd_cleanup_async_ctx() API.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:20 -08:00
Tariq Toukan
1f74399fd1 net/mlx5e: kTLS, Remove unused work field
Work field in struct mlx5e_async_ctx is not used. Remove it.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Roi Dayan
9897229061 net/mlx5e: TC, Remove redundant WARN_ON()
The case where the packet is not offloaded and needs to be restored
to slow path and couldn't find expected tunnel information should not
dump a call trace to the user. there is a debug call.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Guy Truzman
e74ae1faeb net/mlx5e: Add error flow when failing update_rx
Up until now, return value of update_rx was ignored. Therefore, flow
continues even if it fails. Add error flow in case of update_rx fails in
mlx5e_open_locked, mlx5i_open and mlx5i_pkey_open.

Signed-off-by: Guy Truzman <gtruzman@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Tariq Toukan
38438d39a9 net/mlx5e: Move params kernel log print to probe function
Params info print was meant to be printed on load.
With time, new calls to mlx5e_init_rq_type_params and
mlx5e_build_rq_params were added, mistakenly printing
the params once again.

Move the print to were it belongs, in mlx5e_probe.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Ofer Levi
2c925db0a7 net/mlx5e: Support enhanced CQE compression
CQE compression feature improves performance by reducing PCI bandwidth
bottleneck on CQEs write.
Enhanced CQE compression introduced in ConnectX-6 and it aims to reduce
CPU utilization of SW side packets decompression by eliminating the
need to rewrite ownership bit, which is likely to cost a cache-miss, is
replaced by validity byte handled solely by HW.
Another advantage of the enhanced feature is that session packets are
available to SW as soon as a single CQE slot is filled, instead of
waiting for session to close, this improves packet latency from NIC to
host.

Performance:
Following are tested scenarios and reults comparing basic and enahnced
CQE compression.

setup: IXIA 100GbE connected directly to port 0 and port 1 of
ConnectX-6 Dx 100GbE dual port.

Case #1 RX only, single flow goes to single queue:
IRQ rate reduced by ~ 30%, CPU utilization improved by 2%.

Case #2 IP forwarding from port 1 to port 0 single flow goes to
single queue:
Avg latency improved from 60us to 21us, frame loss improved from 0.5% to 0.0%.

Case #3 IP forwarding from port 1 to port 0 Max Throughput IXIA sends
100%, 8192 UDP flows, goes to 24 queues:
Enhanced is equal or slightly better than basic.

Testing the basic compression feature with this patch shows there is
no perfrormance degradation of the basic compression feature.

Signed-off-by: Ofer Levi <oferle@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Gal Pressman
9458108040 net/mlx5e: Use clamp operation instead of open coding it
Replace the min/max operations with a single clamp.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Anisse Astier
60551e95a8 net/mlx5e: remove unused list in arfs
This is never used, and probably something that was intended to be used
before per-protocol hash tables were chosen instead.

Signed-off-by: Anisse Astier <anisse@astier.eu>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:19 -08:00
Eli Cohen
dd3dd7263c net/mlx5: Expose vhca_id to debugfs
hca_id is an identifier of an mlx5_core instance within the hardware.
This identifier may be required for troubleshooting.

Expose it to debugfs.

Example:

$ cat /sys/kernel/debug/mlx5/mlx5_core.sf.2/vhca_id
0x12

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:18 -08:00
Moshe Shemesh
71b75f0e02 net/mlx5: Unregister traps on driver unload flow
Before this patch, devlink traps are registered only on full driver
probe and unregistered on driver removal. As devlink traps are not
usable once driver functionality is unloaded, it should be unrgeistered
also on flows that unload the driver and then registered when loaded
back, e.g. devlink reload flow.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:18 -08:00
Colin Ian King
d23b928bef net/mlx5: Fix spelling mistake "destoy" -> "destroy"
There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:18 -08:00
Roi Dayan
ea645f97bc net/mlx5: Bridge, Use debug instead of warn if entry doesn't exists
There is no need for the warn if entry already removed.
Use debug print like in the update flow.
Also update the messages so user can identify if the it's
from the update flow or remove flow.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12 02:20:18 -08:00
Eric Dumazet
b548b17a93 tcp: tcp_wfree() refactoring
Use try_cmpxchg() (instead of cmpxchg()) in a more readable way.

oval = smp_load_acquire(&sk->sk_tsq_flags);
do {
	...
} while (!try_cmpxchg(&sk->sk_tsq_flags, &oval, nval));

Reduce indentation level.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221110190239.3531280-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:38:03 -08:00
Eric Dumazet
fac30731b9 tcp: adopt try_cmpxchg() in tcp_release_cb()
try_cmpxchg() is slighly more efficient (at least on x86),
and smp_load_acquire(&sk->sk_tsq_flags) could avoid a KCSAN report.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221110174829.3403442-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:37:49 -08:00
Ido Schimmel
3e35f26d33 bridge: Add missing parentheses
No changes in generated code.

Reported-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20221110085422.521059-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:34:55 -08:00
Jakub Kicinski
d77be49309 Merge branch 'dt-bindings-net-qcom-ipa-relax-some-restrictions'
Alex Elder says:

====================
dt-bindings: net: qcom,ipa: relax some restrictions

The first patch in this series simply removes an unnecessary
requirement in the IPA binding.  Previously, if the modem was doing
GSI firmware loading, the firmware name property was required to
*not* be present.  There is no harm in having the firmware name be
specified, so this restriction isn't needed.

The second patch restates a requirement on the "memory-region"
property more accurately.

These binding changes have no impact on existing code or DTS files.
These aren't really bug fixes, so no need to back-port.
====================

Link: https://lore.kernel.org/r/20221110195619.1276302-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:23:23 -08:00
Alex Elder
7a6ca44c1e dt-bindings: net: qcom,ipa: restate a requirement
Either the AP or modem loads GSI firmware.  If the modem-init
property is present, the modem loads it.  Otherwise, the AP loads
it, and in that case the memory-region property must be defined.

Currently this requirement is expressed as one or the other of the
modem-init or the memory-region property being required.  But it's
harmless for the memory-region to be present if the modem is loading
firmware (it'll just be ignored).

Restate the requirement so that the memory-region property is
required only if modem-init is not present.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:23:20 -08:00
Alex Elder
9d26628a4c dt-bindings: net: qcom,ipa: remove an unnecessary restriction
Commit d8604b209e ("dt-bindings: net: qcom,ipa: add firmware-name
property") added a requirement for a "firmware-name" property that
is more restrictive than necessary.

If the AP loads GSI firmware, the name of the firmware file to use
may optionally be provided via a "firmware-name" property.  If the
*modem* loads GSI firmware, "firmware-name" doesn't need to be
supplied--but it's harmless to do so (it will simply be ignored).

Remove the unnecessary restriction, and allow "firware-name" to be
supplied even if it's not needed.

Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:23:19 -08:00
Angelo Dureghello
45f22f2fdc net: dsa: mv88e6xxx: enable set_policy
Enabling set_policy capability for mv88e6321.

Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221110091027.998073-1-angelo.dureghello@timesys.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:21:29 -08:00
Jakub Kicinski
4a432e068c Merge branch 'mptcp-miscellaneous-refactoring-and-small-fixes'
Mat Martineau says:

====================
mptcp: Miscellaneous refactoring and small fixes

Patches 1-3 do some refactoring to more consistently handle sock casts,
and to remove some duplicate code. No functional changes.

Patch 4 corrects a variable name in a self test, but does not change
functionality since the same value gets used due to bash's
scoping rules.

Patch 5 rewords a comment.
====================

Link: https://lore.kernel.org/r/20221110232322.125068-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:51 -08:00
Mat Martineau
4373bf4b72 mptcp: Fix grammar in a comment
We kept getting initial patches from new contributors to remove a
duplicate 'the' (since grammar checking scripts flag it), but submitters
never followed up after code review.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:47 -08:00
Geliang Tang
31b4e63eb2 selftests: mptcp: use max_time instead of time
'time' is the local variable of run_test() function, while 'max_time' is
the local variable of do_transfer() function. So in do_transfer(),
$max_time should be used, not $time.

Please note that here $time == $max_time so the behaviour is not changed
but the right variable is used.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:47 -08:00
Geliang Tang
80638684e8 mptcp: get sk from msk directly
Use '(struct sock *)msk' to get 'sk' from 'msk' in a more direct way
instead of using '&msk->sk.icsk_inet.sk'.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:47 -08:00
Geliang Tang
73a0052a61 mptcp: change 'first' as a parameter
The function mptcp_subflow_process_delegated() uses the input ssk first,
while __mptcp_check_push() invokes the packet scheduler first.

So this patch adds a new parameter named 'first' for the function
__mptcp_subflow_push_pending() to deal with these two cases separately.

With this change, the code that invokes the packet scheduler in the
function __mptcp_check_push() can be removed, and replaced by invoking
__mptcp_subflow_push_pending() directly.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:46 -08:00
Geliang Tang
00df24f191 mptcp: use msk instead of mptcp_sk
Use msk instead of mptcp_sk(sk) in the functions where the variable
"msk = mptcp_sk(sk)" has been defined.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 21:19:46 -08:00
Jakub Kicinski
f4c4ca70de bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEET63h6RnJhTJHuKTjXOwUVIRcSScFAmNu2EkACgkQXOwUVIRc
 SSebKhAA0ffmp5jJgEJpQYNABGLYIJcwKkBrGClDbMJLtwCjevGZJajT9fpbCLb1
 eK6EIhdfR0NTO+0KtUVkZ8WMa81OmLEJYdTNtJfNE23ENMpssiAWhlhDF8AoXeKv
 Bo3j719gn3Cw9PWXQoircH3wpj+5RMDnjxy4iYlA5yNrvzC7XVmssMF+WALvQnuK
 CGrfR57hxdgmphmasRqeCzEoriwihwPsG3k6eQN8rf7ZytLhs90tMVgT9L3Cd2u9
 DafA0Xl8mZdz2mHhThcJhQVq4MUymZj44ufuHDiOs1j6nhUlWToyQuvegPOqxKti
 uLGtZul0ls+3UP0Lbrv1oEGU/MWMxyDz4IBc0EVs0k3ItQbmSKs6r9WuPFGd96Sb
 GHk68qFVySeLGN0LfKe3rCHJ9ZoIOPYJg9qT8Rd5bOhetgGwSsxZTxUI39BxkFup
 CEqwIDnts1TMU37GDjj+vssKW91k4jEzMZVtRfsL3J36aJs28k/Ez4AqLXg6WU6u
 ADqFaejVPcXbN9rX90onIYxxiL28gZSeT+i8qOPELZtqTQmNWz+tC/ySVuWnD8Mn
 Nbs7PZ1IWiNZpsKS8pZnpd6j4mlBeJnwXkPKiFy+xHGuwRSRdYl6G9e5CtlRely/
 rwQ8DtaOpRYMrGhnmBEdAOCa9t/iqzrzHzjoigjJ7iAST4ToJ5s=
 =Y+/e
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Andrii Nakryiko says:

====================
bpf-next 2022-11-11

We've added 49 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 3592 insertions(+), 1371 deletions(-).

The main changes are:

1) Veristat tool improvements to support custom filtering, sorting, and replay
   of results, from Andrii Nakryiko.

2) BPF verifier precision tracking fixes and improvements,
   from Andrii Nakryiko.

3) Lots of new BPF documentation for various BPF maps, from Dave Tucker,
   Donald Hunter, Maryam Tahhan, Bagas Sanjaya.

4) BTF dedup improvements and libbpf's hashmap interface clean ups, from
   Eduard Zingerman.

5) Fix veth driver panic if XDP program is attached before veth_open, from
   John Fastabend.

6) BPF verifier clean ups and fixes in preparation for follow up features,
   from Kumar Kartikeya Dwivedi.

7) Add access to hwtstamp field from BPF sockops programs,
   from Martin KaFai Lau.

8) Various fixes for BPF selftests and samples, from Artem Savkov,
   Domenico Cerasuolo, Kang Minchul, Rong Tao, Yang Jihong.

9) Fix redirection to tunneling device logic, preventing skb->len == 0, from
   Stanislav Fomichev.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
  selftests/bpf: fix veristat's singular file-or-prog filter
  selftests/bpf: Test skops->skb_hwtstamp
  selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test
  bpf: Add hwtstamp field for the sockops prog
  selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch
  bpf, docs: Document BPF_MAP_TYPE_ARRAY
  docs/bpf: Document BPF map types QUEUE and STACK
  docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS
  docs/bpf: Document BPF_MAP_TYPE_CPUMAP map
  docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map
  libbpf: Hashmap.h update to fix build issues using LLVM14
  bpf: veth driver panics when xdp prog attached before veth_open
  selftests: Fix test group SKIPPED result
  selftests/bpf: Tests for btf_dedup_resolve_fwds
  libbpf: Resolve unambigous forward declarations
  libbpf: Hashmap interface update to allow both long and void* keys/values
  samples/bpf: Fix sockex3 error: Missing BPF prog type
  selftests/bpf: Fix u32 variable compared with less than zero
  Documentation: bpf: Escape underscore in BPF type name prefix
  selftests/bpf: Use consistent build-id type for liburandom_read.so
  ...
====================

Link: https://lore.kernel.org/r/20221111233733.1088228-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 18:33:04 -08:00
Jakub Kicinski
f1a7178b44 Merge branch 'net-vlan-claim-one-bit-from-sk_buff'
Eric Dumazet says:

====================
net: vlan: claim one bit from sk_buff

First patch claims skb->vlan_present.
This means some bpf changes, eg for sparc32 that I could not test.

Second patch removes one conditional test in gro_list_prepare().
====================

Link: https://lore.kernel.org/r/20221109095759.1874969-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 18:18:10 -08:00
Eric Dumazet
be3ed48683 net: gro: no longer use skb_vlan_tag_present()
We can remove a conditional test in gro_list_prepare()
by comparing vlan_all fields of the two skbs.

Notes:

While comparing the vlan_proto is not strictly needed,
because part of the following compare_ether_header() call,
using 32bit word is actually faster than using 16bit values.

napi_reuse_skb() makes sure to clear skb->vlan_all,
as it already calls __vlan_hwaccel_clear_tag()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 18:18:05 -08:00
Eric Dumazet
354259fa73 net: remove skb->vlan_present
skb->vlan_present seems redundant.

We can instead derive it from this boolean expression:

vlan_present = skb->vlan_proto != 0 || skb->vlan_tci != 0

Add a new union, to access both fields in a single load/store
when possible.

	union {
		u32	vlan_all;
		struct {
		__be16	vlan_proto;
		__u16	vlan_tci;
		};
	};

This allows following patch to remove a conditional test in GRO stack.

Note:
  We move remcsum_offload to keep TC_AT_INGRESS_MASK
  and SKB_MONO_DELIVERY_TIME_MASK unchanged.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11 18:18:05 -08:00
Andrii Nakryiko
eb6af4ceda selftests/bpf: fix veristat's singular file-or-prog filter
Fix the bug of filtering out filename too early, before we know the
program name, if using unified file-or-prog filter (i.e., -f
<any-glob>). Because we try to filter BPF object file early without
opening and parsing it, if any_glob (file-or-prog) filter is used we
have to accept any filename just to get program name, which might match
any_glob.

Fixes: 10b1b3f3e5 ("selftests/bpf: consolidate and improve file/prog filtering in veristat")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20221111181242.2101192-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-11-11 14:06:20 -08:00
Andrii Nakryiko
0f7dc423a5 Merge branch 'bpf: Add hwtstamp field for the sockops prog'
Martin KaFai Lau says:

====================

From: Martin KaFai Lau <martin.lau@kernel.org>

The bpf-tc prog has already been able to access the
skb_hwtstamps(skb)->hwtstamp.  This set extends the same hwtstamp
access to the sockops prog.

v2:
- Fixed the btf_dump selftest which depends on the
  last member of 'struct bpf_sock_ops'.
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-11-11 13:18:39 -08:00
Martin KaFai Lau
8cac7a59b2 selftests/bpf: Test skops->skb_hwtstamp
This patch tests reading the skops->skb_hwtstamp field.

A local test was also done such that the shinfo hwtstamp was temporary
set to a non zero value in the kernel bpf_skops_parse_hdr()
and the same value can be read by the skops test.

An adjustment is needed to the btf_dump selftest because
the changes in the 'struct bpf_sock_ops'.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221107230420.4192307-4-martin.lau@linux.dev
2022-11-11 13:18:36 -08:00
Martin KaFai Lau
52929912d7 selftests/bpf: Fix incorrect ASSERT in the tcp_hdr_options test
This patch fixes the incorrect ASSERT test in tcp_hdr_options during
the CHECK to ASSERT macro cleanup.

Fixes: 3082f8cd4b ("selftests/bpf: Convert tcp_hdr_options test to ASSERT_* macros")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221107230420.4192307-3-martin.lau@linux.dev
2022-11-11 13:18:23 -08:00
Martin KaFai Lau
9bb053490f bpf: Add hwtstamp field for the sockops prog
The bpf-tc prog has already been able to access the
skb_hwtstamps(skb)->hwtstamp.  This patch extends the same hwtstamp
access to the sockops prog.

In sockops, the skb is also available to the bpf prog during
the BPF_SOCK_OPS_PARSE_HDR_OPT_CB event.  There is a use case
that the hwtstamp will be useful to the sockops prog to better
measure the one-way-delay when the sender has put the tx
timestamp in the tcp header option.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221107230420.4192307-2-martin.lau@linux.dev
2022-11-11 13:18:14 -08:00
Yang Jihong
e4c9cf0ce8 selftests/bpf: Fix xdp_synproxy compilation failure in 32-bit arch
xdp_synproxy fails to be compiled in the 32-bit arch, log is as follows:

  xdp_synproxy.c: In function 'parse_options':
  xdp_synproxy.c:175:36: error: left shift count >= width of type [-Werror=shift-count-overflow]
    175 |                 *tcpipopts = (mss6 << 32) | (ttl << 24) | (wscale << 16) | mss4;
        |                                    ^~
  xdp_synproxy.c: In function 'syncookie_open_bpf_maps':
  xdp_synproxy.c:289:28: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
    289 |                 .map_ids = (__u64)map_ids,
        |                            ^

Fix it.

Fixes: fb5cd0ce70 ("selftests/bpf: Add selftests for raw syncookie helpers")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221111030836.37632-1-yangjihong1@huawei.com
2022-11-11 12:22:21 -08:00
Dave Tucker
1cfa97b30c bpf, docs: Document BPF_MAP_TYPE_ARRAY
Add documentation for the BPF_MAP_TYPE_ARRAY including kernel version
introduced, usage and examples. Also document BPF_MAP_TYPE_PERCPU_ARRAY
which is similar.

Co-developed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Maryam Tahhan <mtahhan@redhat.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/bpf/20221109174604.31673-2-donald.hunter@gmail.com
2022-11-11 11:37:59 -08:00
Donald Hunter
64488ca57a docs/bpf: Document BPF map types QUEUE and STACK
Add documentation for BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK,
including usage and examples.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221108093314.44851-1-donald.hunter@gmail.com
2022-11-11 11:34:39 -08:00
Donald Hunter
f720b84811 docs/bpf: Document BPF ARRAY_OF_MAPS and HASH_OF_MAPS
Add documentation for the ARRAY_OF_MAPS and HASH_OF_MAPS map types,
including usage and examples.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221108102215.47297-1-donald.hunter@gmail.com
2022-11-11 11:32:54 -08:00
Maryam Tahhan
161939abc8 docs/bpf: Document BPF_MAP_TYPE_CPUMAP map
Add documentation for BPF_MAP_TYPE_CPUMAP including
kernel version introduced, usage and examples.

Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221107165207.2682075-2-mtahhan@redhat.com
2022-11-11 11:32:54 -08:00
Donald Hunter
83177c0dca docs/bpf: Document BPF_MAP_TYPE_LPM_TRIE map
Add documentation for BPF_MAP_TYPE_LPM_TRIE including kernel
BPF helper usage, userspace usage and examples.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221101114542.24481-2-donald.hunter@gmail.com
2022-11-11 11:32:49 -08:00
Eduard Zingerman
42597aa372 libbpf: Hashmap.h update to fix build issues using LLVM14
A fix for the LLVM compilation error while building bpftool.
Replaces the expression:

  _Static_assert((p) == NULL || ...)

by expression:

  _Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) || ...)

When "p" is not a constant the former is not considered to be a
constant expression by LLVM 14.

The error was introduced in the following patch-set: [1].
The error was reported here: [2].

  [1] https://lore.kernel.org/bpf/20221109142611.879983-1-eddyz87@gmail.com/
  [2] https://lore.kernel.org/all/202211110355.BcGcbZxP-lkp@intel.com/

Reported-by: kernel test robot <lkp@intel.com>
Fixes: c302378bc1 ("libbpf: Hashmap interface update to allow both long and void* keys/values")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221110223240.1350810-1-eddyz87@gmail.com
2022-11-11 10:24:23 -08:00
David S. Miller
2cf7e87fc4 Merge branch 'ptp-adjfreq-copnvert'
Jacob Keller says:

====================
ptp: convert remaining users of .adjfreq

A handful of drivers remain which still use the .adjfreq interface instead
of the newer .adjfine interface. The new interface is preferred as it has a
more precise adjustment using scaled parts per million.

A handful of the remaining drivers are implemented with a common pattern
that can be refactored to use the adjust_by_scaled_ppm and
diff_by_scaled_ppm helper functions. These include the ptp_phc, ptp_ixp64x,
tg3, hclge, stmac, cpts and bnxt drivers. These are each refactored in a
separate change.

The remaining drivers, bnx2x, liquidio, cxgb4, fec, and qede implement
.adjfreq in a way different from the normal pattern expected by
adjust_by_scaled_ppm. Fixing these drivers to properly use .adjfine requires
specific knowledge of the hardware implementation. Instead I simply refactor
them to use .adjfine and convert scaled_ppm into ppb using the
scaled_ppm_to_ppb function.

Finally, the .adjfreq implementation interface is removed entirely. This
simplifies the interface and ensures that new drivers must implement the new
interface as they no longer have an alternative.

This still leaves parts per billion used as part of the max_adj interface,
and the core PTP stack still converts scaled_ppm to ppb to check this. I
plan to investigate fixing this in the future.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Jacob Keller
75ab70ec5c ptp: remove the .adjfreq interface function
Now that all drivers have been converted to .adjfine, we can remove the
.adjfreq from the interface structure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Jacob Keller
e2bd9c76c8 ptp: convert remaining drivers to adjfine interface
Convert all remaining drivers that still use .adjfreq to the newer .adjfine
implementation. These drivers are not straightforward, as they use
non-standard methods of programming their hardware. They are all converted
to use scaled_ppm_to_ppb to get the parts per billion value that their
logic depends on.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Cc: Derek Chickles <dchickles@marvell.com>
Cc: Satanand Burla <sburla@marvell.com>
Cc: Felix Manlunas <fmanlunas@marvell.com>
Cc: Raju Rangoju <rajur@chelsio.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Jacob Keller
a29c132f92 ptp: bnxt: convert .adjfreq to .adjfine
When the BNXT_FW_CAP_PTP_RTC flag is not set, the bnxt driver implements
.adjfreq on a cyclecounter in terms of the straightforward "base * ppb / 1
billion" calculation. When BNXT_FW_CAP_PTP_RTC is set, the driver forwards
the ppb value to firmware for configuration.

Convert the driver to the newer .adjfine interface, updating the
cyclecounter calculation to use adjust_by_scaled_ppm to perform the
calculation. Use scaled_ppm_to_ppb when forwarding the correction to
firmware.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Jacob Keller
a45392071c ptp: cpts: convert .adjfreq to .adjfine
The cpts implementation of .adjfreq is implemented in terms of a
straight forward "base * ppb / 1 billion" calculation.

Convert this to the newer .adjfine, using the recently added
adjust_by_scaled_ppm helper function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Jacob Keller
2d96099f50 ptp: stmac: convert .adjfreq to .adjfine
The stmac implementation of .adjfreq is implemented in terms of a
straight forward "base * ppb / 1 billion" calculation.

Convert this to the newer .adjfine, using the recently added
adjust_by_scaled_ppm helper function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00