2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-22 20:23:57 +08:00
Commit Graph

27996 Commits

Author SHA1 Message Date
Kui-Feng Lee
0da2596f34 libbpf: Mark bpf_object__find_program_by_title API deprecated.
Deprecate this API since v0.7.  All callers should move to
bpf_object__find_program_by_name if possible, otherwise use
bpf_object__for_each_program to find a program out from a given
section.

[0] Closes: https://github.com/libbpf/libbpf/issues/292

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211214035931.1148209-5-kuifeng@fb.com
2021-12-14 14:38:06 -08:00
Kui-Feng Lee
b098f33692 tools/perf: Stop using bpf_object__find_program_by_title API.
bpf_obj__find_program_by_title() in libbpf is going to be deprecated.
Call bpf_object_for_each_program to find a program in the section with
a given name instead.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211214035931.1148209-4-kuifeng@fb.com
2021-12-14 14:38:05 -08:00
Kui-Feng Lee
a393ea80a2 selftests/bpf: Stop using bpf_object__find_program_by_title API.
bpf_object__find_program_by_title is going to be deprecated.  Replace
all use cases in tools/testing/selftests/bpf with
bpf_object__find_program_by_name or bpf_object__for_each_program.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211214035931.1148209-2-kuifeng@fb.com
2021-12-14 14:38:05 -08:00
Andrii Nakryiko
c164b8b404 selftests/bpf: Remove explicit setrlimit(RLIMIT_MEMLOCK) in main selftests
As libbpf now is able to automatically take care of RLIMIT_MEMLOCK
increase (or skip it altogether on recent enough kernels), remove
explicit setrlimit() invocations in bench, test_maps, test_verifier, and
test_progs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214195904.1785155-3-andrii@kernel.org
2021-12-14 22:16:54 +01:00
Andrii Nakryiko
e542f2c4cd libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
one of the first extremely frustrating gotchas that all new BPF users go
through and in some cases have to learn it a very hard way.

Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
dropped the dependency on memlock and uses memcg-based memory accounting
instead. Unfortunately, detecting memcg-based BPF memory accounting is
far from trivial (as can be evidenced by this patch), so in practice
most BPF applications still do unconditional RLIMIT_MEMLOCK increase.

As we move towards libbpf 1.0, it would be good to allow users to forget
about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
automatically. This patch paves the way forward in this matter. Libbpf
will do feature detection of memcg-based accounting, and if detected,
will do nothing. But if the kernel is too old, just like BCC, libbpf
will automatically increase RLIMIT_MEMLOCK on behalf of user
application ([0]).

As this is technically a breaking change, during the transition period
applications have to opt into libbpf 1.0 mode by setting
LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
libbpf_set_strict_mode().

Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
called before the first bpf_prog_load(), bpf_btf_load(), or
bpf_object__load() call, otherwise it has no effect and will return
-EBUSY.

  [0] Closes: https://github.com/libbpf/libbpf/issues/369

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214195904.1785155-2-andrii@kernel.org
2021-12-14 22:16:45 +01:00
Andrii Nakryiko
9fc205b413 libbpf: Add sane strncpy alternative and use it internally
strncpy() has a notoriously error-prone semantics which makes GCC
complain about it a lot (and quite often completely completely falsely
at that). Instead of pleasing GCC all the time (-Wno-stringop-truncation
is unfortunately only supported by GCC, so it's a bit too messy to just
enable it in Makefile), add libbpf-internal libbpf_strlcpy() helper
which follows what FreeBSD's strlcpy() does and what most people would
expect from strncpy(): copies up to N-1 first bytes from source string
into destination string and ensures zero-termination afterwards.

Replace all the relevant uses of strncpy/strncat/memcpy in libbpf with
libbpf_strlcpy().

This also fixes the issue reported by Emmanuel Deloget in xsk.c where
memcpy() could access source string beyond its end.

Fixes: 2f6324a393 (libbpf: Support shared umems between queues and devices)
Reported-by: Emmanuel Deloget <emmanuel.deloget@eho.link>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211211004043.2374068-1-andrii@kernel.org
2021-12-14 15:47:56 +01:00
Andrii Nakryiko
4581e676d3 libbpf: Fix potential uninit memory read
In case of BPF_CORE_TYPE_ID_LOCAL we fill out target result explicitly.
But targ_res itself isn't initialized in such a case, and subsequent
call to bpf_core_patch_insn() might read uninitialized field (like
fail_memsz_adjust in this case). So ensure that targ_res is
zero-initialized for BPF_CORE_TYPE_ID_LOCAL case.

This was reported by Coverity static analyzer.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214010032.3843804-1-andrii@kernel.org
2021-12-14 15:37:58 +01:00
Kumar Kartikeya Dwivedi
f7abc4c8df selftests/bpf: Fix OOB write in test_verifier
The commit referenced below added fixup_map_timer support (to create a
BPF map containing timers), but failed to increase the size of the
map_fds array, leading to out of bounds write. Fix this by changing
MAX_NR_MAPS to 22.

Fixes: e60e6962c5 ("selftests/bpf: Add tests for restricted helpers")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214014800.78762-1-memxor@gmail.com
2021-12-14 15:28:15 +01:00
Danielle Ratson
20617717cd selftests: mlxsw: Add a test case for MAC profiles consolidation
Add a test case to cover the bug fixed by the previous patch.

Edit the MAC address of one netdev so that it matches the MAC address of
the second netdev. Verify that the two MAC profiles were consolidated by
testing that the MAC profiles occupancy decreased by one.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:56:10 +00:00
Grant Seltzer
f742fc68ac libbpf: Add doc comments for bpf_program__(un)pin()
This adds doc comments for the two bpf_program pinning functions,
bpf_program__pin() and bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211209232222.541733-1-grantseltzer@gmail.com
2021-12-13 13:55:53 -08:00
Jean-Philippe Brucker
acd143eefb selftests/bpf: Fix segfault in bpf_tcp_ca
Since commit ad9a7f9644 ("libbpf: Improve logging around BPF program
loading"), libbpf_debug_print() gets an additional prog_name parameter
but doesn't pass it to printf(). Since the format string now expects two
arguments, printf() may read uninitialized data and segfault. Pass
prog_name through.

Fixes: ad9a7f9644 ("libbpf: Improve logging around BPF program loading")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211213183058.346066-1-jean-philippe@linaro.org
2021-12-13 13:36:08 -08:00
Jiri Olsa
006004b715 selftests/bpf: Add tests for get_func_[arg|ret|arg_cnt] helpers
Adding tests for get_func_[arg|ret|arg_cnt] helpers.
Using these helpers in fentry/fexit/fmod_ret programs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-6-jolsa@kernel.org
2021-12-13 09:25:59 -08:00
Jiri Olsa
f92c1e1836 bpf: Add get_func_[arg|ret|arg_cnt] helpers
Adding following helpers for tracing programs:

Get n-th argument of the traced function:
  long bpf_get_func_arg(void *ctx, u32 n, u64 *value)

Get return value of the traced function:
  long bpf_get_func_ret(void *ctx, u64 *value)

Get arguments count of the traced function:
  long bpf_get_func_arg_cnt(void *ctx)

The trampoline now stores number of arguments on ctx-8
address, so it's easy to verify argument index and find
return value argument's position.

Moving function ip address on the trampoline stack behind
the number of functions arguments, so it's now stored on
ctx-16 address if it's needed.

All helpers above are inlined by verifier.

Also bit unrelated small change - using newly added function
bpf_prog_has_trampoline in check_get_func_ip.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-5-jolsa@kernel.org
2021-12-13 09:25:59 -08:00
Jiri Olsa
2b070c2bc8 selftests/bpf: Add test to access int ptr argument in tracing program
Adding verifier test for accessing int pointer argument in
tracing programs.

The test program loads 2nd argument of bpf_modify_return_test
function which is int pointer and checks that verifier allows
that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-3-jolsa@kernel.org
2021-12-13 09:24:22 -08:00
Hangbin Liu
71da1aec21 selftest/net/forwarding: declare NETIFS p9 p10
The recent GRE selftests defined NUM_NETIFS=10. If the users copy
forwarding.config.sample to forwarding.config directly, they will get
error "Command line is not complete" when run the GRE tests, because
create_netif_veth() failed with no interface name defined.

Fix it by extending the NETIFS with p9 and p10.

Fixes: 2800f24854 ("selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnel")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:54:15 +00:00
Willem de Bruijn
87f7282e76 selftests/net: expand gro with two machine test
The test is currently run on a single host with private addresses,
either over veth or by setting a nic in loopback mode with macvlan.

Support running between two real devices. Allow overriding addresses.

Also cut timeout to fail faster on error and explicitly log success.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:38:26 +00:00
Willem de Bruijn
a8d13611b4 selftests/net: toeplitz: fix udp option
Tiny fix. Option -u ("use udp") does not take an argument.

It can cause the next argument to silently be ignored.

Fixes: 5ebfb4cc30 ("selftests/net: toeplitz test")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:37:15 +00:00
Andrii Nakryiko
4b443bc178 libbpf: Don't validate TYPE_ID relo's original imm value
During linking, type IDs in the resulting linked BPF object file can
change, and so ldimm64 instructions corresponding to
BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL CO-RE relos can get
their imm value out of sync with actual CO-RE relocation information
that's updated by BPF linker properly during linking process.

We could teach BPF linker to adjust such instructions, but it feels
a bit too much for linker to re-implement good chunk of
bpf_core_patch_insns logic just for this. This is a redundant safety
check for TYPE_ID relocations, as the real validation is in matching
CO-RE specs, so if that works fine, it's very unlikely that there is
something wrong with the instruction itself.

So, instead, teach libbpf (and kernel) to ignore insn->imm for
BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL relos.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211213010706.100231-1-andrii@kernel.org
2021-12-12 17:26:24 -08:00
Andrii Nakryiko
f12468828c selftests/bpf: Remove last bpf_create_map_xattr from test_verifier
bpf_create_map_xattr() call was reintroduced after merging bpf tree into
bpf-next tree. Convert the last instance into bpf_map_create() call.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211212191341.2529573-1-andrii@kernel.org
2021-12-12 16:46:15 -08:00
David Ahern
28a2686c18 selftests: Fix IPv6 address bind tests
IPv6 allows binding a socket to a device then binding to an address
not on the device (__inet6_bind -> ipv6_chk_addr with strict flag
not set). Update the bind tests to reflect legacy behavior.

Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 16:24:27 +00:00
David Ahern
0f108ae445 selftests: Fix raw socket bind tests with VRF
Commit referenced below added negative socket bind tests for VRF. The
socket binds should fail since the address to bind to is in a VRF yet
the socket is not bound to the VRF or a device within it. Update the
expected return code to check for 1 (bind failure) so the test passes
when the bind fails as expected. Add a 'show_hint' comment to explain
why the bind is expected to fail.

Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 16:24:27 +00:00
David Ahern
7e0147592b selftests: Add duplicate config only for MD5 VRF tests
Commit referenced below added configuration in the default VRF that
duplicates a VRF to check MD5 passwords are properly used and fail
when expected. That config should not be added all the time as it
can cause tests to pass that should not (by matching on default VRF
setup when it should not). Move the duplicate setup to a function
that is only called for the MD5 tests and add a cleanup function
to remove it after the MD5 tests.

Fixes: 5cad8bce26 ("fcnal-test: Add TCP MD5 tests for VRF")
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 16:24:27 +00:00
Po-Hsu Lin
3748939bce selftests: icmp_redirect: pass xfail=0 to log_test()
If any sub-test in this icmp_redirect.sh is failing but not expected
to fail. The script will complain:
    ./icmp_redirect.sh: line 72: [: 1: unary operator expected

This is because when the sub-test is not expected to fail, we won't
pass any value for the xfail local variable in log_test() and thus
it's empty. Fix this by passing 0 as the 4th variable to log_test()
for non-xfail cases.

v2: added fixes tag

Fixes: 0a36a75c68 ("selftests: icmp_redirect: support expected failures")
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:56:41 +00:00
Hou Tao
bdbee82bec selftests/bpf: Add test cases for bpf_strncmp()
Four test cases are added:
(1) ensure the return value is expected
(2) ensure no const string size is rejected
(3) ensure writable target is rejected
(4) ensure no null-terminated target is rejected

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-5-houtao1@huawei.com
2021-12-11 17:40:23 -08:00
Hou Tao
9c42652f8b selftests/bpf: Add benchmark for bpf_strncmp() helper
Add benchmark to compare the performance between home-made strncmp()
in bpf program and bpf_strncmp() helper. In summary, the performance
win of bpf_strncmp() under x86-64 is greater than 18% when the compared
string length is greater than 64, and is 179% when the length is 4095.
Under arm64 the performance win is even bigger: 33% when the length
is greater than 64 and 600% when the length is 4095.

The following is the details:

no-helper-X: use home-made strncmp() to compare X-sized string
helper-Y: use bpf_strncmp() to compare Y-sized string

Under x86-64:

no-helper-1          3.504 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-1             3.347 ± 0.001M/s (drops 0.000 ± 0.000M/s)

no-helper-8          3.357 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper-8             3.307 ± 0.001M/s (drops 0.000 ± 0.000M/s)

no-helper-32         3.064 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-32            3.253 ± 0.001M/s (drops 0.000 ± 0.000M/s)

no-helper-64         2.563 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper-64            3.040 ± 0.001M/s (drops 0.000 ± 0.000M/s)

no-helper-128        1.975 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-128           2.641 ± 0.000M/s (drops 0.000 ± 0.000M/s)

no-helper-512        0.759 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-512           1.574 ± 0.000M/s (drops 0.000 ± 0.000M/s)

no-helper-2048       0.329 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-2048          0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)

no-helper-4095       0.117 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-4095          0.327 ± 0.000M/s (drops 0.000 ± 0.000M/s)

Under arm64:

no-helper-1          2.806 ± 0.004M/s (drops 0.000 ± 0.000M/s)
helper-1             2.819 ± 0.002M/s (drops 0.000 ± 0.000M/s)

no-helper-8          2.797 ± 0.109M/s (drops 0.000 ± 0.000M/s)
helper-8             2.786 ± 0.025M/s (drops 0.000 ± 0.000M/s)

no-helper-32         2.399 ± 0.011M/s (drops 0.000 ± 0.000M/s)
helper-32            2.703 ± 0.002M/s (drops 0.000 ± 0.000M/s)

no-helper-64         2.020 ± 0.015M/s (drops 0.000 ± 0.000M/s)
helper-64            2.702 ± 0.073M/s (drops 0.000 ± 0.000M/s)

no-helper-128        1.604 ± 0.001M/s (drops 0.000 ± 0.000M/s)
helper-128           2.516 ± 0.002M/s (drops 0.000 ± 0.000M/s)

no-helper-512        0.699 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-512           2.106 ± 0.003M/s (drops 0.000 ± 0.000M/s)

no-helper-2048       0.215 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-2048          1.223 ± 0.003M/s (drops 0.000 ± 0.000M/s)

no-helper-4095       0.112 ± 0.000M/s (drops 0.000 ± 0.000M/s)
helper-4095          0.796 ± 0.000M/s (drops 0.000 ± 0.000M/s)

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-4-houtao1@huawei.com
2021-12-11 17:40:23 -08:00
Hou Tao
9a93bf3fda selftests/bpf: Fix checkpatch error on empty function parameter
Fix checkpatch error: "ERROR: Bad function definition - void foo()
should probably be void foo(void)". Most replacements are done by
the following command:

  sed -i 's#\([a-z]\)()$#\1(void)#g' testing/selftests/bpf/benchs/*.c

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-3-houtao1@huawei.com
2021-12-11 17:40:23 -08:00
Hou Tao
c5fb199374 bpf: Add bpf_strncmp helper
The helper compares two strings: one string is a null-terminated
read-only string, and another string has const max storage size
but doesn't need to be null-terminated. It can be used to compare
file name in tracing or LSM program.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-2-houtao1@huawei.com
2021-12-11 17:40:23 -08:00
Alexei Starovoitov
259172bb65 libbpf: Fix gen_loader assumption on number of programs.
libbpf's obj->nr_programs includes static and global functions. That number
could be higher than the actual number of bpf programs going be loaded by
gen_loader. Passing larger nr_programs to bpf_gen__init() doesn't hurt. Those
exra stack slots will stay as zero. bpf_gen__finish() needs to check that
actual number of progs that gen_loader saw is less than or equal to
obj->nr_programs.

Fixes: ba05fd36b8 ("libbpf: Perform map fd cleanup for gen_loader in case of error")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-11 17:16:19 -08:00
Linus Torvalds
bbdff6d583 perf tools fixes for v5.16: 3rd batch
- Prevent out-of-bounds access to per sample registers.
 
 - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.
 
 - Intel PT fixes, half of those are one-liners:
 
   - Fix some PGE (packet generation enable/control flow packets) usage.
   - Fix sync state when a PSB (synchronization) packet is found.
   - Fix intel_pt_fup_event() assumptions about setting state type.
   - Fix state setting when receiving overflow (OVF) packet.
   - Fix next 'err' value, walking trace.
   - Fix missing 'instruction' events with 'q' option.
   - Fix error timestamp setting on the decoder error path.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYbTE4gAKCRCyPKLppCJ+
 JyuQAQC6hgdIIY+1aXlTer4OKep8AdnH5y6q1GLxcNX0UX/iUgD+OjxBFiFk50Ix
 jCWFjM1rsDHLJbPZmztRX4IxNHFJzAg=
 =ZkBH
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Prevent out-of-bounds access to per sample registers.

 - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.

 - Intel PT fixes, half of those are one-liners:
      - Fix some PGE (packet generation enable/control flow packets) usage.
      - Fix sync state when a PSB (synchronization) packet is found.
      - Fix intel_pt_fup_event() assumptions about setting state type.
      - Fix state setting when receiving overflow (OVF) packet.
      - Fix next 'err' value, walking trace.
      - Fix missing 'instruction' events with 'q' option.
      - Fix error timestamp setting on the decoder error path.

* tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf python: Fix NULL vs IS_ERR_OR_NULL() checking
  perf intel-pt: Fix error timestamp setting on the decoder error path
  perf intel-pt: Fix missing 'instruction' events with 'q' option
  perf intel-pt: Fix next 'err' value, walking trace
  perf intel-pt: Fix state setting when receiving overflow (OVF) packet
  perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
  perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
  perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
  perf tools: Prevent out-of-bounds access to registers
2021-12-11 13:28:02 -08:00
Linus Torvalds
df442a4ec7 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "21 patches.

  Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
  (mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
  mm: bdi: initialize bdi_min_ratio when bdi is unregistered
  hugetlbfs: fix issue of preallocation of gigantic pages can't work
  mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
  mm/slub: fix endianness bug for alloc/free_traces attributes
  selftests/damon: split test cases
  selftests/damon: test debugfs file reads/writes with huge count
  selftests/damon: test wrong DAMOS condition ranges input
  selftests/damon: test DAMON enabling with empty target_ids case
  selftests/damon: skip test if DAMON is running
  mm/damon/vaddr-test: remove unnecessary variables
  mm/damon/vaddr-test: split a test function having >1024 bytes frame size
  mm/damon/vaddr: remove an unnecessary warning message
  mm/damon/core: remove unnecessary error messages
  mm/damon/dbgfs: remove an unnecessary error message
  mm/damon/core: use better timer mechanisms selection threshold
  mm/damon/core: fix fake load reports due to uninterruptible sleeps
  timers: implement usleep_idle_range()
  filemap: remove PageHWPoison check from next_uptodate_page()
  mailmap: update email address for Guo Ren
  MAINTAINERS: update kdump maintainers
  ...
2021-12-11 08:46:52 -08:00
Miaoqian Lin
9937e8daab perf python: Fix NULL vs IS_ERR_OR_NULL() checking
The function trace_event__tp_format_id may return ERR_PTR(-ENOMEM).  Use
IS_ERR_OR_NULL to check tp_format.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20211211053856.19827-1-linmq006@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:23:54 -03:00
Adrian Hunter
6665b8e483 perf intel-pt: Fix error timestamp setting on the decoder error path
An error timestamp shows the last known timestamp for the queue, but this
is not updated on the error path. Fix by setting it.

Fixes: f4aa081949 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
a882cc9497 perf intel-pt: Fix missing 'instruction' events with 'q' option
FUP packets contain IP information, which makes them also an 'instruction'
event in 'hop' mode i.e. the itrace 'q' option.  That wasn't happening, so
restructure the logic so that FUP events are added along with appropriate
'instruction' and 'branch' events.

Fixes: 7c1b16ba0e ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
a32e6c5da5 perf intel-pt: Fix next 'err' value, walking trace
Code after label 'next:' in intel_pt_walk_trace() assumes 'err' is zero,
but it may not be, if arrived at via a 'goto'. Ensure it is zero.

Fixes: 7c1b16ba0e ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
c79ee2b216 perf intel-pt: Fix state setting when receiving overflow (OVF) packet
An overflow (OVF packet) is treated as an error because it represents a
loss of trace data, but there is no loss of synchronization, so the packet
state should be INTEL_PT_STATE_IN_SYNC not INTEL_PT_STATE_ERR_RESYNC.

To support that, some additional variables must be reset, and the FUP
packet that may follow OVF is treated as an FUP event.

Fixes: f4aa081949 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
4c761d805b perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
intel_pt_fup_event() assumes it can overwrite the state type if there has
been an FUP event, but this is an unnecessary and unexpected constraint on
callers.

Fix by touching only the state type flags that are affected by an FUP
event.

Fixes: a472e65fc4 ("perf intel-pt: Add decoder support for ptwrite and power event packets")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
ad106a26ae perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
When syncing, it may be that branch packet generation is not enabled at
that point, in which case there will not immediately be a control-flow
packet, so some packets before a control flow packet turns up, get
ignored.  However, the decoder is in sync as soon as a PSB is found, so
the state should be set accordingly.

Fixes: f4aa081949 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
Adrian Hunter
057ae59f5a perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
Packet generation enable (PGE) refers to whether control flow (COFI)
packets are being produced.

PGE may be false even when branch-tracing is enabled, due to being
out-of-context, or outside a filter address range.  Fix some missing PGE
usage.

Fixes: 7c1b16ba0e ("perf intel-pt: Add support for decoding FUP/TIP only")
Fixes: 839598176b ("perf intel-pt: Allow decoding with branch tracing disabled")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:47 -03:00
German Gomez
c897899752 perf tools: Prevent out-of-bounds access to registers
The size of the cache of register values is arch-dependant
(PERF_REGS_MAX). This has the potential of causing an out-of-bounds
access in the function "perf_reg_value" if the local architecture
contains less registers than the one the perf.data file was recorded on.

Since the maximum number of registers is bound by the bitmask "u64
cache_mask", and the size of the cache when running under x86 systems is
64 already, fix the size to 64 and add a range-check to the function
"perf_reg_value" to prevent out-of-bounds access.

Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211201123334.679131-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-11 08:19:46 -03:00
SeongJae Park
9ab3b0c8ef selftests/damon: split test cases
Currently, the single test program, debugfs.sh, contains all test cases
for DAMON.  When one of the cases fails, finding which case is failed
from the test log is not so easy, and all remaining tests will be
skipped.  To improve the situation, this commit splits the single
program into small test programs having their own names.

Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 17:10:56 -08:00
SeongJae Park
b4a002889d selftests/damon: test debugfs file reads/writes with huge count
DAMON debugfs interface users were able to trigger warning by writing
some files with arbitrarily large 'count' parameter.  The issue is fixed
with commit db7a347b26 ("mm/damon/dbgfs: use '__GFP_NOWARN' for
user-specified size buffer allocation").  This commit adds a test case
for the issue in DAMON selftests to avoid future regressions.

Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 17:10:56 -08:00
SeongJae Park
d85570c655 selftests/damon: test wrong DAMOS condition ranges input
A patch titled "mm/damon/schemes: add the validity judgment of
thresholds"[1] makes DAMON debugfs interface to validate DAMON scheme
inputs.  This commit adds a test case for the validation logic in DAMON
selftests.

[1] https://lore.kernel.org/linux-mm/d78360e52158d786fcbf20bc62c96785742e76d3.1637239568.git.xhao@linux.alibaba.com/

Link: https://lkml.kernel.org/r/20211201150440.1088-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 17:10:56 -08:00
SeongJae Park
c6980e30af selftests/damon: test DAMON enabling with empty target_ids case
DAMON debugfs didn't check empty targets when starting monitoring, and
the issue is fixed with commit b5ca3e83dd ("mm/damon/dbgfs: add
adaptive_targets list check before enable monitor_on").  To avoid future
regression, this commit adds a test case for that in DAMON selftests.

Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 17:10:56 -08:00
SeongJae Park
964e17016c selftests/damon: skip test if DAMON is running
Testing the DAMON debugfs files while DAMON is running makes no sense,
as any write to the debugfs files will fail.  This commit makes the test
be skipped in this case.

Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-10 17:10:56 -08:00
Jakub Kicinski
be3158290d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-12-10 v2

We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).

The main changes are:

1) Various samples fixes, from Alexander Lobakin.

2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.

3) A batch of new unified APIs for libbpf, logging improvements, version
   querying, etc. Also a batch of old deprecations for old APIs and various
   bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.

4) BPF documentation reorganization and improvements, from Christoph Hellwig
   and Dave Tucker.

5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
   libbpf, from Hengqi Chen.

6) Verifier log fixes, from Hou Tao.

7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.

8) Extend branch record capturing to all platforms that support it,
   from Kajol Jain.

9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.

10) bpftool doc-generating script improvements, from Quentin Monnet.

11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.

12) Deprecation warning fix for perf/bpf_counter, from Song Liu.

13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
    from Tiezhu Yang.

14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song.

15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
    Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
    and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
  libbpf: Add "bool skipped" to struct bpf_map
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
  bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
  selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
  selftests/bpf: Add test for libbpf's custom log_buf behavior
  selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
  libbpf: Deprecate bpf_object__load_xattr()
  libbpf: Add per-program log buffer setter and getter
  libbpf: Preserve kernel error code and remove kprobe prog type guessing
  libbpf: Improve logging around BPF program loading
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Allow passing preallocated log_buf when loading BTF into kernel
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  samples/bpf: Remove unneeded variable
  bpf: Remove redundant assignment to pointer t
  selftests/bpf: Fix a compilation warning
  perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
  samples: bpf: Fix 'unknown warning group' build warning on Clang
  samples: bpf: Fix xdp_sample_user.o linking with Clang
  ...
====================

Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 15:56:13 -08:00
Shuyi Cheng
229fae38d0 libbpf: Add "bool skipped" to struct bpf_map
Fix error: "failed to pin map: Bad file descriptor, path:
/sys/fs/bpf/_rodata_str1_1."

In the old kernel, the global data map will not be created, see [0]. So
we should skip the pinning of the global data map to avoid
bpf_object__pin_maps returning error. Therefore, when the map is not
created, we mark “map->skipped" as true and then check during relocation
and during pinning.

Fixes: 16e0c35c6f ("libbpf: Load global data maps lazily on legacy kernels")
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-10 15:35:30 -08:00
Vincent Minet
b69c5c07a6 libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
The btf__dedup_deprecated name was misspelled in the definition of the
compat symbol for btf__dedup. This leads it to be missing from the
shared library.

This fixes it.

Fixes: 957d350a8b ("libbpf: Turn btf_dedup_opts into OPTS-based struct")
Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211210063112.80047-1-vincent@vincent-minet.net
2021-12-10 15:35:29 -08:00
Andrii Nakryiko
b59e4ce8bc bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
Switch all the uses of to-be-deprecated bpf_object__load_xattr() into
a simple bpf_object__load() calls with optional log_level passed through
open_opts.kernel_log_level, if -d option is specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-13-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
3fc5fdcca1 selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
Switch from bpf_object__load_xattr() to bpf_object__load() and
kernel_log_level in bpf_object_open_opts.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-12-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
57e889269a selftests/bpf: Add test for libbpf's custom log_buf behavior
Add a selftest that validates that per-program and per-object log_buf
overrides work as expected. Also test same logic for low-level
bpf_prog_load() and bpf_btf_load() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-11-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
dc94121b5c selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
Switch all selftests uses of to-be-deprecated bpf_load_btf() with
equivalent bpf_btf_load() calls.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-10-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
e7b924ca71 libbpf: Deprecate bpf_object__load_xattr()
Deprecate non-extensible bpf_object__load_xattr() in v0.8 ([0]).

With log_level control through bpf_object_open_opts or
bpf_program__set_log_level(), we are finally at the point where
bpf_object__load_xattr() doesn't provide any functionality that can't be
accessed through other (better) ways. The other feature,
target_btf_path, is also controllable through bpf_object_open_opts.

  [0] Closes: https://github.com/libbpf/libbpf/issues/289

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-9-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
b3ce907950 libbpf: Add per-program log buffer setter and getter
Allow to set user-provided log buffer on a per-program basis ([0]). This
gives great deal of flexibility in terms of which programs are loaded
with logging enabled and where corresponding logs go.

Log buffer set with bpf_program__set_log_buf() overrides kernel_log_buf
and kernel_log_size settings set at bpf_object open time through
bpf_object_open_opts, if any.

Adjust bpf_object_load_prog_instance() logic to not perform own log buf
allocation and load retry if custom log buffer is provided by the user.

  [0] Closes: https://github.com/libbpf/libbpf/issues/418

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-8-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
2eda2145eb libbpf: Preserve kernel error code and remove kprobe prog type guessing
Instead of rewriting error code returned by the kernel of prog load with
libbpf-sepcific variants pass through the original error.

There is now also no need to have a backup generic -LIBBPF_ERRNO__LOAD
fallback error as bpf_prog_load() guarantees that errno will be properly
set no matter what.

Also drop a completely outdated and pretty useless BPF_PROG_TYPE_KPROBE
guess logic. It's not necessary and neither it's helpful in modern BPF
applications.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-7-andrii@kernel.org
2021-12-10 15:29:18 -08:00
Andrii Nakryiko
ad9a7f9644 libbpf: Improve logging around BPF program loading
Add missing "prog '%s': " prefixes in few places and use consistently
markers for beginning and end of program load logs. Here's an example of
log output:

libbpf: prog 'handler': BPF program load failed: Permission denied
libbpf: -- BEGIN PROG LOAD LOG ---
arg#0 reference type('UNKNOWN ') size cannot be determined: -22
; out1 = in1;
0: (18) r1 = 0xffffc9000cdcc000
2: (61) r1 = *(u32 *)(r1 +0)

...

81: (63) *(u32 *)(r4 +0) = r5
 R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0)
invalid access to map value, value_size=16 off=400 size=4
R4 min value is outside of the allowed memory range
processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
 -- END PROG LOAD LOG --
libbpf: failed to load program 'handler'
libbpf: failed to load object 'test_skeleton'

The entire verifier log, including BEGIN and END markers are now always
youtput during a single print callback call. This should make it much
easier to post-process or parse it, if necessary. It's not an explicit
API guarantee, but it can be reasonably expected to stay like that.

Also __bpf_object__open is renamed to bpf_object_open() as it's always
an adventure to find the exact function that implements bpf_object's
open phase, so drop the double underscored and use internal libbpf
naming convention.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org
2021-12-10 15:29:17 -08:00
Andrii Nakryiko
e0e3ea888c libbpf: Allow passing user log setting through bpf_object_open_opts
Allow users to provide their own custom log_buf, log_size, and log_level
at bpf_object level through bpf_object_open_opts. This log_buf will be
used during BTF loading. Subsequent patch will use same log_buf during
BPF program loading, unless overriden at per-bpf_program level.

When such custom log_buf is provided, libbpf won't be attempting
retrying loading of BTF to try to provide its own log buffer to capture
kernel's error log output. User is responsible to provide big enough
buffer, otherwise they run a risk of getting -ENOSPC error from the
bpf() syscall.

See also comments in bpf_object_open_opts regarding log_level and
log_buf interactions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-5-andrii@kernel.org
2021-12-10 15:29:17 -08:00
Andrii Nakryiko
1a190d1e8e libbpf: Allow passing preallocated log_buf when loading BTF into kernel
Add libbpf-internal btf_load_into_kernel() that allows to pass
preallocated log_buf and custom log_level to be passed into kernel
during BPF_BTF_LOAD call. When custom log_buf is provided,
btf_load_into_kernel() won't attempt an retry with automatically
allocated internal temporary buffer to capture BTF validation log.

It's important to note the relation between log_buf and log_level, which
slightly deviates from stricter kernel logic. From kernel's POV, if
log_buf is specified, log_level has to be > 0, and vice versa. While
kernel has good reasons to request such "sanity, this, in practice, is
a bit unconvenient and restrictive for libbpf's high-level bpf_object APIs.

So libbpf will allow to set non-NULL log_buf and log_level == 0. This is
fine and means to attempt to load BTF without logging requested, but if
it failes, retry the load with custom log_buf and log_level 1. Similar
logic will be implemented for program loading. In practice this means
that users can provide custom log buffer just in case error happens, but
not really request slower verbose logging all the time. This is also
consistent with libbpf behavior when custom log_buf is not set: libbpf
first tries to load everything with log_level=0, and only if error
happens allocates internal log buffer and retries with log_level=1.

Also, while at it, make BTF validation log more obvious and follow the log
pattern libbpf is using for dumping BPF verifier log during
BPF_PROG_LOAD. BTF loading resulting in an error will look like this:

libbpf: BTF loading error: -22
libbpf: -- BEGIN BTF LOAD LOG ---
magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 1040
str_off: 1040
str_len: 2063598257
btf_total_size: 1753
Total section length too long
-- END BTF LOAD LOG --
libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.

This makes it much easier to find relevant parts in libbpf log output.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-4-andrii@kernel.org
2021-12-10 15:29:17 -08:00
Andrii Nakryiko
0ed08d6725 libbpf: Add OPTS-based bpf_btf_load() API
Similar to previous bpf_prog_load() and bpf_map_create() APIs, add
bpf_btf_load() API which is taking optional OPTS struct. Schedule
bpf_load_btf() for deprecation in v0.8 ([0]).

This makes naming consistent with BPF_BTF_LOAD command, sets up an API
for extensibility in the future, moves options parameters (log-related
fields) into optional options, and also allows to pass log_level
directly.

It also removes log buffer auto-allocation logic from low-level API
(consistent with bpf_prog_load() behavior), but preserves a special
treatment of log_level == 0 with non-NULL log_buf, which matches
low-level bpf_prog_load() and high-level libbpf APIs for BTF and program
loading behaviors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/419

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-3-andrii@kernel.org
2021-12-10 15:29:17 -08:00
Andrii Nakryiko
4cf23a3c63 libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
To unify libbpf APIs behavior w.r.t. log_buf and log_level, fix
bpf_prog_load() to follow the same logic as bpf_btf_load() and
high-level bpf_object__load() API will follow in the subsequent patches:
  - if log_level is 0 and non-NULL log_buf is provided by a user, attempt
    load operation initially with no log_buf and log_level set;
  - if successful, we are done, return new FD;
  - on error, retry the load operation with log_level bumped to 1 and
    log_buf set; this way verbose logging will be requested only when we
    are sure that there is a failure, but will be fast in the
    common/expected success case.

Of course, user can still specify log_level > 0 from the very beginning
to force log collection.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-2-andrii@kernel.org
2021-12-10 15:29:17 -08:00
Linus Torvalds
9e65da135b ACPI fix for 5.16-rc5
Create the output directory for the ACPI tools during build if it has
 not been present before and prevent the compilation from failing in
 that case (Chen Yu).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGzrMQSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxEZAP/RUAlFyiYOgL8iyRJxFkRNL0VdzS7ATY
 2p+rT53mPRkcUN0e4ZdrUtNr50p1rC2Qx614Tm++2QLsQ08tz7IOb2PDX7GcKx34
 Gs6z/tYQzjgwy+bElfSa+lnbczGndkm1m4hWgIcs0MkIQzZwaGRfQ3mc5TE7FfyJ
 F1ALLsMM8fY2K/3akbQ5ysPLXA3AYEmnC/V1tw4b5sxH7Mi1vkfFYyTmk5Te5VyW
 Zp9u63DoEmaNYKZeWsfkHUJM9QjvUOAhdIGK+OVBjlF9/qvSiLOTL5jYuRt/74ab
 7vOSWWzEcihW5xa2mPWCz3YWQzZyB6qNM1b1DVN3sDZvv7zLhd1ntcZ5B/M/sT+I
 SuRB1AMj92t8bgvp+U21ehUIygQdJuJ0Ny16q89yDgDv0qY0m8ZDq28IrXEkMVc6
 g5iAwI7LlYovD3wnSGXL391sG6dSUQzWxuLp+h3bcJJZkSTiYYW9M2iQHVJgDI1h
 T4Scdm0aUPr6lO1o/J8Chg8G9dMtPX89+WK5+6JA0P2PwHYHd9s79KDzq14zrltZ
 vZWG7ishOqMoRJhxd5mI9Y63DZesO4kcuIzCADqO/K2WVCEgJ2sooNQAF0Ys440R
 HypRBRs8FZmohS/gznFDmtrS/lhXfacXQVS+CKhM9XbNI+E8AQq9F6AmqHotqZRF
 0MZ8HPOg3kKP
 =lLS7
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Create the output directory for the ACPI tools during build if it has
  not been present before and prevent the compilation from failing in
  that case (Chen Yu)"

* tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: tools: Fix compilation when output directory is not present
2021-12-10 14:43:16 -08:00
Linus Torvalds
b9172f9e88 More x86 fixes:
* Logic bugs in CR0 writes and Hyper-V hypercalls
 * Don't use Enlightened MSR Bitmap for L3
 * Remove user-triggerable WARN
 
 Plus a few selftest fixes and a regression test for the
 user-triggerable WARN.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGzZuIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMD4Qf+Im7Q0XNRZGzK6x4Blu3ZZJSuIIkW
 gEK5mDX/BWSYxoGhRN0IOkyf1Tx/A5qYwbZts87wZSvKONG2MuVzdeQ0mkDxgKc3
 cYwvvIPxCKaW/dQLD2OKVlqdAv6YbeJiFURWXgszMkrcgHvw39H5Tn6ldi0B5nvg
 Gvpj8LtbPDXGXab//Xrhia3+1F9TKOrcOG+obGC5G2mrGKTkG2+pi9L6LohvENhd
 sOSWdpmvQTU4PeqGlhW8RCwcN+vpa+NasHT2i2tHcWZA9Lqp4P81+4ZyQLIBsRB3
 psANG0c40EW+lfjFGbLL/6VR5kypxa6zy9RgX+QiRcj6C0+dJOgnwNY35A==
 =cREz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "More x86 fixes:

   - Logic bugs in CR0 writes and Hyper-V hypercalls

   - Don't use Enlightened MSR Bitmap for L3

   - Remove user-triggerable WARN

  Plus a few selftest fixes and a regression test for the
  user-triggerable WARN"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
  KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
  KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
  selftests: KVM: avoid failures due to reserved HyperTransport region
  KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
  KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
  KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
  KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
2021-12-10 14:09:12 -08:00
Ye Guojin
db10415448 selftests: mptcp: remove duplicate include in mptcp_inq.c
'sys/ioctl.h' included in 'mptcp_inq.c' is duplicated.

Reported-by: ZealRobot <zealci@zte.com.cn>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20211210071424.425773-1-ye.guojin@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 11:00:32 -08:00
Paul Chaignon
0be2516f86 selftests/bpf: Tests for state pruning with u32 spill/fill
This patch adds tests for the verifier's tracking for spilled, <8B
registers. The first two test cases ensure the verifier doesn't
incorrectly prune states in case of <8B spill/fills. The last one simply
checks that a filled u64 register is marked unknown if the register
spilled in the same slack slot was less than 8B.

The map value access at the end of the first program is only incorrect
for the path R6=32. If the precision bit for register R8 isn't
backtracked through the u32 spill/fill, the R6=32 path is pruned at
instruction 9 and the program is incorrectly accepted. The second
program is a variation of the same with u32 spills and a u64 fill.

The additional instructions to introduce the first pruning point may be
a bit fragile as they depend on the heuristics for pruning points in the
verifier (currently at least 8 instructions and 2 jumps). If the
heuristics are changed, the pruning point may move (e.g., to the
subsequent jump) or disappear, which would cause the test to always pass.

Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-10 09:13:21 -08:00
Sean Christopherson
10e7a099bf selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
if userspace modifies RCX during a userspace exit to handle string I/O.
This is a regression test for a user-triggerable WARN introduced by
commit 3b27de2718 ("KVM: x86: split the two parts of emulator_pio_in").

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211025201311.1881846-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10 09:38:02 -05:00
Paolo Bonzini
c8cc43c1ea selftests: KVM: avoid failures due to reserved HyperTransport region
AMD proceessors define an address range that is reserved by HyperTransport
and causes a failure if used for guest physical addresses.  Avoid
selftests failures by reserving those guest physical addresses; the
rules are:

- On parts with <40 bits, its fully hidden from software.

- Before Fam17h, it was always 12G just below 1T, even if there was more
RAM above this location.  In this case we just not use any RAM above 1T.

- On Fam17h and later, it is variable based on SME, and is either just
below 2^48 (no encryption) or 2^43 (encryption).

Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210805105423.412878-1-pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10 07:42:20 -05:00
Jie2x Zhou
92816e2629 selftests: net: Correct ping6 expected rc from 2 to 1
./fcnal-test.sh -v -t ipv6_ping
TEST: ping out, VRF bind - ns-B IPv6 LLA                                      [FAIL]
TEST: ping out, VRF bind - multicast IP                                       [FAIL]

ping6 is failing as it should.
COMMAND: ip netns exec ns-A /bin/ping6 -c1 -w1 fe80::7c4c:bcff:fe66:a63a%red
strace of ping6 shows it is failing with '1',
so change the expected rc from 2 to 1.

Fixes: c0644e71df ("selftests: Add ipv6 ping tests to fcnal-test")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com>
Link: https://lore.kernel.org/r/20211209020230.37270-1-jie2x.zhou@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 19:08:05 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Linus Torvalds
ded746bfc9 Networking fixes for 5.16-rc5, including fixes from bpf, can and netfilter.
Current release - regressions:
 
  - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap
 
 Current release - new code bugs:
 
  - bpf: fix bpf_check_mod_kfunc_call for built-in modules
 
  - ice: fixes for TC classifier offloads
 
  - vrf: don't run conntrack on vrf with !dflt qdisc
 
 Previous releases - regressions:
 
  - bpf: fix the off-by-two error in range markings
 
  - seg6: fix the iif in the IPv6 socket control block
 
  - devlink: fix netns refcount leak in devlink_nl_cmd_reload()
 
  - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
 
  - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
 
 Previous releases - always broken:
 
  - ethtool: do not perform operations on net devices being unregistered
 
  - udp: use datalen to cap max gso segments
 
  - ice: fix races in stats collection
 
  - fec: only clear interrupt of handling queue in fec_enet_rx_queue()
 
  - m_can: pci: fix incorrect reference clock rate
 
  - m_can: disable and ignore ELO interrupt
 
  - mvpp2: fix XDP rx queues registering
 
 Misc:
 
  - treewide: add missing includes masked by cgroup -> bpf.h dependency
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGyN1AACgkQMUZtbf5S
 IrtgMA/8D0qk3c75ts0hCzGXwdNdEBs+e7u1bJVPqdyU8x/ZLAp2c0EKB/7IWuxA
 CtsnbanPcmibqvQJDI1hZEBdafi43BmF5VuFSIxYC4EM/1vLoRprurXlIwL2YWki
 aWi//tyOIGBl6/ClzJ9Vm51HTJQwDmdv8GRnKAbsC1eOTM3pmmcg+6TLbDhycFEQ
 F9kkDCvyB9kWIH645QyJRH+Y5qQOvneCyQCPkkyjTgEADzV5i7YgtRol6J3QIbw3
 umPHSckCBTjMacYcCLsbhQaF2gTMgPV1basNLPMjCquJVrItE0ZaeX3MiD6nBFae
 yY5+Wt5KAZDzjERhneX8AINHoRPA/tNIahC1+ytTmsTA8Hj230FHE5hH1ajWiJ9+
 GSTBCBqjtZXce3r2Efxfzy0Kb9JwL3vDi7LS2eKQLv0zBLfYp2ry9Sp9qe4NhPkb
 OYrxws9kl5GOPvrFB5BWI9XBINciC9yC3PjIsz1noi0vD8/Hi9dPwXeAYh36fXU3
 rwRg9uAt6tvFCpwbuQ9T2rsMST0miur2cDYd8qkJtuJ7zFvc+suMXwBZyI29nF2D
 uyymIC2XStHJfAjUkFsGVUSXF5FhML9OQsqmisdQ8KdH26jMnDeMjIWJM7UWK+zY
 E/fqWT8UyS3mXWqaggid4ZbotipCwA0gxiDHuqqUGTM+dbKrzmk=
 =F6rS
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - bpf, sockmap: re-evaluate proto ops when psock is removed from
     sockmap

  Current release - new code bugs:

   - bpf: fix bpf_check_mod_kfunc_call for built-in modules

   - ice: fixes for TC classifier offloads

   - vrf: don't run conntrack on vrf with !dflt qdisc

  Previous releases - regressions:

   - bpf: fix the off-by-two error in range markings

   - seg6: fix the iif in the IPv6 socket control block

   - devlink: fix netns refcount leak in devlink_nl_cmd_reload()

   - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"

   - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports

  Previous releases - always broken:

   - ethtool: do not perform operations on net devices being
     unregistered

   - udp: use datalen to cap max gso segments

   - ice: fix races in stats collection

   - fec: only clear interrupt of handling queue in fec_enet_rx_queue()

   - m_can: pci: fix incorrect reference clock rate

   - m_can: disable and ignore ELO interrupt

   - mvpp2: fix XDP rx queues registering

  Misc:

   - treewide: add missing includes masked by cgroup -> bpf.h
     dependency"

* tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
  net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
  net: wwan: iosm: fixes unable to send AT command during mbim tx
  net: wwan: iosm: fixes net interface nonfunctional after fw flash
  net: wwan: iosm: fixes unnecessary doorbell send
  net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering
  MAINTAINERS: s390/net: remove myself as maintainer
  net/sched: fq_pie: prevent dismantle issue
  net: mana: Fix memory leak in mana_hwc_create_wq
  seg6: fix the iif in the IPv6 socket control block
  nfp: Fix memory leak in nfp_cpp_area_cache_add()
  nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
  nfc: fix segfault in nfc_genl_dump_devices_done
  udp: using datalen to cap max gso segments
  net: dsa: mv88e6xxx: error handling for serdes_power functions
  can: kvaser_usb: get CAN clock frequency from device
  can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
  net: mvpp2: fix XDP rx queues registering
  vmxnet3: fix minimum vectors alloc issue
  net, neigh: clear whole pneigh_entry at alloc time
  net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
  ...
2021-12-09 11:26:44 -08:00
Maciej S. Szmigiero
ee3a4f6662 KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
INTERCEPT_x are bit positions, but the code was using the raw value of
INTERCEPT_VINTR (4) instead of BIT(INTERCEPT_VINTR).
This resulted in masking of bit 2 - that is, SMI instead of VINTR.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <49b9571d25588870db5380b0be1a41df4bbaaf93.1638486479.git.maciej.szmigiero@oracle.com>
2021-12-09 12:44:39 -05:00
Sasha Levin
3a49cc22d3 tools/lib/lockdep: drop leftover liblockdep headers
Clean up remaining headers that are specific to liblockdep but lived in
the shared header directory.  These are all unused after the liblockdep
code was removed in commit 7246f4dcac ("tools/lib/lockdep: drop
liblockdep").

Note that there are still headers that were originally created for
liblockdep, that still have liblockdep references, but they are used by
other tools/ code at this point.

Signed-off-by: Sasha Levin <sashal@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-09 09:37:49 -08:00
Yonghong Song
b540358e6c selftests/bpf: Fix a compilation warning
The following warning is triggered when I used clang compiler
to build the selftest.

  /.../prog_tests/btf_dedup_split.c:368:6: warning: variable 'btf2' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
        if (!ASSERT_OK(err, "btf_dedup"))
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /.../prog_tests/btf_dedup_split.c:424:12: note: uninitialized use occurs here
        btf__free(btf2);
                  ^~~~
  /.../prog_tests/btf_dedup_split.c:368:2: note: remove the 'if' if its condition is always false
        if (!ASSERT_OK(err, "btf_dedup"))
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /.../prog_tests/btf_dedup_split.c:343:25: note: initialize the variable 'btf2' to silence this warning
        struct btf *btf1, *btf2;
                               ^
                                = NULL

Initialize local variable btf2 = NULL and the warning is gone.

Fixes: 9a49afe6f5 ("selftests/bpf: Add btf_dedup case with duplicated structs within CU")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211209050403.1770836-1-yhs@fb.com
2021-12-08 23:05:26 -08:00
Jakub Kicinski
fd31cb0c6a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal.

2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel.

3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit
   groups, from Stefano Brivio.

4) Break rule evaluation on malformed TCP options.

5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh,
   also from Florian

6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: conntrack: annotate data-races around ct->timeout
  selftests: netfilter: switch zone stress to socat
  netfilter: nft_exthdr: break evaluation if setting TCP option fails
  selftests: netfilter: Add correctness test for mac,net set type
  nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
  vrf: don't run conntrack on vrf with !dflt qdisc
  netfilter: nfnetlink_queue: silence bogus compiler warning
====================

Link: https://lore.kernel.org/r/20211209000847.102598-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 17:02:35 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Song Liu
8d0f9e73ef perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
bpf_create_map is deprecated. Replace it with bpf_map_create. Also add a
__weak bpf_map_create() so that when older version of libbpf is linked as
a shared library, it falls back to bpf_create_map().

Fixes: 992c422541 ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211207232340.2561471-1-song@kernel.org
2021-12-08 11:55:45 -08:00
Maxim Mikityanskiy
b560b21f71 bpf: Add selftests to cover packet access corner cases
This commit adds BPF verifier selftests that cover all corner cases by
packet boundary checks. Specifically, 8-byte packet reads are tested at
the beginning of data and at the beginning of data_meta, using all kinds
of boundary checks (all comparison operators: <, >, <=, >=; both
permutations of operands: data + length compared to end, end compared to
data + length). For each case there are three tests:

1. Length is just enough for an 8-byte read. Length is either 7 or 8,
   depending on the comparison.

2. Length is increased by 1 - should still pass the verifier. These
   cases are useful, because they failed before commit 2fa7d94afc
   ("bpf: Fix the off-by-two error in range markings").

3. Length is decreased by 1 - should be rejected by the verifier.

Some existing tests are just renamed to avoid duplication.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211207081521.41923-1-maximmi@nvidia.com
2021-12-08 15:42:26 +01:00
Chen Yu
11f8cb8903 ACPI: tools: Fix compilation when output directory is not present
Compiling the ACPI tools when output directory parameter is specified,
but the output directory is not present, triggers the following error:

make O=/data/test/tmp/ -C tools/power/acpi/

make: Entering directory '/data/src/kernel/linux/tools/power/acpi'
  DESCEND tools/acpidbg
make[1]: Entering directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg'
  MKDIR    include
  CP       include
  CC       tools/acpidbg/acpidbg.o
Assembler messages:
Fatal error: can't create /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o: No such file or directory
make[1]: *** [../../Makefile.rules:24: /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o] Error 1
make[1]: Leaving directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg'
make: *** [Makefile:18: acpidbg] Error 2
make: Leaving directory '/data/src/kernel/linux/tools/power/acpi'

which occurs because the output directory has not been created yet.

Fix this issue by creating the output directory before compiling.

Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ rjw: New subject, changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-08 15:10:27 +01:00
Vadim Fedorenko
13bf99ab21 selftests: tls: add missing AES256-GCM cipher
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:18:07 -08:00
Vadim Fedorenko
d76c51f976 selftests: tls: add missing AES-CCM cipher tests
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:18:07 -08:00
Florian Westphal
d46cea0e69 selftests: netfilter: switch zone stress to socat
centos9 has nmap-ncat which doesn't like the '-q' option, use socat.
While at it, mark test skipped if needed tools are missing.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08 01:29:15 +01:00
Stefano Brivio
0de53b0ffb selftests: netfilter: Add correctness test for mac,net set type
The existing net,mac test didn't cover the issue recently reported
by Nikita Yushchenko, where MAC addresses wouldn't match if given
as first field of a concatenated set with AVX2 and 8-bit groups,
because there's a different code path covering the lookup of six
8-bit groups (MAC addresses) if that's the first field.

Add a similar mac,net test, with MAC address and IPv4 address
swapped in the set specification.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08 01:05:55 +01:00
Nicolas Dichtel
d43b75fbc2 vrf: don't run conntrack on vrf with !dflt qdisc
After the below patch, the conntrack attached to skb is set to "notrack" in
the context of vrf device, for locally generated packets.
But this is true only when the default qdisc is set to the vrf device. When
changing the qdisc, notrack is not set anymore.
In fact, there is a shortcut in the vrf driver, when the default qdisc is
set, see commit dcdd43c41e ("net: vrf: performance improvements for
IPv4") for more details.

This patch ensures that the behavior is always the same, whatever the qdisc
is.

To demonstrate the difference, a new test is added in conntrack_vrf.sh.

Fixes: 8c9c296adf ("vrf: run conntrack only in context of lower/physdev for locally generated packets")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08 01:05:55 +01:00
Florian Westphal
edb596e80c selftests: mptcp: check IP_TOS in/out are the same
Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set
right before.

Also check that socklen_t == 0 and -1 input values match those
of normal tcp sockets.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 11:36:30 -08:00
Florian Westphal
b51880568f selftests: mptcp: add inq test case
client & server use a unix socket connection to communicate
outside of the mptcp connection.

This allows the consumer to know in advance how many bytes have been
(or will be) sent by the peer.
This allows stricter checks on the bytecounts reported by TCP_INQ cmsg.

Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 11:36:30 -08:00
Florian Westphal
5cbd886ce2 selftests: mptcp: add TCP_INQ support
Do checks on the returned inq counter.

Fail on:
1. Huge value (> 1 kbyte, test case files are 1 kb)
2. last hint larger than returned bytes when read was short
3. erronenous indication of EOF.

3) happens when a hint of X bytes reads X-1 on next call
   but next recvmsg returns more data (instead of EOF).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 11:36:29 -08:00
Song Liu
5a897531e0 perf bpf_skel: Do not use typedef to avoid error on old clang
When building bpf_skel with clang-10, typedef causes confusions like:

  libbpf: map 'prev_readings': unexpected def kind var.

Fix this by removing the typedef.

Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/BEF5C312-4331-4A60-AEC0-AD7617CB2BC4@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Song Liu
f7c4e85bcc perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1
to then make this the default he found problems in some distros where
the system linux/bpf.h file was being used and lacked this:

   util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS'
           __uint(map_flags, BPF_F_PRESERVE_ELEMS);

So use instead the vmlinux.h file generated by bpftool from BTF info.

This fixed these as well, getting the build back working on debian:11,
debian:experimental and ubuntu:21.10:

  In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33:
  :
  In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111:
  :
  /usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from  util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error:
  : : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found:

  /usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
  #include <asm/types.h>#include <asm/types.h>

           ^~~~~~~~~~~~~         ^~~~~~~~~~~~~

  #include <asm/types.h>
           ^~~~~~~~~~~~~
  1 error generated.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CF175681-8101-43D1-ABDB-449E644BE986@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4747395082 perf header: Fix memory leaks when processing feature headers
These leaks were found with leak sanitizer running "perf pipe recording
and injection test".

In pipe mode feat_fd may hold onto an events struct that needs freeing.

When string features are processed they may overwrite an already created
string, so free this before the overwrite.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211118201730.2302927-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
1aa79e5773 perf test: Reset shadow counts before loading
Otherwise load counting is an average. Without this change
duration_time in test_memory_bandwidth will alter its value if an
earlier test contains duration_time.

This patch fixes an issue that's introduced in the proposed patch:
https://lore.kernel.org/lkml/20211124015226.3317994-1-irogers@google.com/
in perf test "Parse and process metrics".

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211128085810.4027314-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Thomas Richter
6c481031c9 perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
Some platforms do not have CPU die support, for example s390.

Commit
Cc: Ian Rogers <irogers@google.com>
Fixes: fdf1e29b61 ("perf expr: Add metric literals for topology.")
fails on s390:

  # perf test -Fv 7
    ...
  # FAILED tests/expr.c:173 #num_dies >= #num_packages
    ---- end ----
    Simple expression parser: FAILED!
  #

Investigating this issue leads to these functions:

 build_cpu_topology()
   +--> has_die_topology(void)
        {
           struct utsname uts;

           if (uname(&uts) < 0)
                  return false;
           if (strncmp(uts.machine, "x86_64", 6))
                  return false;
           ....
        }

which always returns false on s390. The caller build_cpu_topology()
checks has_die_topology() return value. On false the the struct
cpu_topology::die_cpu_list is not contructed and has zero entries. This
leads to the failing comparison: #num_dies >= #num_packages.  s390 of
course has a positive number of packages.

Fix this and check if the function build_cpu_topology() did build up
a die_cpus_list. The number of entries in this list should be larger
than 0. If the number of list element is zero, the die_cpus_list has
not been created and the check in function test__expr():

    TEST_ASSERT_VAL("#num_dies >= #num_packages", \
		    num_dies >= num_packages)

always fails.

Output after:

  # perf test -Fv 7
   7: Simple expression parser                                        :
   --- start ---
   division by zero
   syntax error
   ---- end ----
   Simple expression parser: Ok
  #

Fixes: fdf1e29b61 ("perf expr: Add metric literals for topology.")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20211129112339.3003036-1-tmricht@linux.ibm.com
[ Added comment in the added 'if (num_dies)' line about architectures not having die topology ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
3d1d57debe tools build: Remove needless libpython-version feature check that breaks test-all fast path
Since 66dfdff03d ("perf tools: Add Python 3 support") we don't use
the tools/build/feature/test-libpython-version.c version in any Makefile
feature check:

  $ find tools/ -type f | xargs grep feature-libpython-version
  $

The only place where this was used was removed in 66dfdff03d:

  -        ifneq ($(feature-libpython-version), 1)
  -          $(warning Python 3 is not yet supported; please set)
  -          $(warning PYTHON and/or PYTHON_CONFIG appropriately.)
  -          $(warning If you also have Python 2 installed, then)
  -          $(warning try something like:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make PYTHON=python2)
  -          $(warning $(and ,))
  -          $(warning Otherwise, disable Python support entirely:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make NO_LIBPYTHON=1)
  -          $(warning $(and ,))
  -          $(error   $(and ,))
  -        else
  -          LDFLAGS += $(PYTHON_EMBED_LDFLAGS)
  -          EXTLIBS += $(PYTHON_EMBED_LIBADD)
  -          LANG_BINDINGS += $(obj-perf)python/perf.so
  -          $(call detected,CONFIG_LIBPYTHON)
  -        endif

And nowadays we either build with PYTHON=python3 or just install the
python3 devel packages and perf will build against it.

But the leftover feature-libpython-version check made the fast path
feature detection to break in all cases except when python2 devel files
were installed:

  $ rpm -qa | grep python.*devel
  python3-devel-3.9.7-1.fc34.x86_64
  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
  <SNIP>
  $ cat /tmp/build/perf/feature/test-all.make.output
  In file included from test-all.c:18:
  test-libpython-version.c:5:10: error: #error
      5 |         #error
        |          ^~~~~
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000)
  $

As python3 is the norm these days, fix this by just removing the unused
feature-libpython-version feature check, making the test-all fast path
to work with the common case.

With this:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin |& head
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
    HOSTLD  /tmp/build/perf/fixdep-in.o
    LINK    /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000)
  $ cat /tmp/build/perf/feature/test-all.make.output
  $

Reviewed-by: James Clark <james.clark@arm.com>
Fixes: 66dfdff03d ("perf tools: Add Python 3 support")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YaYmeeC6CS2b8OSz@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4ffbe87e2d perf tools: Fix SMT detection fast read path
sysfs__read_int() returns 0 on success, and so the fast read path was
always failing.

Fixes: bb629484d9 ("perf tools: Simplify checking if SMT is active.")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124001231.3277836-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
cba43fcf7a tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall
To pick the changes in this cset:

  a0eb2da92b ("futex: Wireup futex_waitv syscall")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449)
  mmap size 528384B
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
  221	32	futex				sys_futex_time32
  221	64	futex				sys_futex
  221	spu	futex				sys_futex
  422	32	futex_time64			sys_futex			sys_futex
  449	common  futex_waitv                     sys_futex_waitv
  $

This addresses this perf build warnings:

  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YZ%2F1OU9mJuyS2HMa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Adrian Hunter
c29d979260 perf inject: Fix itrace space allowed for new attributes
The space allowed for new attributes can be too small if existing header
information is large. That can happen, for example, if there are very
many CPUs, due to having an event ID per CPU per event being stored in the
header information.

Fix by adding the existing header.data_offset. Also increase the extra
space allowed to 8KiB and align to a 4KiB boundary for neatness.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20211125071457.2066863-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Arnaldo Carvalho de Melo
71a16df164 tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall
To pick the changes in these csets:

  6c122360cf ("s390: wire up sys_futex_waitv system call")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449)
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl
  238  common	futex			sys_futex			sys_futex_time32
  422	32	futex_time64		-				sys_futex
  449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
  $

This addresses this perf build warnings:

  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/lkml/YZ%2F2qRW%2FTScYTP1U@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Jiri Olsa
3f8d657716 Revert "perf bench: Fix two memory leaks detected with ASan"
This: This reverts commit 92723ea0f1.

  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRR Ok

yep, it seems the perf bench is broken so the counts won't correlated if
I revert this one:

  92723ea0f1 perf bench: Fix two memory leaks detected with ASan

it works for me again.. it seems to break -t option

   [root@dell-r440-01 perf]# ./perf bench sched messaging -g 1 -l 100 -t
   # Running 'sched/messaging' benchmark:
   RRRperf: CLIENT: ready write: Bad file descriptor
   Rperf: SENDER: write: Bad file descriptor

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/lkml/YZev7KClb%2Fud43Lc@krava/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Grant Seltzer
d5284dedcc libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:

- bpf_object__open_file()
- bpf_object__open_mem()
- bpf_program__attach_uprobe()
- bpf_program__attach_uprobe_opts()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211206203709.332530-1-grantseltzer@gmail.com
2021-12-06 14:22:37 -08:00
huangxuesen
222c98c797 libbpf: Fix trivial typo
Fix typo in comment from 'bpf_skeleton_map' to 'bpf_map_skeleton'
and from 'bpf_skeleton_prog' to 'bpf_prog_skeleton'.

Signed-off-by: huangxuesen <huangxuesen@kuaishou.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1638755236-3851199-1-git-send-email-hxseverything@gmail.com
2021-12-06 13:23:40 -08:00
Alexei Starovoitov
942df4dc5e bpftool: Add debug mode for gen_loader.
Make -d flag functional for gen_loader style program loading.

For example:
$ bpftool prog load -L -d test_d_path.o
... // will print:
libbpf: loading ./test_d_path.o
libbpf: elf: section(3) fentry/security_inode_getattr, size 280, link 0, flags 6, type=1
...
libbpf: prog 'prog_close': found data map 0 (test_d_p.bss, sec 7, off 0) for insn 30
libbpf: gen: load_btf: size 5376
libbpf: gen: map_create: test_d_p.bss idx 0 type 2 value_type_id 118
libbpf: map 'test_d_p.bss': created successfully, fd=0
libbpf: gen: map_update_elem: idx 0
libbpf: sec 'fentry/filp_close': found 1 CO-RE relocations
libbpf: record_relo_core: prog 1 insn[15] struct file 0:1 final insn_idx 15
libbpf: gen: prog_load: type 26 insns_cnt 35 progi_idx 0
libbpf: gen: find_attach_tgt security_inode_getattr 12
libbpf: gen: prog_load: type 26 insns_cnt 37 progi_idx 1
libbpf: gen: find_attach_tgt filp_close 12
libbpf: gen: finish 0
... // at this point libbpf finished generating loader program
   0: (bf) r6 = r1
   1: (bf) r1 = r10
   2: (07) r1 += -136
   3: (b7) r2 = 136
   4: (b7) r3 = 0
   5: (85) call bpf_probe_read_kernel#113
   6: (05) goto pc+104
... // this is the assembly dump of the loader program
 390: (63) *(u32 *)(r6 +44) = r0
 391: (18) r1 = map[idx:0]+5584
 393: (61) r0 = *(u32 *)(r1 +0)
 394: (63) *(u32 *)(r6 +24) = r0
 395: (b7) r0 = 0
 396: (95) exit
err 0  // the loader program was loaded and executed successfully
(null)
func#0 @0
...  // CO-RE in the kernel logs:
CO-RE relocating STRUCT file: found target candidate [500]
prog '': relo #0: kind <byte_off> (0), spec is [8] STRUCT file.f_path (0:1 @ offset 16)
prog '': relo #0: matching candidate #0 [500] STRUCT file.f_path (0:1 @ offset 16)
prog '': relo #0: patched insn #15 (ALU/ALU64) imm 16 -> 16
vmlinux_cand_cache:[11]file(500),
module_cand_cache:
... // verifier logs when it was checking test_d_path.o program:
R1 type=ctx expected=fp
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
; int BPF_PROG(prog_close, struct file *file, void *id)
0: (79) r6 = *(u64 *)(r1 +0)
func 'filp_close' arg0 has btf_id 500 type STRUCT 'file'
1: R1=ctx(id=0,off=0,imm=0) R6_w=ptr_file(id=0,off=0,imm=0) R10=fp0
; pid_t pid = bpf_get_current_pid_tgid() >> 32;
1: (85) call bpf_get_current_pid_tgid#14

... // if there are multiple programs being loaded by the loader program
... // only the last program in the elf file will be printed, since
... // the same verifier log_buf is used for all PROG_LOAD commands.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211204194623.27779-1-alexei.starovoitov@gmail.com
2021-12-05 11:40:32 -08:00
Linus Torvalds
f5d54a42d3 - Fix a couple of SWAPGS fencing issues in the x86 entry code
- Use the proper operand types in __{get,put}_user() to prevent
 truncation in SEV-ES string io
 
 - Make sure the kernel mappings are present in trampoline_pgd in order
 to prevent any potential accesses to unmapped memory after switching to
 it
 
 - Fix a trivial list corruption in objtool's pv_ops validation
 
 - Disable the clocksource watchdog for TSC on platforms which claim
 that the TSC is constant, doesn't stop in sleep states, CPU has TSC
 adjust and the number of sockets of the platform are max 2, to prevent
 erroneous markings of the TSC as unstable.
 
 - Make sure TSC adjust is always checked not only when going idle
 
 - Prevent a stack leak by initializing struct _fpx_sw_bytes properly in
 the FPU code
 
 - Fix INTEL_FAM6_RAPTORLAKE define naming to adhere to the convention
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmGsnWcACgkQEsHwGGHe
 VUoR+g/9FcOP0/XLH+LKHumYc9JHXsp5BvYGihyypMFgU0fXQBORtGqdls8jZtiJ
 kEdbW6iL0MRlyN8aHJCqr7dJqs7KJlpWes6hky7BY+U+7uewtjL5y3eSyZnA34T3
 M/Raecx27Hh0L0kHQlHXTUN73v1cgDvq3dCXWsP7Jqgjf5cEmCcV/tPEateqhq/f
 8TkLVIm55rJlbJ0LBO/cT0V3Q8QH9JPKm7nviOZuKCh9gcttFEPaM9MkaJyKUhoy
 O13jlenDoVkVWRXIQec1EZp2pTLxVAm/3Y0plge1yEVsejzh07gsQnMpoNeF+yFC
 8mDgSv8ZAED/vbsnB+BcgoRVj6ajG0+ilpLzcfPwUquiqS9pZrBSTddlvYDPjRMC
 MEXO548xiYgxmipu3r62H89nqmLEYQPk914rJu6bDnDeJ1gaabh8RXbNtQcRqqj3
 RETgVOp78iWn+aT33RLLD1EyodZb2IkMy087a3+TZICIXG81aDj9VgHvrVRnWnfY
 yKuldyrEKzi60yMQkV6h1oc8KSWQhspUSLtOVS9zrulCinYphFOfYFrzFmcKUWIq
 GdVb9eaP2oNBGfPybXP+TBLGZ4Zv9iXZmaEUk7ZGCjgv3ZmGMWJ18Hs/ufs2cwWK
 RNNUo3sz/y3OsreHowkWIk1eSxI16MabB7G/PDMnBSHlioVT390=
 =d6nS
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix a couple of SWAPGS fencing issues in the x86 entry code

 - Use the proper operand types in __{get,put}_user() to prevent
   truncation in SEV-ES string io

 - Make sure the kernel mappings are present in trampoline_pgd in order
   to prevent any potential accesses to unmapped memory after switching
   to it

 - Fix a trivial list corruption in objtool's pv_ops validation

 - Disable the clocksource watchdog for TSC on platforms which claim
   that the TSC is constant, doesn't stop in sleep states, CPU has TSC
   adjust and the number of sockets of the platform are max 2, to
   prevent erroneous markings of the TSC as unstable.

 - Make sure TSC adjust is always checked not only when going idle

 - Prevent a stack leak by initializing struct _fpx_sw_bytes properly in
   the FPU code

 - Fix INTEL_FAM6_RAPTORLAKE define naming to adhere to the convention

* tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
  x86/entry: Use the correct fence macro after swapgs in kernel CR3
  x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
  x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
  x86/64/mm: Map all kernel memory into trampoline_pgd
  objtool: Fix pv_ops noinstr validation
  x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
  x86/tsc: Add a timer to make sure TSC_adjust is always checked
  x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog()
  x86/cpu: Drop spurious underscore from RAPTOR_LAKE #define
2021-12-05 08:43:35 -08:00
Maxim Mikityanskiy
da54ab1495 bpf: Fix the test_task_vma selftest to support output shorter than 1 kB
The test for bpf_iter_task_vma assumes that the output will be longer
than 1 kB, as the comment above the loop says. Due to this assumption,
the loop becomes infinite if the output turns to be shorter than 1 kB.
The return value of read_fd_into_buffer is 0 when the end of file was
reached, and len isn't being increased any more.

This commit adds a break on EOF to handle short output correctly. For
the reference, this is the contents that I get when running test_progs
under vmtest.sh, and it's shorter than 1 kB:

00400000-00401000 r--p 00000000 fe:00 25867     /root/bpf/test_progs
00401000-00674000 r-xp 00001000 fe:00 25867     /root/bpf/test_progs
00674000-0095f000 r--p 00274000 fe:00 25867     /root/bpf/test_progs
0095f000-00983000 r--p 0055e000 fe:00 25867     /root/bpf/test_progs
00983000-00a8a000 rw-p 00582000 fe:00 25867     /root/bpf/test_progs
00a8a000-0484e000 rw-p 00000000 00:00 0
7f6c64000000-7f6c64021000 rw-p 00000000 00:00 0
7f6c64021000-7f6c68000000 ---p 00000000 00:00 0
7f6c6ac8f000-7f6c6ac90000 r--s 00000000 00:0d 8032
anon_inode:bpf-map
7f6c6ac90000-7f6c6ac91000 ---p 00000000 00:00 0
7f6c6ac91000-7f6c6b491000 rw-p 00000000 00:00 0
7f6c6b491000-7f6c6b492000 r--s 00000000 00:0d 8032
anon_inode:bpf-map
7f6c6b492000-7f6c6b493000 rw-s 00000000 00:0d 8032
anon_inode:bpf-map
7ffc1e23d000-7ffc1e25e000 rw-p 00000000 00:00 0
7ffc1e3b8000-7ffc1e3bc000 r--p 00000000 00:00 0
7ffc1e3bc000-7ffc1e3bd000 r-xp 00000000 00:00 0
7fffffffe000-7ffffffff000 --xp 00000000 00:00 0

Fixes: e8168840e1 ("selftests/bpf: Add test for bpf_iter_task_vma")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211130181811.594220-1-maximmi@nvidia.com
2021-12-03 23:49:14 +01:00
Alexei Starovoitov
78c1f8d063 libbpf: Reduce bpf_core_apply_relo_insn() stack usage.
Reduce bpf_core_apply_relo_insn() stack usage and bump
BPF_CORE_SPEC_MAX_LEN limit back to 64.

Fixes: 29db4bea1d ("bpf: Prepare relo_core.c for kernel duty.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.com
2021-12-03 13:21:59 -08:00
Maxim Mikityanskiy
2fa7d94afc bpf: Fix the off-by-two error in range markings
The first commit cited below attempts to fix the off-by-one error that
appeared in some comparisons with an open range. Due to this error,
arithmetically equivalent pieces of code could get different verdicts
from the verifier, for example (pseudocode):

  // 1. Passes the verifier:
  if (data + 8 > data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

  // 2. Rejected by the verifier (should still pass):
  if (data + 7 >= data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

The attempted fix, however, shifts the range by one in a wrong
direction, so the bug not only remains, but also such piece of code
starts failing in the verifier:

  // 3. Rejected by the verifier, but the check is stricter than in #1.
  if (data + 8 >= data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

The change performed by that fix converted an off-by-one bug into
off-by-two. The second commit cited below added the BPF selftests
written to ensure than code chunks like #3 are rejected, however,
they should be accepted.

This commit fixes the off-by-two error by adjusting new_range in the
right direction and fixes the tests by changing the range into the
one that should actually fail.

Fixes: fb2a311a31 ("bpf: fix off by one for range markings with L{T, E} patterns")
Fixes: b37242c773 ("bpf: add test cases to bpf selftests to cover all access tests")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211130181607.593149-1-maximmi@nvidia.com
2021-12-03 21:44:42 +01:00
Andrii Nakryiko
0bf40542c0 perf: Mute libbpf API deprecations temporarily
Libbpf development version was bumped to 0.7 in c93faaaf2f
("libbpf: Deprecate bpf_prog_load_xattr() API"), activating a bunch of
previously scheduled deprecations. Most APIs are pretty straightforward
to replace with newer APIs, but perf has a complicated mixed setup with
libbpf used both as static and shared configurations, which makes it
non-trivial to migrate the APIs.

Further, bpf_program__set_prep() needs more involved refactoring, which
will require help from Arnaldo and/or Jiri.

So for now, mute deprecation warnings and work on migrating perf off of
deprecated APIs separately with the input from owners of the perf tool.

Fixes: c93faaaf2f ("libbpf: Deprecate bpf_prog_load_xattr() API")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211203004640.2455717-1-andrii@kernel.org
2021-12-03 11:54:51 -08:00
Li Zhijian
0f8a3b48f9 selftests: net/fcnal-test.sh: add exit code
Previously, the selftest framework always treats it as *ok* even though
some of them are failed actually. That's because the script always
returns 0.

It supports PASS/FAIL/SKIP exit code now.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 14:17:34 +00:00
Li Zhijian
db925bca33 selftests/tc-testing: Fix cannot create /sys/bus/netdevsim/new_device: Directory nonexistent
Install netdevsim to provide /sys/bus/netdevsim/new_device interface.

It helps to fix:
 # ok 97 9a7d - Change ETS strict band without quantum # skipped - skipped - previous setup failed 11 ce7d
 #
 #
 # -----> prepare stage *** Could not execute: "echo "1 1 4" > /sys/bus/netdevsim/new_device"
 #
 # -----> prepare stage *** Error message: "/bin/sh: 1: cannot create /sys/bus/netdevsim/new_device: Directory nonexistent
 # "
 #
 # -----> prepare stage *** Aborting test run.
 #
 #
 # <_io.BufferedReader name=5> *** stdout ***
 #

Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:46:41 +00:00
Li Zhijian
a8c9505c53 selftests/tc-testing: add missing config
qdiscs/fq_pie requires CONFIG_NET_SCH_FQ_PIE, otherwise tc will fail
to create a fq_pie qdisc.

It fixes following issue:
 # not ok 57 83be - Create FQ-PIE with invalid number of flows
 #       Command exited with 2, expected 0
 # Error: Specified qdisc not found.

Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:46:41 +00:00
Li Zhijian
96f3896780 selftests/tc-testing: add exit code
Mark the summary result as FAIL to prevent from confusing the selftest
framework if some of them are failed.

Previously, the selftest framework always treats it as *ok* even though
some of them are failed actually. That's because the script tdc.sh always
return 0.

 # All test results:
 #
 # 1..97
 # ok 1 83be - Create FQ-PIE with invalid number of flows
 # ok 2 8b6e - Create RED with no flags
[...snip]
 # ok 6 5f15 - Create RED with flags ECN, harddrop
 # ok 7 53e8 - Create RED with flags ECN, nodrop
 # ok 8 d091 - Fail to create RED with only nodrop flag
 # ok 9 af8e - Create RED with flags ECN, nodrop, harddrop
 # not ok 10 ce7d - Add mq Qdisc to multi-queue device (4 queues)
 #       Could not match regex pattern. Verify command output:
 # qdisc mq 1: root
 # qdisc fq_codel 0: parent 1:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
 # qdisc fq_codel 0: parent 1:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
[...snip]
 # ok 96 6979 - Change quantum of a strict ETS band
 # ok 97 9a7d - Change ETS strict band without quantum
 #
 #
 #
 #
 ok 1 selftests: tc-testing: tdc.sh <<< summary result

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-03 11:46:41 +00:00
Peter Zijlstra
988f01683c objtool: Fix pv_ops noinstr validation
Boris reported that in one of his randconfig builds, objtool got
infinitely stuck. Turns out there's trivial list corruption in the
pv_ops tracking when a function is both in a static table and in a code
assignment.

Avoid re-adding function to the pv_ops[] lists when they're already on
it.

Fixes: db2b0c5d7b ("objtool: Support pv_opsindirect calls for noinstr")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net
2021-12-03 09:11:42 +01:00
Peilin Ye
f6071e5e39 selftests/fib_tests: Rework fib_rp_filter_test()
Currently rp_filter tests in fib_tests.sh:fib_rp_filter_test() are
failing.  ping sockets are bound to dummy1 using the "-I" option
(SO_BINDTODEVICE), but socket lookup is failing when receiving ping
replies, since the routing table thinks they belong to dummy0.

For example, suppose ping is using a SOCK_RAW socket for ICMP messages.
When receiving ping replies, in __raw_v4_lookup(), sk->sk_bound_dev_if
is 3 (dummy1), but dif (skb_rtable(skb)->rt_iif) says 2 (dummy0), so the
raw_sk_bound_dev_eq() check fails.  Similar things happen in
ping_lookup() for SOCK_DGRAM sockets.

These tests used to pass due to a bug [1] in iputils, where "ping -I"
actually did not bind ICMP message sockets to device.  The bug has been
fixed by iputils commit f455fee41c07 ("ping: also bind the ICMP socket
to the specific device") in 2016, which is why our rp_filter tests
started to fail.  See [2] .

Fixing the tests while keeping everything in one netns turns out to be
nontrivial.  Rework the tests and build the following topology:

 ┌─────────────────────────────┐    ┌─────────────────────────────┐
 │  network namespace 1 (ns1)  │    │  network namespace 2 (ns2)  │
 │                             │    │                             │
 │  ┌────┐     ┌─────┐         │    │  ┌─────┐            ┌────┐  │
 │  │ lo │<───>│veth1│<────────┼────┼─>│veth2│<──────────>│ lo │  │
 │  └────┘     ├─────┴──────┐  │    │  ├─────┴──────┐     └────┘  │
 │             │192.0.2.1/24│  │    │  │192.0.2.1/24│             │
 │             └────────────┘  │    │  └────────────┘             │
 └─────────────────────────────┘    └─────────────────────────────┘

Consider sending an ICMP_ECHO packet A in ns2.  Both source and
destination IP addresses are 192.0.2.1, and we use strict mode rp_filter
in both ns1 and ns2:

  1. A is routed to lo since its destination IP address is one of ns2's
     local addresses (veth2);
  2. A is redirected from lo's egress to veth2's egress using mirred;
  3. A arrives at veth1's ingress in ns1;
  4. A is redirected from veth1's ingress to lo's ingress, again, using
     mirred;
  5. In __fib_validate_source(), fib_info_nh_uses_dev() returns false,
     since A was received on lo, but reverse path lookup says veth1;
  6. However A is not dropped since we have relaxed this check for lo in
     commit 66f8209547 ("fib: relax source validation check for loopback
     packets");

Making sure A is not dropped here in this corner case is the whole point
of having this test.

  7. As A reaches the ICMP layer, an ICMP_ECHOREPLY packet, B, is
     generated;
  8. Similarly, B is redirected from lo's egress to veth1's egress (in
     ns1), then redirected once again from veth2's ingress to lo's
     ingress (in ns2), using mirred.

Also test "ping 127.0.0.1" from ns2.  It does not trigger the relaxed
check in __fib_validate_source(), but just to make sure the topology
works with loopback addresses.

Tested with ping from iputils 20210722-41-gf9fb573:

$ ./fib_tests.sh -t rp_filter

IPv4 rp_filter tests
    TEST: rp_filter passes local packets		[ OK ]
    TEST: rp_filter passes loopback packets		[ OK ]

[1] https://github.com/iputils/iputils/issues/55
[2] f455fee41c

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: adb701d6cf ("selftests: add a test case for rp_filter")
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211201004720.6357-1-yepeilin.cs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02 17:59:34 -08:00
Andrii Nakryiko
c93faaaf2f libbpf: Deprecate bpf_prog_load_xattr() API
bpf_prog_load_xattr() is high-level API that's named as a low-level
BPF_PROG_LOAD wrapper APIs, but it actually operates on struct
bpf_object. It's badly and confusingly misnamed as it will load all the
progs insige bpf_object, returning prog_fd of the very first BPF
program. It also has a bunch of ad-hoc things like log_level override,
map_ifindex auto-setting, etc. All this can be expressed more explicitly
and cleanly through existing libbpf APIs. This patch marks
bpf_prog_load_xattr() for deprecation in libbpf v0.8 ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/308

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-10-andrii@kernel.org
2021-12-02 15:23:41 -08:00
Andrii Nakryiko
186d1a8600 selftests/bpf: Remove all the uses of deprecated bpf_prog_load_xattr()
Migrate all the selftests that were still using bpf_prog_load_xattr().
Few are converted to skeleton, others will use bpf_object__open_file()
API.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-7-andrii@kernel.org
2021-12-02 15:23:40 -08:00
Andrii Nakryiko
00872de6e1 selftests/bpf: Mute xdpxceiver.c's deprecation warnings
xdpxceiver.c is using AF_XDP APIs that are deprecated starting from
libbpf 0.7. Until we migrate the test to libxdp or solve this issue in
some other way, mute deprecation warnings within xdpxceiver.c.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-6-andrii@kernel.org
2021-12-02 15:23:40 -08:00
Andrii Nakryiko
045b233a29 selftests/bpf: Remove recently reintroduced legacy btf__dedup() use
We've added one extra patch that added back the use of legacy
btf__dedup() variant. Clean that up.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-5-andrii@kernel.org
2021-12-02 15:23:40 -08:00
Andrii Nakryiko
a15d408b83 bpftool: Migrate off of deprecated bpf_create_map_xattr() API
Switch to bpf_map_create() API instead.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-4-andrii@kernel.org
2021-12-02 15:23:40 -08:00
Andrii Nakryiko
dbdd2c7f8c libbpf: Add API to get/set log_level at per-program level
Add bpf_program__set_log_level() and bpf_program__log_level() to fetch
and adjust log_level sent during BPF_PROG_LOAD command. This allows to
selectively request more or less verbose output in BPF verifier log.

Also bump libbpf version to 0.7 and make these APIs the first in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-3-andrii@kernel.org
2021-12-02 15:23:40 -08:00
Andrii Nakryiko
74d9807023 libbpf: Use __u32 fields in bpf_map_create_opts
Corresponding Linux UAPI struct uses __u32, not int, so keep it
consistent.

Fixes: 992c422541 ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-2-andrii@kernel.org
2021-12-02 15:23:08 -08:00
Kumar Kartikeya Dwivedi
3345193f6f tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
resolve_btfids prints a warning when it finds an unresolved symbol,
(id == 0) in id_patch. This can be the case for BTF sets that are empty
(due to disabled config options), hence printing warnings for certain
builds, most recently seen in [0].

The reason behind this is because id->cnt aliases id->id in btf_id
struct, leading to empty set showing up as ID 0 when we get to id_patch,
which triggers the warning. Since sets are an exception here, accomodate
by reusing hole in btf_id for bool is_set member, setting it to true for
BTF set when setting id->cnt, and use that to skip extraneous warning.

  [0]: https://lore.kernel.org/all/1b99ae14-abb4-d18f-cc6a-d7e523b25542@gmail.com

Before:

; ./tools/bpf/resolve_btfids/resolve_btfids -v -b vmlinux net/ipv4/tcp_cubic.ko
adding symbol tcp_cubic_kfunc_ids
WARN: resolve_btfids: unresolved symbol tcp_cubic_kfunc_ids
patching addr     0: ID       0 [tcp_cubic_kfunc_ids]
sorting  addr     4: cnt      0 [tcp_cubic_kfunc_ids]
update ok for net/ipv4/tcp_cubic.ko

After:

; ./tools/bpf/resolve_btfids/resolve_btfids -v -b vmlinux net/ipv4/tcp_cubic.ko
adding symbol tcp_cubic_kfunc_ids
patching addr     0: ID       0 [tcp_cubic_kfunc_ids]
sorting  addr     4: cnt      0 [tcp_cubic_kfunc_ids]
update ok for net/ipv4/tcp_cubic.ko

Fixes: 0e32dfc80b ("bpf: Enable TCP congestion control kfunc from modules")
Reported-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122144742.477787-4-memxor@gmail.com
2021-12-02 13:39:46 -08:00
Paul E. McKenney
8b4ff5f8bb selftests/bpf: Update test names for xchg and cmpxchg
The test_cmpxchg() and test_xchg() functions say "test_run add".
Therefore, make them say "test_run cmpxchg" and "test_run xchg",
respectively.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201005030.GA3071525@paulmck-ThinkPad-P17-Gen-1
2021-12-02 12:10:15 -08:00
Jean-Philippe Brucker
eee9a6df0e selftests/bpf: Build testing_helpers.o out of tree
Add $(OUTPUT) prefix to testing_helpers.o, so it can be built out of
tree when necessary. At the moment, in addition to being built in-tree
even when out-of-tree is required, testing_helpers.o is not built with
the right recipe when cross-building.

For consistency the other helpers, cgroup_helpers and trace_helpers, can
also be passed as objects instead of source. Use *_HELPERS variable to
keep the Makefile readable.

Fixes: f87c1930ac ("selftests/bpf: Merge test_stub.c into testing_helpers.c")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201145101.823159-1-jean-philippe@linaro.org
2021-12-02 11:55:41 -08:00
Jakub Kicinski
fc993be36f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02 11:44:56 -08:00
Linus Torvalds
a51e3ac43d Networking fixes for 5.16-rc4, including fixes from wireless,
and wireguard.
 
 Current release - regressions:
 
  - smc: keep smc_close_final()'s error code during active close
 
 Current release - new code bugs:
 
  - iwlwifi: various static checker fixes (int overflow, leaks, missing
    error codes)
 
  - rtw89: fix size of firmware header before transfer, avoid crash
 
  - mt76: fix timestamp check in tx_status; fix pktid leak;
 
  - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()
 
 Previous releases - regressions:
 
  - smc: fix list corruption in smc_lgr_cleanup_early
 
  - ipv4: convert fib_num_tclassid_users to atomic_t
 
 Previous releases - always broken:
 
  - tls: fix authentication failure in CCM mode
 
  - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
    incorrect processing
 
  - dsa: mv88e6xxx: fixes for various device errata
 
  - rds: correct socket tunable error in rds_tcp_tune()
 
  - ipv6: fix memory leak in fib6_rule_suppress
 
  - wireguard: reset peer src endpoint when netns exits
 
  - wireguard: improve resilience to DoS around incoming handshakes
 
  - tcp: fix page frag corruption on page fault which involves TCP
 
  - mpls: fix missing attributes in delete notifications
 
  - mt7915: fix NULL pointer dereference with ad-hoc mode
 
 Misc:
 
  - rt2x00: be more lenient about EPROTO errors during start
 
  - mlx4_en: update reported link modes for 1/10G
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGo6KMACgkQMUZtbf5S
 Iruq+BAAhRMTcL+X4eRIL9lIEvWEKHMKLCA/pUaQWNlSxsbEeydJWRNSc37Cs3pv
 z0rYIEhfieOz8+QXS1Kq+yZwJVXjA8Jvgld2qw9V9Y5w+N15Mj8RUtG8NaUw+o4E
 U8PCAbaamnbzyPdlCYcVHschd8MD0BCXm5+jAGeIyCP+KQCnhEpFZv+bvHaWzQR8
 FZLYrhXTR9W0DFsrKG9+haqFwFBR3+VDqTGILhaHPE+r2o6wKQQ5yJMhd8fq0SaC
 nne8zDkGuFEeW3cxj0VbhdRMyrV97eMK+P4dZ2P0Z7xcrsed9/2XJkNQNJGtuRnj
 GGJV6utupJRAY+lnJNUkifqS4Wt7KirfZsSsyaKKa4plyoVgtGhiqEYFTQVLagC0
 CF4Qe+3qks6rESbRu6PEFN4oWSkMEhRzdcDpg7vBDURUKcrRs9fgtNUJUCi8nKFA
 A/F/K+7IHBoBZyQYZbYmnGdNsNauKbF3rUY3hwMGBfQZIr/wsql9+jhtLsmZX77m
 V/L7KzT2jhhNc5gDzuLps25K3P7snKuV19qQSsY2LeuGj1x3gmWZ+ibN6ynhB+Gt
 KBnfHDMTI/4aciZBIbwJmwfeRhCF8tOfw0WZdUP7FRIXukbfVuDBoznWLz4BKKgf
 GSYSTNDs/PHZQo5vCQ/onvTwUK5aN6zoPNy5ih7lp9YZBYtN2TI=
 =r0Jh
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, and wireguard.

  Mostly scattered driver changes this week, with one big clump in
  mv88e6xxx. Nothing of note, really.

  Current release - regressions:

   - smc: keep smc_close_final()'s error code during active close

  Current release - new code bugs:

   - iwlwifi: various static checker fixes (int overflow, leaks, missing
     error codes)

   - rtw89: fix size of firmware header before transfer, avoid crash

   - mt76: fix timestamp check in tx_status; fix pktid leak;

   - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()

  Previous releases - regressions:

   - smc: fix list corruption in smc_lgr_cleanup_early

   - ipv4: convert fib_num_tclassid_users to atomic_t

  Previous releases - always broken:

   - tls: fix authentication failure in CCM mode

   - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
     incorrect processing

   - dsa: mv88e6xxx: fixes for various device errata

   - rds: correct socket tunable error in rds_tcp_tune()

   - ipv6: fix memory leak in fib6_rule_suppress

   - wireguard: reset peer src endpoint when netns exits

   - wireguard: improve resilience to DoS around incoming handshakes

   - tcp: fix page frag corruption on page fault which involves TCP

   - mpls: fix missing attributes in delete notifications

   - mt7915: fix NULL pointer dereference with ad-hoc mode

  Misc:

   - rt2x00: be more lenient about EPROTO errors during start

   - mlx4_en: update reported link modes for 1/10G"

* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: dsa: b53: Add SPI ID table
  gro: Fix inconsistent indenting
  selftests: net: Correct case name
  net/rds: correct socket tunable error in rds_tcp_tune()
  mctp: Don't let RTM_DELROUTE delete local routes
  net/smc: Keep smc_close_final rc during active close
  ibmvnic: drop bad optimization in reuse_tx_pools()
  ibmvnic: drop bad optimization in reuse_rx_pools()
  net/smc: fix wrong list_del in smc_lgr_cleanup_early
  Fix Comment of ETH_P_802_3_MIN
  ethernet: aquantia: Try MAC address from device tree
  ipv4: convert fib_num_tclassid_users to atomic_t
  net: avoid uninit-value from tcp_conn_request
  net: annotate data-races on txq->xmit_lock_owner
  octeontx2-af: Fix a memleak bug in rvu_mbox_init()
  net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
  vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
  net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
  net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
  net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
  ...
2021-12-02 11:22:06 -08:00
Alexei Starovoitov
098dc5335a selftests/bpf: Add CO-RE relocations to verifier scale test.
Add 182 CO-RE relocations to verifier scale test.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-18-alexei.starovoitov@gmail.com
2021-12-02 11:18:36 -08:00
Alexei Starovoitov
3268f0316a selftests/bpf: Revert CO-RE removal in test_ksyms_weak.
The commit 087cba799c ("selftests/bpf: Add weak/typeless ksym test for light skeleton")
added test_ksyms_weak to light skeleton testing, but remove CO-RE access.
Revert that part of commit, since light skeleton can use CO-RE in the kernel.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-17-alexei.starovoitov@gmail.com
2021-12-02 11:18:36 -08:00
Alexei Starovoitov
26b367e366 selftests/bpf: Additional test for CO-RE in the kernel.
Add a test where randmap() function is appended to three different bpf
programs. That action checks struct bpf_core_relo replication logic
and offset adjustment in gen loader part of libbpf.

Fourth bpf program has 360 CO-RE relocations from vmlinux, bpf_testmod,
and non-existing type. It tests candidate cache logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-16-alexei.starovoitov@gmail.com
2021-12-02 11:18:36 -08:00
Alexei Starovoitov
650c9dbd10 selftests/bpf: Convert map_ptr_kern test to use light skeleton.
To exercise CO-RE in the kernel further convert map_ptr_kern
test to light skeleton.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-15-alexei.starovoitov@gmail.com
2021-12-02 11:18:36 -08:00
Alexei Starovoitov
d82fa9b708 selftests/bpf: Improve inner_map test coverage.
Check that hash and array inner maps are properly initialized.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-14-alexei.starovoitov@gmail.com
2021-12-02 11:18:36 -08:00
Alexei Starovoitov
bc5f75da97 selftests/bpf: Add lskel version of kfunc test.
Add light skeleton version of kfunc_call_test_subprog test.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-13-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
19250f5fc0 libbpf: Clean gen_loader's attach kind.
The gen_loader has to clear attach_kind otherwise the programs
without attach_btf_id will fail load if they follow programs
with attach_btf_id.

Fixes: 6723474373 ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-12-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
be05c94476 libbpf: Support init of inner maps in light skeleton.
Add ability to initialize inner maps in light skeleton.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-11-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
d0e928876e libbpf: Use CO-RE in the kernel in light skeleton.
Without lskel the CO-RE relocations are processed by libbpf before any other
work is done. Instead, when lskel is needed, remember relocation as RELO_CORE
kind. Then when loader prog is generated for a given bpf program pass CO-RE
relos of that program to gen loader via bpf_gen__record_relo_core(). The gen
loader will remember them as-is and pass it later as-is into the kernel.

The normal libbpf flow is to process CO-RE early before call relos happen. In
case of gen_loader the core relos have to be added to other relos to be copied
together when bpf static function is appended in different places to other main
bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for
normal relos and for RELO_CORE kind too. When that is done each struct
reloc_desc has good relos for specific main prog.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-10-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Andrii Nakryiko
03d5b99138 libbpf: Cleanup struct bpf_core_cand.
Remove two redundant fields from struct bpf_core_cand.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-8-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
fbd94c7afc bpf: Pass a set of bpf_core_relo-s to prog_load command.
struct bpf_core_relo is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command
and let the kernel perform CO-RE relocations.

Note the struct bpf_line_info and struct bpf_func_info have the same
layout when passed from LLVM to libbpf and from libbpf to the kernel
except "insn_off" fields means "byte offset" when LLVM generates it.
Then libbpf converts it to "insn index" to pass to the kernel.
The struct bpf_core_relo's "insn_off" field is always "byte offset".

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
46334a0cd2 bpf: Define enum bpf_core_relo_kind as uapi.
enum bpf_core_relo_kind is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure.
Also rename them with BPF_CORE_ prefix to distinguish from conflicting names in
bpf_core_read.h. The enums bpf_field_info_kind, bpf_type_id_kind,
bpf_type_info_kind, bpf_enum_value_kind are passing different values from bpf
program into llvm.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-5-alexei.starovoitov@gmail.com
2021-12-02 11:18:35 -08:00
Alexei Starovoitov
29db4bea1d bpf: Prepare relo_core.c for kernel duty.
Make relo_core.c to be compiled for the kernel and for user space libbpf.

Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32.
This is the maximum number of nested structs and arrays.
For example:
 struct sample {
     int a;
     struct {
         int b[10];
     };
 };

 struct sample *s = ...;
 int *y = &s->b[5];
This field access is encoded as "0:1:0:5" and spec len is 4.

The follow up patch might bump it back to 64.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.com
2021-12-02 11:18:34 -08:00
Alexei Starovoitov
74753e1462 libbpf: Replace btf__type_by_id() with btf_type_by_id().
To prepare relo_core.c to be compiled in the kernel and the user space
replace btf__type_by_id with btf_type_by_id.

In libbpf btf__type_by_id and btf_type_by_id have different behavior.

bpf_core_apply_relo_insn() needs behavior of uapi btf__type_by_id
vs internal btf_type_by_id, but type_id range check is already done
in bpf_core_apply_relo(), so it's safe to replace it everywhere.
The kernel btf_type_by_id() does the check anyway. It doesn't hurt.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-2-alexei.starovoitov@gmail.com
2021-12-02 11:18:34 -08:00
Li Zhijian
36d7d36fcf selftests: net: remove meaningless help option
$ ./fcnal-test.sh -t help
Test names: help

Looks it intent to list the available tests but it didn't do the right
thing. I will add another option the do that in the later patch.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 13:12:27 +00:00
Li Zhijian
a05431b22b selftests: net: Correct case name
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names

Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:19:08 +00:00
Kumar Kartikeya Dwivedi
d995816b77 libbpf: Avoid reload of imm for weak, unresolved, repeating ksym
Alexei pointed out that we can use BPF_REG_0 which already contains imm
from move_blob2blob computation. Note that we now compare the second
insn's imm, but this should not matter, since both will be zeroed out
for the error case for the insn populated earlier.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122235733.634914-4-memxor@gmail.com
2021-11-30 15:48:15 -08:00
Kumar Kartikeya Dwivedi
0270090d39 libbpf: Avoid double stores for success/failure case of ksym relocations
Instead, jump directly to success case stores in case ret >= 0, else do
the default 0 value store and jump over the success case. This is better
in terms of readability. Readjust the code for kfunc relocation as well
to follow a similar pattern, also leads to easier to follow code now.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122235733.634914-3-memxor@gmail.com
2021-11-30 15:48:14 -08:00
Joanne Koong
ec151037af selftest/bpf/benchs: Add bpf_loop benchmark
Add benchmark to measure the throughput and latency of the bpf_loop
call.

Testing this on my dev machine on 1 thread, the data is as follows:

        nr_loops: 10
bpf_loop - throughput: 198.519 ± 0.155 M ops/s, latency: 5.037 ns/op

        nr_loops: 100
bpf_loop - throughput: 247.448 ± 0.305 M ops/s, latency: 4.041 ns/op

        nr_loops: 500
bpf_loop - throughput: 260.839 ± 0.380 M ops/s, latency: 3.834 ns/op

        nr_loops: 1000
bpf_loop - throughput: 262.806 ± 0.629 M ops/s, latency: 3.805 ns/op

        nr_loops: 5000
bpf_loop - throughput: 264.211 ± 1.508 M ops/s, latency: 3.785 ns/op

        nr_loops: 10000
bpf_loop - throughput: 265.366 ± 3.054 M ops/s, latency: 3.768 ns/op

        nr_loops: 50000
bpf_loop - throughput: 235.986 ± 20.205 M ops/s, latency: 4.238 ns/op

        nr_loops: 100000
bpf_loop - throughput: 264.482 ± 0.279 M ops/s, latency: 3.781 ns/op

        nr_loops: 500000
bpf_loop - throughput: 309.773 ± 87.713 M ops/s, latency: 3.228 ns/op

        nr_loops: 1000000
bpf_loop - throughput: 262.818 ± 4.143 M ops/s, latency: 3.805 ns/op

>From this data, we can see that the latency per loop decreases as the
number of loops increases. On this particular machine, each loop had an
overhead of about ~4 ns, and we were able to run ~250 million loops
per second.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-5-joannekoong@fb.com
2021-11-30 10:56:28 -08:00
Joanne Koong
f6e659b7f9 selftests/bpf: Measure bpf_loop verifier performance
This patch tests bpf_loop in pyperf and strobemeta, and measures the
verifier performance of replacing the traditional for loop
with bpf_loop.

The results are as follows:

~strobemeta~

Baseline
    verification time 6808200 usec
    stack depth 496
    processed 554252 insns (limit 1000000) max_states_per_insn 16
    total_states 15878 peak_states 13489  mark_read 3110
    #192 verif_scale_strobemeta:OK (unrolled loop)

Using bpf_loop
    verification time 31589 usec
    stack depth 96+400
    processed 1513 insns (limit 1000000) max_states_per_insn 2
    total_states 106 peak_states 106 mark_read 60
    #193 verif_scale_strobemeta_bpf_loop:OK

~pyperf600~

Baseline
    verification time 29702486 usec
    stack depth 368
    processed 626838 insns (limit 1000000) max_states_per_insn 7
    total_states 30368 peak_states 30279 mark_read 748
    #182 verif_scale_pyperf600:OK (unrolled loop)

Using bpf_loop
    verification time 148488 usec
    stack depth 320+40
    processed 10518 insns (limit 1000000) max_states_per_insn 10
    total_states 705 peak_states 517 mark_read 38
    #183 verif_scale_pyperf600_bpf_loop:OK

Using the bpf_loop helper led to approximately a 99% decrease
in the verification time and in the number of instructions.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-4-joannekoong@fb.com
2021-11-30 10:56:28 -08:00
Joanne Koong
4e5070b64b selftests/bpf: Add bpf_loop test
Add test for bpf_loop testing a variety of cases:
various nr_loops, null callback ctx, invalid flags, nested callbacks.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-3-joannekoong@fb.com
2021-11-30 10:56:28 -08:00
Joanne Koong
e6f2dd0f80 bpf: Add bpf_loop helper
This patch adds the kernel-side and API changes for a new helper
function, bpf_loop:

long bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx,
u64 flags);

where long (*callback_fn)(u32 index, void *ctx);

bpf_loop invokes the "callback_fn" **nr_loops** times or until the
callback_fn returns 1. The callback_fn can only return 0 or 1, and
this is enforced by the verifier. The callback_fn index is zero-indexed.

A few things to please note:
~ The "u64 flags" parameter is currently unused but is included in
case a future use case for it arises.
~ In the kernel-side implementation of bpf_loop (kernel/bpf/bpf_iter.c),
bpf_callback_t is used as the callback function cast.
~ A program can have nested bpf_loop calls but the program must
still adhere to the verifier constraint of its stack depth (the stack depth
cannot exceed MAX_BPF_STACK))
~ Recursive callback_fns do not pass the verifier, due to the call stack
for these being too deep.
~ The next patch will include the tests and benchmark

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-2-joannekoong@fb.com
2021-11-30 10:56:28 -08:00
Linus Torvalds
f080815fdb ARM64:
* Fix constant sign extension affecting TCR_EL2 and preventing
 running on ARMv8.7 models due to spurious bits being set
 
 * Fix use of helpers using PSTATE early on exit by always sampling
 it as soon as the exit takes place
 
 * Move pkvm's 32bit handling into a common helper
 
 RISC-V:
 
 * Fix incorrect KVM_MAX_VCPUS value
 
 * Unmap stage2 mapping when deleting/moving a memslot
 
 x86:
 
 * Fix and downgrade BUG_ON due to uninitialized cache
 
 * Many APICv and MOVE_ENC_CONTEXT_FROM fixes
 
 * Correctly emulate TLB flushes around nested vmentry/vmexit
 and when the nested hypervisor uses VPID
 
 * Prevent modifications to CPUID after the VM has run
 
 * Other smaller bugfixes
 
 Generic:
 
 * Memslot handling bugfixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGmHBEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOkGgf/RBjt1d7H6Um7tD7oA5QiIHmNY4ko
 K/90OAa8h62rilxpqxkRgLNmphBc5AzcbufVXN4J1hVhw2M+u1ouDxKeHS1GEZTA
 /XdNb0dwK99TpOJkIcuV/NQVIZUxkM00VbIiCoLkX06VuIc1Gie1G4bqzLhWCP8Y
 ts9l/pkfafvfEmjmcjVd7gkDOnEPbT+JPDJcuo/RA7C7Z2L4+8DsFeyfWGqBP647
 J6omUUxD82QRm28OVOK4V7aNALWsAdlaqHrVFAPZywQl7QTWMO0UTcKTdCCB2B4Q
 QnHejFV6pFh55q3/fhe7epy9e2Sw+NOsmWKTEGPbU5nn94R8lyW1GV4ZUQ==
 =Nduu
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix constant sign extension affecting TCR_EL2 and preventing
     running on ARMv8.7 models due to spurious bits being set

   - Fix use of helpers using PSTATE early on exit by always sampling it
     as soon as the exit takes place

   - Move pkvm's 32bit handling into a common helper

  RISC-V:

   - Fix incorrect KVM_MAX_VCPUS value

   - Unmap stage2 mapping when deleting/moving a memslot

  x86:

   - Fix and downgrade BUG_ON due to uninitialized cache

   - Many APICv and MOVE_ENC_CONTEXT_FROM fixes

   - Correctly emulate TLB flushes around nested vmentry/vmexit and when
     the nested hypervisor uses VPID

   - Prevent modifications to CPUID after the VM has run

   - Other smaller bugfixes

  Generic:

   - Memslot handling bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
  KVM: fix avic_set_running for preemptable kernels
  KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
  KVM: SEV: accept signals in sev_lock_two_vms
  KVM: SEV: do not take kvm->lock when destroying
  KVM: SEV: Prohibit migration of a VM that has mirrors
  KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs locked
  selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
  KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: initialize regions_list of a mirror VM
  KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: do not use list_replace_init on an empty list
  KVM: x86: Use a stable condition around all VT-d PI paths
  KVM: x86: check PIR even for vCPUs with disabled APICv
  KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
  KVM: x86/mmu: Handle "default" period when selectively waking kthread
  KVM: MMU: shadow nested paging does not have PKU
  KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
  KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
  KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
  ...
2021-11-30 09:22:15 -08:00
Matthew Wilcox (Oracle)
d6e6a27d96 tools: Fix math.h breakage
Commit 98e1385ef2 ("include/linux/radix-tree.h: replace kernel.h with
the necessary inclusions") broke the radix tree test suite in two
different ways; first by including math.h which didn't exist in the
tools directory, and second by removing an implicit include of
spinlock.h before lockdep.h.  Fix both issues.

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-30 09:14:42 -08:00
Mehrdad Arshad Rad
c291d0a4d1 libbpf: Remove duplicate assignments
There is a same action when load_attr.attach_btf_id is initialized.

Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211128193337.10628-1-arshad.rad@gmail.com
2021-11-30 15:33:57 +01:00
Hangbin Liu
5944b5abd8 Bonding: add arp_missed_max option
Currently, we use hard code number to verify if we are in the
arp_interval timeslice. But some user may want to reduce/extend
the verify timeslice. With the similar team option 'missed_max'
the uers could change that number based on their own environment.

Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30 12:15:58 +00:00
Paolo Bonzini
17d44a96f0 KVM: SEV: Prohibit migration of a VM that has mirrors
VMs that mirror an encryption context rely on the owner to keep the
ASID allocated.  Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
would cause a dangling ASID:

1. copy context from A to B (gets ref to A)
2. move context from A to L (moves ASID from A to L)
3. close L (releases ASID from L, B still references it)

The right way to do the handoff instead is to create a fresh mirror VM
on the destination first:

1. copy context from A to B (gets ref to A)
[later] 2. close B (releases ref to A)
3. move context from A to L (moves ASID from A to L)
4. copy context from L to M

So, catch the situation by adding a count of how many VMs are
mirroring this one's encryption context.

Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:14 -05:00
Paolo Bonzini
dc79c9f4eb selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
I am putting the tests in sev_migrate_tests because the failure conditions are
very similar and some of the setup code can be reused, too.

The tests cover both successful creation of a mirror VM, and error
conditions.

Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Message-Id: <20211123005036.2954379-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:13 -05:00
Maciej S. Szmigiero
81835ee113 KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
A kvm_page_table_test run with its default settings fails on VMX due to
memory region add failure:
> ==== Test Assertion Failure ====
>  lib/kvm_util.c:952: ret == 0
>  pid=10538 tid=10538 errno=17 - File exists
>     1  0x00000000004057d1: vm_userspace_mem_region_add at kvm_util.c:947
>     2  0x0000000000401ee9: pre_init_before_test at kvm_page_table_test.c:302
>     3   (inlined by) run_test at kvm_page_table_test.c:374
>     4  0x0000000000409754: for_each_guest_mode at guest_modes.c:53
>     5  0x0000000000401860: main at kvm_page_table_test.c:500
>     6  0x00007f82ae2d8554: ?? ??:0
>     7  0x0000000000401894: _start at ??:?
>  KVM_SET_USER_MEMORY_REGION IOCTL failed,
>  rc: -1 errno: 17
>  slot: 1 flags: 0x0
>  guest_phys_addr: 0xc0000000 size: 0x40000000

This is because the memory range that this test is trying to add
(0x0c0000000 - 0x100000000) conflicts with LAPIC mapping at 0x0fee00000.

Looking at the code it seems that guest_test_*phys*_mem variable gets
mistakenly overwritten with guest_test_*virt*_mem while trying to adjust
the former for alignment.
With the correct variable adjusted this test runs successfully.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <52e487458c3172923549bbcf9dfccfbe6faea60b.1637940473.git.maciej.szmigiero@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:12:13 -05:00