korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-19 00:54:41 +08:00

Author	SHA1	Message	Date
Dave Marchevsky	f10ca5da5b	bpf: Don't explicitly emit BTF for struct btf_iter_num Commit `6018e1f407` ("bpf: implement numbers iterator") added the BTF_TYPE_EMIT line that this patch is modifying. The struct btf_iter_num doesn't exist, so only a forward declaration is emitted in BTF: FWD 'btf_iter_num' fwd_kind=struct That commit was probably hoping to ensure that struct bpf_iter_num is emitted in vmlinux BTF. A previous version of this patch changed the line to emit the correct type, but Yonghong confirmed that it would definitely be emitted regardless in [0], so this patch simply removes the line. This isn't marked "Fixes" because the extraneous btf_iter_num FWD wasn't causing any issues that I noticed, aside from mild confusion when I looked through the code. [0]: https://lore.kernel.org/bpf/25d08207-43e6-36a8-5e0f-47a913d4cda5@linux.dev/ Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231013204426.1074286-2-davemarchevsky@fb.com	2023-10-13 15:48:58 -07:00
Artem Savkov	ba8ea72388	bpf: Change syscall_nr type to int in struct syscall_tp_t linux-rt-devel tree contains a patch (b1773eac3f29c ("sched: Add support for lazy preemption")) that adds an extra member to struct trace_entry. This causes the offset of the args field in struct trace_event_raw_sys_enter to be different from the one in struct syscall_trace_enter: struct trace_event_raw_sys_enter { struct trace_entry ent; /* 0 12 */ /* XXX last struct has 3 bytes of padding */ /* XXX 4 bytes hole, try to pack */ long int id; /* 16 8 */ long unsigned int args[6]; /* 24 48 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ char __data[]; /* 72 0 */ /* size: 72, cachelines: 2, members: 4 */ /* sum members: 68, holes: 1, sum holes: 4 */ /* paddings: 1, sum paddings: 3 */ /* last cacheline: 8 bytes */ }; struct syscall_trace_enter { struct trace_entry ent; /* 0 12 */ /* XXX last struct has 3 bytes of padding */ int nr; /* 12 4 */ long unsigned int args[]; /* 16 0 */ /* size: 16, cachelines: 1, members: 3 */ /* paddings: 1, sum paddings: 3 */ /* last cacheline: 16 bytes */ }; This, in turn, causes perf_event_set_bpf_prog() to fail while running the bpf test_profiler testcase because max_ctx_offset is calculated based on the former struct, while off is calculated based on the latter: if (is_tracepoint || is_syscall_tp) { int off = trace_event_get_offsets(event->tp_event); if (prog->aux->max_ctx_offset > off) return -EACCES; } What the bpf program is actually getting is a pointer to struct syscall_tp_t, defined in kernel/trace/trace_syscalls.c. This patch fixes the problem by aligning struct syscall_tp_t with struct syscall_trace_(enter|exit) and changing the tests to use these structs to dereference context. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20231013054219.172920-1-asavkov@redhat.com	2023-10-13 12:39:36 -07:00
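The offset mismatch described in ba8ea72388 can be reproduced in user space. The following is a minimal sketch, assuming an LP64 target (8-byte, 8-byte-aligned `long`). The names `entry_hdr`, `raw_sys_enter`, and `sys_trace_enter` are hypothetical stand-ins for the kernel structs, and `entry_hdr` only imitates a `trace_entry` that has grown to 12 bytes.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct trace_entry after an extra member
 * was added: 9 bytes of members, padded to 12 (alignment 4). */
struct entry_hdr {
	unsigned short type;
	unsigned char flags;
	unsigned char preempt_count;
	int pid;
	unsigned char extra;	/* the additional member */
};

/* Same shape as trace_event_raw_sys_enter: 'long id' needs long
 * alignment, so on LP64 a 4-byte hole follows the 12-byte header. */
struct raw_sys_enter {
	struct entry_hdr ent;
	long id;
	unsigned long args[6];
};

/* Same shape as syscall_trace_enter: 'int nr' fits directly after the
 * header, so args starts earlier than in the struct above. */
struct sys_trace_enter {
	struct entry_hdr ent;
	int nr;
	unsigned long args[];
};

size_t raw_args_offset(void)
{
	return offsetof(struct raw_sys_enter, args);
}

size_t trace_args_offset(void)
{
	return offsetof(struct sys_trace_enter, args);
}
```

On such a target the two helpers return different offsets, matching the 24 versus 16 shown in the pahole output quoted in the commit message. A check that compares an offset derived from one struct against a limit derived from the other can therefore reject an otherwise valid access.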
Martin KaFai Lau	9c1292eca2	net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set It was reported that there is a compiler warning on the unused variable "sin_addr_len" in af_inet.c when CONFIG_CGROUP_BPF is not set. This patch is to address it similar to the ipv6 counterpart in inet6_getname(). It is to "return sin_addr_len;" instead of "return sizeof(*sin);". Fixes: `fefba7d1ae` ("bpf: Propagate modified uaddrlen from cgroup sockaddr programs") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/bpf/20231013185702.3993710-1-martin.lau@linux.dev Closes: https://lore.kernel.org/bpf/20231013114007.2fb09691@canb.auug.org.au/	2023-10-13 12:35:43 -07:00
Yafang Shao	236334aeec	bpf: Avoid unnecessary audit log for CPU security mitigations Check cpu_mitigations_off() first to avoid calling capable() if it is off. This can avoid unnecessary audit log. Fixes: `bc5bc309db` ("bpf: Inherit system settings for CPU security mitigations") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/CAEf4Bza6UVUWqcWQ-66weZ-nMDr+TFU3Mtq=dumZFD-pSqU7Ow@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20231013083916.4199-1-laoar.shao@gmail.com	2023-10-13 12:33:21 -07:00
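The reordering in 236334aeec relies on short-circuit evaluation. The sketch below is a user-space illustration only: `mock_capable()` and `mock_cpu_mitigations_off()` are hypothetical stand-ins for the kernel functions, and the `audit_records` counter merely simulates the side effect the commit wants to avoid.

```c
#include <stdbool.h>

int audit_records;	/* counts simulated audit log entries */
bool mitigations_off;	/* stand-in for the 'mitigations=off' setting */

/* Hypothetical stand-in for a capability check that may emit an
 * audit record whenever it is called. */
bool mock_capable(void)
{
	audit_records++;
	return false;
}

/* Hypothetical stand-in for cpu_mitigations_off(): no side effects. */
bool mock_cpu_mitigations_off(void)
{
	return mitigations_off;
}

/* The side-effect-free check goes first; || short-circuits, so the
 * capability check (and its audit record) is skipped when it is true. */
bool bypass_spec_check(void)
{
	return mock_cpu_mitigations_off() || mock_capable();
}
```

With `mitigations_off` set, `bypass_spec_check()` returns true without touching `audit_records`; with it cleared, the capability check runs exactly once per call.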
Martin KaFai Lau	d2dc885b8c	Merge branch 'Add cgroup sockaddr hooks for unix sockets' Daan De Meyer says: ==================== Changes since v10: * Removed extra check from bpf_sock_addr_set_sun_path() again in favor of calling unix_validate_addr() everywhere in af_unix.c before calling the hooks. Changes since v9: * Renamed bpf_sock_addr_set_unix_addr() to bpf_sock_addr_set_sun_path() and renamed arguments to match the new name. * Added an extra check to bpf_sock_addr_set_sun_path() to disallow changing the address of an unnamed unix socket. * Removed unnecessary NULL check on uaddrlen in __cgroup_bpf_run_filter_sock_addr(). Changes since v8: * Added missing test programs to last patch Changes since v7: * Fixed formatting nit in comment * Renamed from cgroup/connectun to cgroup/connect_unix (and similar for all other hooks) Changes since v6: * Actually removed bpf_bind() helper for AF_UNIX hooks. * Fixed merge conflict * Updated comment to mention uaddrlen is read-only for AF_INET[6] * Removed unnecessary forward declaration of struct sock_addr_test * Removed unused BPF_CGROUP_RUN_PROG_UNIX_CONNECT() * Fixed formatting nit reported by checkpatch * Added more information to commit message about recvmsg() on connected socket Changes since v5: * Fixed kernel version in bpftool documentation (6.3 => 6.7). * Added connection mode socket recvmsg() test. * Removed bpf_bind() helper for AF_UNIX hooks. * Added missing getpeernameun and getsocknameun BPF test programs. * Added note for bind() test being unused currently. Changes since v4: * Dropped support for intercepting bind() as when using bind() with unix sockets and a pathname sockaddr, bind() will create an inode in the filesystem that needs to be cleaned up. If the address is rewritten, users might try to clean up the wrong file and leak the actual socket file in the filesystem. * Changed bpf_sock_addr_set_unix_addr() to use BTF_KFUNC_HOOK_CGROUP_SKB instead of BTF_KFUNC_HOOK_COMMON. 
* Removed unix socket related changes from BPF_CGROUP_PRE_CONNECT_ENABLED() as unix sockets do not support pre-connect. * Added tests for getpeernameun and getsocknameun hooks. * We now disallow an empty sockaddr in bpf_sock_addr_set_unix_addr() similar to unix_validate_addr(). * Removed unnecessary cgroup_bpf_enabled() checks * Removed unnecessary error checks Changes since v3: * Renamed bpf_sock_addr_set_addr() to bpf_sock_addr_set_unix_addr() and made it only operate on AF_UNIX sockaddrs. This is because for the other families, users usually want to configure more than just the address so a generic interface will not fit the bill here. e.g. for AF_INET and AF_INET6, users would generally also want to be able to configure the port which the current interface doesn't support. So we expose an AF_UNIX specific function instead. * Made the tests in the new sock addr tests more generic (similar to test_sock_addr.c), this should make it easier to migrate the other sock addr tests in the future. 
* Removed the new kfunc hook and attached to BTF_KFUNC_HOOK_COMMON instead * Set uaddrlen to 0 when the family is AF_UNSPEC * Pass in the addrlen to the hook from IPv6 code * Fixed mount directory mkdir() to ignore EEXIST Changes since v2: * Configuring the sock addr is now done via a new kfunc bpf_sock_addr_set() * The addrlen is exposed as u32 in bpf_sock_addr_kern * Selftests are updated to use the new kfunc * Selftests are now added as a new sock_addr test in prog_tests/ * Added BTF_KFUNC_HOOK_SOCK_ADDR for BPF_PROG_TYPE_CGROUP_SOCK_ADDR * __cgroup_bpf_run_filter_sock_addr() now returns the modified addrlen Changes since v1: * Split into multiple patches instead of one single patch * Added unix support for all socket address hooks instead of only connect() * Switched approach to expose the socket address length to the bpf hook instead of recalculating the socket address length in kernelspace to properly support abstract unix socket addresses * Modified socket address hook tests to calculate the socket address length once and pass it around everywhere instead of recalculating the actual unix socket address length on demand. * Added some missing section name tests for getpeername()/getsockname() This patch series extends the cgroup sockaddr hooks to include support for unix sockets. To add support for unix sockets, struct bpf_sock_addr_kern is extended to expose the socket address length to the bpf program. Along with that, a new kfunc bpf_sock_addr_set_unix_addr() is added to safely allow modifying an AF_UNIX sockaddr from bpf programs. I intend to use these new hooks in systemd to reimplement the LogNamespace= feature, which allows running multiple instances of systemd-journald to process the logs of different services. 
systemd-journald also processes syslog messages, so currently, using log namespaces means all services running in the same log namespace have to live in the same private mount namespace so that systemd can mount the journal namespace's associated syslog socket over /dev/log to properly direct syslog messages from all services running in that log namespace to the correct systemd-journald instance. We want to relax this requirement so that processes running in disjoint mount namespaces can still run in the same log namespace. To achieve this, we can use these new hooks to rewrite the socket address of any connect(), sendto(), ... syscalls to /dev/log to the socket address of the journal namespace's syslog socket instead, which will transparently do the redirection without requiring use of a mount namespace and mounting over /dev/log. Aside from the above usecase, these hooks can more generally be used to transparently redirect unix sockets to different addresses as required by services. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:56 -07:00
Daan De Meyer	82ab6b505e	selftests/bpf: Add tests for cgroup unix socket address hooks These selftests are written in prog_tests style instead of adding them to the existing test_sock_addr tests. Migrating the existing sock addr tests to prog_tests style is left for future work. This commit adds support for testing bind() sockaddr hooks, even though there's no unix socket sockaddr hook for bind(). We leave this code intact for when the INET and INET6 tests are migrated in the future which do support intercepting bind(). Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-10-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:55 -07:00
Daan De Meyer	af2752ed45	selftests/bpf: Make sure mount directory exists The mount directory for the selftests cgroup tree might not exist so let's make sure it does exist by creating it ourselves if it doesn't exist. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-9-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:55 -07:00
Daan De Meyer	3243fef6a4	documentation/bpf: Document cgroup unix socket address hooks Update the documentation to mention the new cgroup unix sockaddr hooks. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-8-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:55 -07:00
Daan De Meyer	8b3cba987e	bpftool: Add support for cgroup unix socket address hooks Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into bpftool. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20231011185113.140426-7-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:55 -07:00
Daan De Meyer	bf90438c78	libbpf: Add support for cgroup unix socket address hooks Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:55 -07:00
Daan De Meyer	859051dd16	bpf: Implement cgroup sockaddr hooks for unix sockets These hooks allow intercepting connect(), getsockname(), getpeername(), sendmsg() and recvmsg() for unix sockets. The unix socket hooks get write access to the address length because the address length is not fixed when dealing with unix sockets and needs to be modified when a unix socket address is modified by the hook. Because abstract unix socket addresses start with a NUL byte, we cannot recalculate the socket address length in kernelspace after running the hook by calculating the length of the unix socket path using strlen(). These hooks can be used when users want to multiplex syscalls to a single unix socket to multiple different processes behind the scenes by redirecting the connect() and other syscalls to process specific sockets. We do not implement support for intercepting bind() because when using bind() with unix sockets with a pathname address, this creates an inode in the filesystem which must be cleaned up. If we rewrite the address, the user might try to clean up the wrong file, leaking the socket in the filesystem where it is never cleaned up. Until we figure out a solution for this (and a use case for intercepting bind()), we opt to not allow rewriting the sockaddr in bind() calls. We also implement recvmsg() support for connected streams so that after a connect() that is modified by a sockaddr hook, any corresponding recvmsg() on the connected socket can also be modified to make the connected program think it is connected to the "intended" remote. Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-5-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 17:27:47 -07:00
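The strlen() limitation mentioned in 859051dd16 is easy to see from user space. The sketch below is illustrative only: the helper names `make_abstract_addr()` and `strlen_based_len()` are hypothetical and are not part of the patch. It builds an abstract AF_UNIX address and compares the explicit length with what a strlen()-based recalculation would yield.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill 'addr' with an abstract address (leading NUL byte followed by
 * 'name') and return the sockaddr length a caller would pass along. */
socklen_t make_abstract_addr(struct sockaddr_un *addr, const char *name)
{
	size_t n = strlen(name);

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	if (n > sizeof(addr->sun_path) - 1)
		n = sizeof(addr->sun_path) - 1;
	/* sun_path[0] stays '\0', which marks the address as abstract. */
	memcpy(addr->sun_path + 1, name, n);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* What a strlen()-based recalculation would produce: for an abstract
 * address strlen() stops at the leading NUL and reports an empty path. */
socklen_t strlen_based_len(const struct sockaddr_un *addr)
{
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) +
			   strlen(addr->sun_path));
}
```

For an abstract name such as "journal" (an arbitrary example), the strlen()-based length covers only the family field, which is why the length has to be carried explicitly alongside the address.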
Daan De Meyer	53e380d214	bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf As prep for adding unix socket support to the cgroup sockaddr hooks, let's add a kfunc bpf_sock_addr_set_sun_path() that allows modifying a unix sockaddr from bpf. While this is already possible for AF_INET and AF_INET6, we'll need this kfunc when we add unix socket support since modifying the address for those requires modifying both the address and the sockaddr length. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-4-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 16:29:25 -07:00
Daan De Meyer	fefba7d1ae	bpf: Propagate modified uaddrlen from cgroup sockaddr programs As prep for adding unix socket support to the cgroup sockaddr hooks, let's propagate the sockaddr length back to the caller after running a bpf cgroup sockaddr hook program. While not important for AF_INET or AF_INET6, the sockaddr length is important when working with AF_UNIX sockaddrs as the size of the sockaddr cannot be determined just from the address family or the sockaddr's contents. __cgroup_bpf_run_filter_sock_addr() is modified to take the uaddrlen as an input/output argument. After running the program, the modified sockaddr length is stored in the uaddrlen pointer. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-3-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 15:03:40 -07:00
Daan De Meyer	feba7b634e	selftests/bpf: Add missing section name tests for getpeername/getsockname These were missed when these hooks were first added so add them now instead to make sure every sockaddr hook has a matching section name test. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-2-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-11 13:24:18 -07:00
Martin KaFai Lau	1ef09e1281	Merge branch 'bpf: Fix src IP addr related limitation in bpf_*_fib_lookup()' Martynas Pumputis says: ==================== The patchset fixes the limitation of bpf_*_fib_lookup() helper, which prevents it from being used in BPF dataplanes with network interfaces which have more than one IP addr. See the first patch for more details. Thanks! * v2->v3: Address Martin KaFai Lau's feedback * v1->v2: Use IPv6 stubs to fix compilation when CONFIG_IPV6=m. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-09 16:28:37 -07:00
Martynas Pumputis	b0f7a8ca11	selftests/bpf: Add BPF_FIB_LOOKUP_SRC tests This patch extends the existing fib_lookup test suite by adding two test cases (for each IP family): * Test source IP selection from the egressing netdev. * Test source IP selection when an IP route has a preferred src IP addr. Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-3-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-09 16:28:37 -07:00
Martynas Pumputis	dab4e1f06c	bpf: Derive source IP addr via bpf_*_fib_lookup() Extend the bpf_fib_lookup() helper by making it return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, &p, sizeof(p), BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-09 16:28:35 -07:00
Ian Rogers	1be84ca53c	bpftool: Align bpf_load_and_run_opts insns and data A C string lacks alignment so use aligned arrays to avoid potential alignment problems. Switch to using sizeof (less 1 for the \0 terminator) rather than a hardcode size constant. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20231007044439.25171-2-irogers@google.com	2023-10-09 09:36:51 -07:00
Ian Rogers	23671f4dfd	bpftool: Align output skeleton ELF code libbpf accesses the ELF data requiring at least 8 byte alignment, however, the data is generated into a C string that doesn't guarantee alignment. Fix this by assigning to an aligned char array. Use sizeof on the array, less one for the \0 terminator, rather than generating a constant. Fixes: `a6cc6b34b9` ("bpftool: Provide a helper method for accessing skeleton's embedded ELF data") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20231007044439.25171-1-irogers@google.com	2023-10-09 09:36:51 -07:00
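The two bpftool alignment commits above (1be84ca53c and 23671f4dfd) describe the same pattern: initialize an explicitly aligned char array from a string literal and take its size with sizeof, minus the terminator. A minimal sketch of that pattern follows, assuming GCC/Clang attribute syntax. The names `blob`, `blob_data()`, and `blob_is_aligned()` and the 7-byte payload are hypothetical, not bpftool's generated output.

```c
#include <stddef.h>
#include <stdint.h>

/* A string literal is a convenient initializer for generated bytes but
 * carries no alignment guarantee of its own; the array it initializes
 * can be given one. The literal is split so that "\x7f" is not merged
 * with the following hex-digit character 'E'. */
static const char blob[] __attribute__((__aligned__(8))) =
	"\x7f" "ELF" "\x02\x01\x01";

/* sizeof counts the implicit trailing '\0', so subtract one to get
 * the payload size instead of hardcoding a constant. */
const void *blob_data(size_t *sz)
{
	*sz = sizeof(blob) - 1;
	return blob;
}

/* Returns non-zero when the array starts on an 8-byte boundary. */
int blob_is_aligned(void)
{
	return ((uintptr_t)blob % 8) == 0;
}
```

Deriving the size from the array keeps the length and the data in one place, so regenerating the bytes cannot leave a stale constant behind.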
David Vernet	0d7ae06860	selftests/bpf: Test pinning bpf timer to a core Now that we support pinning a BPF timer to the current core, we should test it with some selftests. This patch adds two new testcases to the timer suite, which verify that a BPF timer, both with and without BPF_F_TIMER_ABS, can be pinned to the calling core with BPF_F_TIMER_CPU_PIN. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-3-void@manifault.com	2023-10-09 16:29:06 +02:00
David Vernet	d6247ecb6c	bpf: Add ability to pin bpf timer to calling CPU BPF supports creating high resolution timers using bpf_timer_* helper functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which specifies that the timeout should be interpreted as absolute time. It would also be useful to be able to pin that timer to a core. For example, if you wanted to make a subset of cores run without timer interrupts, and only have the timer be invoked on a single core. This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag. When specified, the HRTIMER_MODE_PINNED flag is passed to hrtimer_start(). A subsequent patch will update selftests to validate. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com	2023-10-09 16:28:49 +02:00
Kees Cook	84cb9cbd91	bpf: Annotate struct bpf_stack_map with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle [1], add __counted_by for struct bpf_stack_map. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Link: https://lore.kernel.org/bpf/20231006201657.work.531-kees@kernel.org	2023-10-06 23:44:35 +02:00
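As an illustration of the annotation discussed in 84cb9cbd91, the sketch below applies a counted_by attribute to a flexible array member. It is a user-space approximation: the `COUNTED_BY` macro, `struct bucket_map`, and `bucket_map_alloc()` are hypothetical names, and the macro expands to nothing on compilers that lack the attribute.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the attribute where the compiler provides it; otherwise fall
 * back to a no-op so the code still builds unchanged. */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical struct: 'n_values' records how many elements 'values'
 * holds, which is what bounds-checking instrumentation consults. */
struct bucket_map {
	unsigned int n_values;
	unsigned long values[] COUNTED_BY(n_values);
};

struct bucket_map *bucket_map_alloc(unsigned int n)
{
	struct bucket_map *m = calloc(1, sizeof(*m) + n * sizeof(m->values[0]));

	if (m)
		m->n_values = n;	/* set the count before using values[] */
	return m;
}
```

The counter is assigned immediately after allocation because an instrumented build checks each index into `values[]` against it.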
Geliang Tang	fdd11c14c3	selftests/bpf: Add pairs_redir_to_connected helper Extract duplicate code from these four functions unix_redir_to_connected() udp_redir_to_connected() inet_unix_redir_to_connected() unix_inet_redir_to_connected() to generate a new helper pairs_redir_to_connected(). Create the different socketpairs in these four functions, then pass the socketpairs info to the new common helper to do the connections. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Link: https://lore.kernel.org/r/54bb28dcf764e7d4227ab160883931d2173f4f3d.1696588133.git.geliang.tang@suse.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-06 11:25:22 -07:00
Andrii Nakryiko	0af3aace5b	selftests/bpf: Don't truncate #test/subtest field We currently expect up to a three-digit number of tests and subtests, so: #999/999: some_test/some_subtest: ... Is the largest test/subtest we can see. If we happen to cross into 1000s, current logic will just truncate everything after 7th character. This patch fixes this truncate and allows to go way higher (up to 31 characters in total). We still nicely align test numbers: #60/66 core_reloc_btfgen/type_based___incompat:OK #60/67 core_reloc_btfgen/type_based___fn_wrong_args:OK #60/68 core_reloc_btfgen/type_id:OK #60/69 core_reloc_btfgen/type_id___missing_targets:OK #60/70 core_reloc_btfgen/enumval:OK Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-3-andrii@kernel.org	2023-10-06 20:17:28 +02:00
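The truncation fixed by 0af3aace5b is the ordinary behaviour of formatting into a buffer that is too small. The sketch below shows the effect with snprintf(); `format_test_num()` and the buffer sizes used in the note are hypothetical and are not taken from the selftests source.

```c
#include <stddef.h>
#include <stdio.h>

/* Format "#<test>/<subtest>" into 'buf'. The return value is the length
 * the complete string needs, so a result >= bufsz means the output was
 * cut short. */
int format_test_num(char *buf, size_t bufsz, int test, int subtest)
{
	return snprintf(buf, bufsz, "#%d/%d", test, subtest);
}
```

With an 8-byte buffer, which holds 7 characters plus the terminator, formatting test 1000 and subtest 1000 stores `#1000/1` and returns 10; a 32-byte buffer holds the whole string.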
Andrii Nakryiko	46475cc0dd	selftests/bpf: Support building selftests in optimized -O2 mode Add support for building selftests with -O2 level of optimization, which allows more compiler warnings detection (like lots of potentially uninitialized usage), but also is useful to have a faster-running test for some CPU-intensive tests. One can build optimized versions of libbpf and selftests by running: $ make RELEASE=1 There is a measurable speed up of about 10 seconds for me locally, though it's mostly capped by non-parallelized serial tests. User CPU time goes down by total 40 seconds, from 1m10s to 0m28s. Unoptimized build (-O0) ======================= Summary: 430/3544 PASSED, 25 SKIPPED, 4 FAILED real 1m59.937s user 1m10.877s sys 3m14.880s Optimized build (-O2) ===================== Summary: 425/3543 PASSED, 25 SKIPPED, 9 FAILED real 1m50.540s user 0m28.406s sys 3m13.198s Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-2-andrii@kernel.org	2023-10-06 20:17:28 +02:00
Andrii Nakryiko	925a01577e	selftests/bpf: Fix compiler warnings reported in -O2 mode Fix a bunch of potentially uninitialized variable usage warnings that are reported by GCC in -O2 mode. Also silence overzealous stringop-truncation class of warnings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-1-andrii@kernel.org	2023-10-06 20:17:28 +02:00
Yafang Shao	bc5bc309db	bpf: Inherit system settings for CPU security mitigations Currently, there exists a system-wide setting related to CPU security mitigations, denoted as 'mitigations='. When set to 'mitigations=off', it deactivates all optional CPU mitigations. Therefore, if we implement a system-wide 'mitigations=off' setting, it should inherently bypass Spectre v1 and Spectre v4 in the BPF subsystem. Please note that there is also a more specific 'nospectre_v1' setting on x86 and ppc architectures, though it is not currently exported. For the time being, let's disregard more fine-grained options. This idea emerged during our discussion about potential Spectre v1 attacks with Luis [0]. [0] https://lore.kernel.org/bpf/b4fc15f7-b204-767e-ebb9-fdb4233961fb@iogearbox.net Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <song@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Cc: Luis Gerhorst <gerhorst@cs.fau.de> Link: https://lore.kernel.org/bpf/20231005084123.1338-1-laoar.shao@gmail.com	2023-10-06 20:16:44 +02:00
Akihiko Odaki	9c8c3fa3a5	bpf: Fix the comment for bpf_restore_data_end() The comment used to say: > Restore data saved by bpf_compute_data_pointers(). But bpf_compute_data_pointers() does not save the data; bpf_compute_and_save_data_end() does. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231005072137.29870-1-akihiko.odaki@daynix.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-05 22:33:50 -07:00
Geliang Tang	d549854bc5	selftests/bpf: Enable CONFIG_VSOCKETS in config CONFIG_VSOCKETS is required by BPF selftests, otherwise we get errors like this: ./test_progs:socket_loopback_reuseport:386: socket: Address family not supported by protocol socket_loopback_reuseport:FAIL:386 ./test_progs:vsock_unix_redir_connectible:1496: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1496 So this patch enables it in tools/testing/selftests/bpf/config. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Link: https://lore.kernel.org/r/472e73d285db2ea59aca9bbb95eb5d4048327588.1696490003.git.geliang.tang@suse.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-05 21:41:07 -07:00
Andrii Nakryiko	3157b7ce14	Merge branch 'selftest/bpf, riscv: Improved cross-building support' Björn Töpel says: ==================== From: Björn Töpel <bjorn@rivosinc.com> Yet another "more cross-building support for RISC-V" series. An example of how to invoke a gen_tar build: make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- CC=riscv64-linux-gnu-gcc HOSTCC=gcc O=/workspace/kbuild FORMAT= SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" -j $(($(nproc)-1)) -C tools/testing/selftests gen_tar Björn ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-10-04 13:37:56 -07:00
Björn Töpel	e096ab9d9f	selftests/bpf: Add uprobe_multi to gen_tar target The uprobe_multi program was not picked up for the gen_tar target. Fix by adding it to TEST_GEN_FILES. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-4-bjorn@kernel.org	2023-10-04 13:37:41 -07:00
Björn Töpel	72fae63199	selftests/bpf: Enable lld usage for RISC-V RISC-V has proper lld support. Use that, similar to what x86 does, for urandom_read et al. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-3-bjorn@kernel.org	2023-10-04 13:37:41 -07:00
Björn Töpel	97a79e502e	selftests/bpf: Add cross-build support for urandom_read et al Some userland programs in the BPF test suite, e.g. urandom_read, are missing cross-build support. Add cross-build support for these programs. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-2-bjorn@kernel.org	2023-10-04 13:36:50 -07:00
Andrii Nakryiko	cbcb199b7c	Merge branch 'libbpf/selftests syscall wrapper fixes for RISC-V' Björn Töpel says: ==================== From: Björn Töpel <bjorn@rivosinc.com> Commit `08d0ce30e0` ("riscv: Implement syscall wrappers") introduced some regressions in libbpf, and the kselftests BPF suite, which are fixed with these three patches. Note that there's an outstanding fix [1] for ftrace syscall tracing which is also a fallout from the commit above. Björn [1] https://lore.kernel.org/linux-riscv/20231003182407.32198-1-alexghiti@rivosinc.com/ Alexandre Ghiti (1): libbpf: Fix syscall access arguments on riscv ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-10-04 13:19:47 -07:00
Björn Töpel	b55b775f03	selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for riscv Add missing sys_nanosleep name for RISC-V, which is used by some tests (e.g. attach_probe). Fixes: `08d0ce30e0` ("riscv: Implement syscall wrappers") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-4-bjorn@kernel.org	2023-10-04 13:19:44 -07:00
Björn Töpel	0f2692ee43	selftests/bpf: Define SYS_PREFIX for riscv SYS_PREFIX was missing for RISC-V, which made a couple of kprobe tests fail. Add missing SYS_PREFIX for RISC-V. Fixes: `08d0ce30e0` ("riscv: Implement syscall wrappers") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-3-bjorn@kernel.org	2023-10-04 13:19:39 -07:00
Alexandre Ghiti	8a412c5c1c	libbpf: Fix syscall access arguments on riscv Since commit `08d0ce30e0` ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: `08d0ce30e0` ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org	2023-10-04 13:19:13 -07:00
Daniel Borkmann	93fb2776f4	Merge branch 'bpf-xsk-sh-umem' Tushar Vyavahare says: ==================== Implement a test for the SHARED_UMEM feature in this patch set and make necessary changes/improvements. Ensure that the framework now supports different streams for different sockets. v2->v3: - Set the sock_num at the end of the while loop. - Declare xsk at the top of the while loop. v1->v2: - Remove generate_mac_addresses() and generate mac addresses based on the number of sockets in __test_spec_init() function. [Magnus] - Update Makefile to include find_bit.c for compiling xskxceiver. - Add bitmap_full() function to verify all bits are set to break the while loop in the receive_pkts() and send_pkts() functions. - Replace __test_and_set_bit() function with __set_bit() function. - Add single return check for wait_for_tx_completion() function call. Patch series summary: 1: Move the packet stream from the ifobject struct to the xsk_socket_info struct to enable the use of different streams for different sockets. This will facilitate the sending and receiving of data from multiple sockets simultaneously using the SHARED_XDP_UMEM feature. It gives the flexibility to send/receive individual traffic on a particular socket. 2: Rename the header file to a generic name so that it can be used by all future XDP programs. 3: Move the src_mac and dst_mac fields from the ifobject structure to the xsk_socket_info structure to achieve per-socket MAC address assignment. Require this in order to steer traffic to various sockets in subsequent patches. 4: Improve the receive_pkt() function to enable it to receive packets from multiple sockets. Define a sock_num variable to iterate through all the sockets in the Rx path. Add nb_valid_entries to check that all the expected number of packets are received. 5: The pkt_set() function no longer needs the umem parameter. This commit removes the umem parameter from the pkt_set() function. 6: Iterate over all the sockets in the send pkts function. Update send_pkts() to handle multiple sockets for sending packets. Multiple TX sockets are utilized alternately based on the batch size to improve packet transmission. 7: Modify xsk_update_xskmap() to accept the index as an argument, enabling the addition of multiple sockets to xskmap. 8: Add a new test for testing shared umem feature. This is accomplished by adding a new XDP program and using the multiple sockets. The new XDP program redirects the packets based on the destination MAC address. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2023-10-04 15:26:18 +02:00
Tushar Vyavahare	6d198a89c0	selftests/xsk: Add a test for shared umem feature Add a new test for testing shared umem feature. This is accomplished by adding a new XDP program and using the multiple sockets. The new XDP program redirects the packets based on the destination MAC address. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-9-tushar.vyavahare@intel.com	2023-10-04 15:26:02 +02:00
Tushar Vyavahare	fc2cb86495	selftests/xsk: Modify xsk_update_xskmap() to accept the index as an argument Modify xsk_update_xskmap() to accept the index as an argument, enabling the addition of multiple sockets to xskmap. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-8-tushar.vyavahare@intel.com	2023-10-04 15:26:02 +02:00
Tushar Vyavahare	fd0815ae9b	selftests/xsk: Iterate over all the sockets in the send pkts function Update send_pkts() to handle multiple sockets for sending packets. Multiple TX sockets are utilized alternately based on the batch size to improve packet transmission. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-7-tushar.vyavahare@intel.com	2023-10-04 15:26:02 +02:00
Tushar Vyavahare	46e43786cc	selftests/xsk: Remove unnecessary parameter from pkt_set() function call The pkt_set() function no longer needs the umem parameter. This commit removes the umem parameter from the pkt_set() function. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-6-tushar.vyavahare@intel.com	2023-10-04 15:26:02 +02:00
Tushar Vyavahare	8913e653e9	selftests/xsk: Iterate over all the sockets in the receive pkts function Improve the receive_pkt() function to enable it to receive packets from multiple sockets. Define a sock_num variable to iterate through all the sockets in the Rx path. Add nb_valid_entries to check that all the expected number of packets are received. Revise the function __receive_pkts() to only inspect the receive ring once, handle any received packets, and promptly return. Implement a bitmap to store the value of number of sockets. Update Makefile to include find_bit.c for compiling xskxceiver. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-5-tushar.vyavahare@intel.com	2023-10-04 15:26:02 +02:00
Tushar Vyavahare	985fd2145a	selftests/xsk: Move src_mac and dst_mac to the xsk_socket_info Move the src_mac and dst_mac fields from the ifobject structure to the xsk_socket_info structure to achieve per-socket MAC address assignment. Require this in order to steer traffic to various sockets in subsequent patches. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-4-tushar.vyavahare@intel.com	2023-10-04 15:26:01 +02:00
Tushar Vyavahare	93ba112479	selftests/xsk: Rename xsk_xdp_metadata.h to xsk_xdp_common.h Rename the header file to a generic name so that it can be used by all future XDP programs. Ensure that the xsk_xdp_common.h header file includes include guards. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-3-tushar.vyavahare@intel.com	2023-10-04 15:26:01 +02:00
Tushar Vyavahare	8367eb954e	selftests/xsk: Move pkt_stream to the xsk_socket_info Move the packet stream from the ifobject struct to the xsk_socket_info struct to enable the use of different streams for different sockets. This will facilitate the sending and receiving of data from multiple sockets simultaneously using the SHARED_XDP_UMEM feature. Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20230927135241.2287547-2-tushar.vyavahare@intel.com	2023-10-04 15:26:01 +02:00
Hengqi Chen	2147c8d07e	libbpf: Allow Golang symbols in uprobe secdef Golang symbols in ELF files are different from C/C++ which contains special characters like '*', '(' and ')'. With generics, things get more complicated, there are symbols like: github.com/cilium/ebpf/internal.(*Deque[go.shape.interface { Format(fmt.State, int32); TypeName() string;github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type}]).Grow Match such symbols using `%m[^\n]` in sscanf; this excludes newline, which typically does not appear in ELF symbols. This should work in most use-cases and also work for unicode letters in identifiers. If newlines do show up in ELF symbols, users can still attach to such a symbol by specifying bpf_uprobe_opts::func_name. A working example can be found at this repo ([0]). [0]: https://github.com/chenhengqi/libbpf-go-symbols Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929155954.92448-1-hengqi.chen@gmail.com	2023-09-29 14:32:20 -07:00
Ruowen Qin	9e09b75079	samples/bpf: Add -fsanitize=bounds to userspace programs The sanitizer flag, which is supported by both clang and gcc, would make it easier to debug array index out-of-bounds problems in these programs. Make the Makefile smarter to detect ubsan support from the compiler and add the '-fsanitize=bounds' accordingly. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com> Signed-off-by: Jinghao Jia <jinghao7@illinois.edu> Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230927045030.224548-2-ruowenq2@illinois.edu	2023-09-28 09:31:05 -07:00
Andrii Nakryiko	0e73ef1d8c	Merge branch 'bpf: Add missed stats for kprobes' Jiri Olsa says: ==================== hi, at the moment we can't retrieve the number of missed kprobe executions and subsequent execution of BPF programs. This patchset adds: - counting of missed execution on attach layer for: . kprobes attached through perf link (kprobe/ftrace) . kprobes attached through kprobe.multi link (fprobe) - counting of recursion_misses for BPF kprobe programs It's still technically possible to create kprobe without perf link (using SET_BPF perf ioctl) in which case we don't have a way to retrieve the kprobe's 'missed' count. However both libbpf and cilium/ebpf libraries use perf link if it's available, and for old kernels without perf link support we can use BPF program to retrieve the kprobe missed count. v3 changes: - added acks [Song] - make test_missed not serial [Andrii] Also available at: https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git bpf/missed_stats thanks, jirka ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-25 16:37:53 -07:00
Jiri Olsa	85981e0f9e	selftests/bpf: Add test for recursion counts of perf event link tracepoint Adding selftest that puts kprobe on bpf_fentry_test1 that calls bpf_printk and invokes bpf_trace_printk tracepoint. The bpf_trace_printk tracepoint has test[234] programs attached to it. Because kprobe execution goes through bpf_prog_active check, programs attached to the tracepoint will fail the recursion check and increment the recursion_misses stats. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-10-jolsa@kernel.org	2023-09-25 16:37:45 -07:00
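
The "libbpf: Allow Golang symbols in uprobe secdef" entry above describes matching symbols with `%m[^\n]` in sscanf. The sketch below illustrates only that general technique and is not libbpf's actual parsing code: the helper name `parse_uprobe_spec` and the `path:function` input format are assumptions made for this example. It relies on the `%m` allocation modifier from POSIX.1-2008, which glibc supports.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libbpf): split "path:function" into two
 * heap-allocated strings. The function part is matched with %m[^\n], so it
 * may contain '*', '(', ')', '[', spaces and further ':' characters, as Go
 * symbols do. Only a newline ends the match.
 *
 * Returns 0 on success; the caller must free(*path) and free(*func).
 * Returns -1 on failure, with both outputs set to NULL.
 */
int parse_uprobe_spec(const char *spec, char **path, char **func)
{
	int n;

	*path = NULL;
	*func = NULL;

	/* %m makes sscanf allocate a buffer of the right size for each match. */
	n = sscanf(spec, "%m[^:]:%m[^\n]", path, func);
	if (n != 2) {
		/* A partial match may have allocated *path; free(NULL) is a no-op. */
		free(*path);
		free(*func);
		*path = NULL;
		*func = NULL;
		return -1;
	}
	return 0;
}
```

Calling `parse_uprobe_spec("/usr/bin/app:main.(*Deque[go.shape.int]).Grow", &path, &func)` leaves the whole Go-style name in `func`, since nothing but a newline terminates the second conversion. A path that itself contains ':' would be split at the wrong place, which is a limitation of this sketch.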


1216122 Commits