linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-26 05:34:13 +08:00

Author	SHA1	Message	Date
Roberto Sassu	865b0566d8	bpf: Add bpf_verify_pkcs7_signature() kfunc Add the bpf_verify_pkcs7_signature() kfunc, to give eBPF security modules the ability to check the validity of a signature against supplied data, by using user-provided or system-provided keys as trust anchor. The new kfunc makes it possible to enforce mandatory policies, as eBPF programs might be allowed to make security decisions only based on data sources the system administrator approves. The caller should provide the data to be verified and the signature as eBPF dynamic pointers (to minimize the number of parameters) and a bpf_key structure containing a reference to the keyring with keys trusted for signature verification, obtained from bpf_lookup_user_key() or bpf_lookup_system_key(). For bpf_key structures obtained from the former lookup function, bpf_verify_pkcs7_signature() completes the permission check deferred by that function by calling key_validate(). key_task_permission() is already called by the PKCS#7 code. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20220920075951.929132-9-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:49 -07:00
Roberto Sassu	f3cf4134c5	bpf: Add bpf_lookup_*_key() and bpf_key_put() kfuncs Add the bpf_lookup_user_key(), bpf_lookup_system_key() and bpf_key_put() kfuncs, to respectively search a key with a given key handle serial number and flags, obtain a key from a pre-determined ID defined in include/linux/verification.h, and cleanup. Introduce system_keyring_id_check() to validate the keyring ID parameter of bpf_lookup_system_key(). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20220920075951.929132-8-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:49 -07:00
Roberto Sassu	90fd8f26ed	KEYS: Move KEY_LOOKUP_ to include/linux/key.h and define KEY_LOOKUP_ALL In preparation for the patch that introduces the bpf_lookup_user_key() eBPF kfunc, move KEY_LOOKUP_ definitions to include/linux/key.h, to be able to validate the kfunc parameters. Add them to enum key_lookup_flag, so that all the current ones and the ones defined in the future are automatically exported through BTF and available to eBPF programs. Also, add KEY_LOOKUP_ALL to the enum, with the logical OR of currently defined flags as value, to facilitate checking whether a variable contains only those flags. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lore.kernel.org/r/20220920075951.929132-7-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
Roberto Sassu	51df486571	bpf: Export bpf_dynptr_get_size() Export bpf_dynptr_get_size(), so that kernel code dealing with eBPF dynamic pointers can obtain the real size of data carried by this data structure. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220920075951.929132-6-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
Roberto Sassu	b8d31762a0	btf: Allow dynamic pointer parameters in kfuncs Allow dynamic pointers (struct bpf_dynptr_kern *) to be specified as parameters in kfuncs. Also, ensure that dynamic pointers passed as argument are valid and initialized, are a pointer to the stack, and of the type local. More dynamic pointer types can be supported in the future. To properly detect whether a parameter is of the desired type, introduce the stringify_struct() macro to compare the returned structure name with the desired name. In addition, protect against structure renames, by halting the build with BUILD_BUG_ON(), so that developers have to revisit the code. To check if a dynamic pointer passed to the kfunc is valid and initialized, and if its type is local, export the existing functions is_dynptr_reg_valid_init() and is_dynptr_type_expected(). Cc: Joanne Koong <joannelkoong@gmail.com> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220920075951.929132-5-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
Roberto Sassu	e9e315b4a5	bpf: Move dynptr type check to is_dynptr_type_expected() Move dynptr type check to is_dynptr_type_expected() from is_dynptr_reg_valid_init(), so that callers can better determine the cause of a negative result (dynamic pointer not valid/initialized, dynamic pointer of the wrong type). It will be useful for example for BTF, to restrict which dynamic pointer types can be passed to kfuncs, as initially only the local type will be supported. Also, splitting makes the code more readable, since checking the dynamic pointer type is not necessarily related to validity and initialization. Split the validity/initialization and dynamic pointer type check also in the verifier, and adjust the expected error message in the test (a test for an unexpected dynptr type passed to a helper cannot be added due to missing suitable helpers, but this case has been tested manually). Cc: Joanne Koong <joannelkoong@gmail.com> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220920075951.929132-4-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
Roberto Sassu	00f146413c	btf: Export bpf_dynptr definition eBPF dynamic pointers is a new feature recently added to upstream. It binds together a pointer to a memory area and its size. The internal kernel structure bpf_dynptr_kern is not accessible by eBPF programs in user space. They instead see bpf_dynptr, which is then translated to the internal kernel structure by the eBPF verifier. The problem is that it is not possible to include at the same time the uapi include linux/bpf.h and the vmlinux BTF vmlinux.h, as they both contain the definition of some structures/enums. The compiler complains saying that the structures/enums are redefined. As bpf_dynptr is defined in the uapi include linux/bpf.h, this makes it impossible to include vmlinux.h. However, in some cases, e.g. when using kfuncs, vmlinux.h has to be included. The only option until now was to include vmlinux.h and add the definition of bpf_dynptr directly in the eBPF program source code from linux/bpf.h. Solve the problem by using the same approach as for bpf_timer (which also follows the same scheme with the _kern suffix for the internal kernel structure). Add the following line in one of the dynamic pointer helpers, bpf_dynptr_from_mem(): BTF_TYPE_EMIT(struct bpf_dynptr); Cc: stable@vger.kernel.org Cc: Joanne Koong <joannelkoong@gmail.com> Fixes: `97e03f5210` ("bpf: Add verifier support for dynptrs") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Tested-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20220920075951.929132-3-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
KP Singh	d15bf1501c	bpf: Allow kfuncs to be used in LSM programs In preparation for the addition of new kfuncs, allow kfuncs defined in the tracing subsystem to be used in LSM programs by mapping the LSM program type to the TRACING hook. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220920075951.929132-2-roberto.sassu@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-21 17:32:48 -07:00
Tao Chen	01f2e36c95	libbpf: Support raw BTF placed in the default search path Currently, the default vmlinux files at '/boot/vmlinux-', '/lib/modules//vmlinux-*' etc. are parsed with 'btf__parse_elf()' to extract BTF. It is possible that these files are actually raw BTF files similar to /sys/kernel/btf/vmlinux. So parse these files with 'btf__parse' which tries both raw format and ELF format. This might be useful in some scenarios where users put their custom BTF into known locations and don't want to specify btf_custom_path option. Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/3f59fb5a345d2e4f10e16fe9e35fbc4c03ecaa3e.1662999860.git.chentao.kernel@linux.alibaba.com	2022-09-21 17:26:16 -07:00
Yauheni Kaliuta	272d1f4cfa	selftests: bpf: test_kmod.sh: Pass parameters to the module It's possible to specify particular tests for test_bpf.ko with module parameters. Make it possible to pass the module parameters, example: test_kmod.sh test_range=1,3 Since magnitude tests take long time it can be reasonable to skip them. Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220908120146.381218-1-ykaliuta@redhat.com	2022-09-21 17:09:36 -07:00
Yonghong Song	9f2f5d7830	libbpf: Improve BPF_PROG2 macro code quality and description Commit `34586d29f8` ("libbpf: Add new BPF_PROG2 macro") added BPF_PROG2 macro for trampoline based programs with struct arguments. Andrii made a few suggestions to improve code quality and description. This patch implemented these suggestions including better internal macro name, consistent usage pattern for __builtin_choose_expr(), simpler macro definition for always-inline func arguments and better macro description. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20220910025214.1536510-1-yhs@fb.com	2022-09-21 17:05:31 -07:00
Andrii Nakryiko	c12a037667	Merge branch 'bpf: Add user-space-publisher ring buffer map type' David Vernet says: ==================== This patch set defines a new map type, BPF_MAP_TYPE_USER_RINGBUF, which provides single-user-space-producer / single-kernel-consumer semantics over a ring buffer. Along with the new map type, a helper function called bpf_user_ringbuf_drain() is added which allows a BPF program to specify a callback with the following signature, to which samples are posted by the helper: void (struct bpf_dynptr dynptr, void context); The program can then use the bpf_dynptr_read() or bpf_dynptr_data() helper functions to safely read the sample from the dynptr. There are currently no helpers available to determine the size of the sample, but one could easily be added if required. On the user-space side, libbpf has been updated to export a new 'struct ring_buffer_user' type, along with the following symbols: struct ring_buffer_user * ring_buffer_user__new(int map_fd, const struct ring_buffer_user_opts opts); void ring_buffer_user__free(struct ring_buffer_user rb); void ring_buffer_user__reserve(struct ring_buffer_user rb, uint32_t size); void ring_buffer_user__poll(struct ring_buffer_user rb, uint32_t size, int timeout_ms); void ring_buffer_user__discard(struct ring_buffer_user rb, void sample); void ring_buffer_user__submit(struct ring_buffer_user rb, void sample); These symbols are exported for inclusion in libbpf version 1.0.0. Signed-off-by: David Vernet <void@manifault.com> --- v5 -> v6: - Fixed s/BPF_MAP_TYPE_RINGBUF/BPF_MAP_TYPE_USER_RINGBUF typo in the libbpf user ringbuf doxygen header comment for ring_buffer_user__new() (Andrii). - Specify that pointer returned from ring_buffer_user__reserve() and its blocking counterpart is 8-byte aligned (Andrii). - Renamed user_ringbuf__commit() to user_ringbuf_commit(), as it's static (Andrii). - Another slight reworking of user_ring_buffer__reserve_blocking() to remove some extraneous nanosecond variables + checking (Andrii). - Add a final check of user_ring_buffer__reserve() in user_ring_buffer__reserve_blocking(). - Moved busy bit lock / unlock logic from __bpf_user_ringbuf_peek() to bpf_user_ringbuf_drain() (Andrii). - -ENOSPC -> -ENODATA for an empty ring buffer in __bpf_user_ringbuf_peek() (Andrii). - Updated BPF_RB_FORCE_WAKEUP to only force a wakeup notification to be sent if even if no sample was drained. - Changed a bit of the wording in the UAPI header for bpf_user_ringbuf_drain() to mention the BPF_RB_FORCE_WAKEUP behavior. - Remove extra space after return in ringbuf_map_poll_user() (Andrii). - Removed now-extraneous paragraph from the commit summary of patch 2/4 (Andrii). v4 -> v5: - DENYLISTed the user-ringbuf test suite on s390x. We have a number of functions in the progs/user_ringbuf_success.c prog that user-space fires by invoking a syscall. Not all of these syscalls are available on s390x. If and when we add the ability to kick the kernel from user-space, or if we end up using iterators for that per Hao's suggestion, we could re-enable this test suite on s390x. - Fixed a few more places that needed ringbuffer -> ring buffer. v3 -> v4: - Update BPF_MAX_USER_RINGBUF_SAMPLES to not specify a bit, and instead just specify a number of samples. (Andrii) - Update "ringbuffer" in comments and commit summaries to say "ring buffer". (Andrii) - Return -E2BIG from bpf_user_ringbuf_drain() both when a sample can't fit into the ring buffer, and when it can't fit into a dynptr. (Andrii) - Don't loop over samples in __bpf_user_ringbuf_peek() if a sample was discarded. Instead, return -EAGAIN so the caller can deal with it. Also updated the caller to detect -EAGAIN and skip over it when iterating. (Andrii) - Removed the heuristic for notifying user-space when a sample is drained, causing the ring buffer to no longer be full. This may be useful in the future, but is being removed now because it's strictly a heuristic. - Re-add BPF_RB_FORCE_WAKEUP flag to bpf_user_ringbuf_drain(). (Andrii) - Remove helper_allocated_dynptr tracker from verifier. (Andrii) - Add libbpf function header comments to tools/lib/bpf/libbpf.h, so that they will be included in rendered libbpf docs. (Andrii) - Add symbols to a new LIBBPF_1.1.0 section in linker version script, rather than including them in LIBBPF_1.0.0. (Andrii) - Remove libbpf_err() calls from static libbpf functions. (Andrii) - Check user_ring_buffer_opts instead of ring_buffer_opts in user_ring_buffer__new(). (Andrii) - Avoid an extra if in the hot path in user_ringbuf__commit(). (Andrii) - Use ENOSPC rather than ENODATA if no space is available in the ring buffer. (Andrii) - Don't round sample size in header to 8, but still round size that is reserved and written to 8, and validate positions are multiples of 8 (Andrii). - Use nanoseconds for most calculations in user_ring_buffer__reserve_blocking(). (Andrii) - Don't use CHECK() in testcases, instead use ASSERT_. (Andrii) - Use SEC("?raw_tp") instead of SEC("?raw_tp/sys_nanosleep") in negative test. (Andrii) - Move test_user_ringbuf.h header to live next to BPF program instead of a directory up from both it and the user-space test program. (Andrii) - Update bpftool help message / docs to also include user_ringbuf. v2 -> v3: - Lots of formatting fixes, such as keeping things on one line if they fit within 100 characters, and removing some extraneous newlines. Applies to all diffs in the patch-set. (Andrii) - Renamed ring_buffer_user__ symbols to user_ring_buffer__. (Andrii) - Added a missing smb_mb__before_atomic() in __bpf_user_ringbuf_sample_release(). (Hao) - Restructure how and when notification events are sent from the kernel to the user-space producers via the .map_poll() callback for the BPF_MAP_TYPE_USER_RINGBUF map. Before, we only sent a notification when the ringbuffer was fully drained. Now, we guarantee user-space that we'll send an event at least once per bpf_user_ringbuf_drain(), as long as at least one sample was drained, and BPF_RB_NO_WAKEUP was not passed. As a heuristic, we also send a notification event any time a sample being drained causes the ringbuffer to no longer be full. (Andrii) - Continuing on the above point, updated user_ring_buffer__reserve_blocking() to loop around epoll_wait() until a sufficiently large sample is found. (Andrii) - Communicate BPF_RINGBUF_BUSY_BIT and BPF_RINGBUF_DISCARD_BIT in sample headers. The ringbuffer implementation still only supports single-producer semantics, but we can now add synchronization support in user_ring_buffer__reserve(), and will automatically get multi-producer semantics. (Andrii) - Updated some commit summaries, specifically adding more details where warranted. (Andrii) - Improved function documentation for bpf_user_ringbuf_drain(), more clearly explaining all function arguments and return types, as well as the semantics for waking up user-space producers. - Add function header comments for user_ring_buffer__reserve{_blocking}(). (Andrii) - Rounding-up all samples to 8-bytes in the user-space producer, and enforcing that all samples are properly aligned in the kernel. (Andrii) - Added testcases that verify that bpf_user_ringbuf_drain() properly validates samples, and returns error conditions if any invalid samples are encountered. (Andrii) - Move atomic_t busy field out of the consumer page, and into the struct bpf_ringbuf. (Andrii) - Split ringbuf_map_{mmap, poll}_{kern, user}() into separate implementations. (Andrii) - Don't silently consume errors in bpf_user_ringbuf_drain(). (Andrii) - Remove magic number of samples (4096) from bpf_user_ringbuf_drain(), and instead use BPF_MAX_USER_RINGBUF_SAMPLES macro, which allows 128k samples. (Andrii) - Remove MEM_ALLOC modifier from PTR_TO_DYNPTR register in verifier, and instead rely solely on the register being PTR_TO_DYNPTR. (Andrii) - Move freeing of atomic_t busy bit to before we invoke irq_work_queue() in __bpf_user_ringbuf_sample_release(). (Andrii) - Only check for BPF_RB_NO_WAKEUP flag in bpf_ringbuf_drain(). - Remove libbpf function names from kernel smp_{load, store} comments in the kernel. (Andrii) - Don't use double-underscore naming convention in libbpf functions. (Andrii) - Use proper __u32 and __u64 for types where we need to guarantee their size. (Andrii) v1 -> v2: - Following Joanne landing `883743422c` ("bpf: Fix ref_obj_id for dynptr data slices in verifier") [0], removed [PATCH 1/5] bpf: Clear callee saved regs after updating REG0 [1]. (Joanne) - Following the above adjustment, updated check_helper_call() to not store a reference for bpf_dynptr_data() if the register containing the dynptr is of type MEM_ALLOC. (Joanne) - Fixed casting issue pointed out by kernel test robot by adding a missing (uintptr_t) cast. (lkp) [0] https://lore.kernel.org/all/20220809214055.4050604-1-joannelkoong@gmail.com/ [1] https://lore.kernel.org/all/20220808155341.2479054-1-void@manifault.com/ ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-09-21 16:25:03 -07:00
David Vernet	e5a9df51c7	selftests/bpf: Add selftests validating the user ringbuf This change includes selftests that validate the expected behavior and APIs of the new BPF_MAP_TYPE_USER_RINGBUF map type. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-5-void@manifault.com	2022-09-21 16:25:03 -07:00
David Vernet	b66ccae01f	bpf: Add libbpf logic for user-space ring buffer Now that all of the logic is in place in the kernel to support user-space produced ring buffers, we can add the user-space logic to libbpf. This patch therefore adds the following public symbols to libbpf: struct user_ring_buffer * user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts opts); void user_ring_buffer__reserve(struct user_ring_buffer rb, __u32 size); void user_ring_buffer__reserve_blocking(struct user_ring_buffer rb, __u32 size, int timeout_ms); void user_ring_buffer__submit(struct user_ring_buffer rb, void sample); void user_ring_buffer__discard(struct user_ring_buffer rb, void user_ring_buffer__free(struct user_ring_buffer rb); A user-space producer must first create a struct user_ring_buffer object with user_ring_buffer__new(), and can then reserve samples in the ring buffer using one of the following two symbols: void user_ring_buffer__reserve(struct user_ring_buffer rb, __u32 size); void user_ring_buffer__reserve_blocking(struct user_ring_buffer rb, __u32 size, int timeout_ms); With user_ring_buffer__reserve(), a pointer to a 'size' region of the ring buffer will be returned if sufficient space is available in the buffer. user_ring_buffer__reserve_blocking() provides similar semantics, but will block for up to 'timeout_ms' in epoll_wait if there is insufficient space in the buffer. This function has the guarantee from the kernel that it will receive at least one event-notification per invocation to bpf_ringbuf_drain(), provided that at least one sample is drained, and the BPF program did not pass the BPF_RB_NO_WAKEUP flag to bpf_ringbuf_drain(). Once a sample is reserved, it must either be committed to the ring buffer with user_ring_buffer__submit(), or discarded with user_ring_buffer__discard(). Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-4-void@manifault.com	2022-09-21 16:25:03 -07:00
David Vernet	2057156738	bpf: Add bpf_user_ringbuf_drain() helper In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which will allow user-space applications to publish messages to a ring buffer that is consumed by a BPF program in kernel-space. In order for this map-type to be useful, it will require a BPF helper function that BPF programs can invoke to drain samples from the ring buffer, and invoke callbacks on those samples. This change adds that capability via a new BPF helper function: bpf_user_ringbuf_drain(struct bpf_map map, void callback_fn, void ctx, u64 flags) BPF programs may invoke this function to run callback_fn() on a series of samples in the ring buffer. callback_fn() has the following signature: long callback_fn(struct bpf_dynptr dynptr, void context); Samples are provided to the callback in the form of struct bpf_dynptr 's, which the program can read using BPF helper functions for querying struct bpf_dynptr's. In order to support bpf_ringbuf_drain(), a new PTR_TO_DYNPTR register type is added to the verifier to reflect a dynptr that was allocated by a helper function and passed to a BPF program. Unlike PTR_TO_STACK dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR dynptrs need not use reference tracking, as the BPF helper is trusted to properly free the dynptr before returning. The verifier currently only supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL. Note that while the corresponding user-space libbpf logic will be added in a subsequent patch, this patch does contain an implementation of the .map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This .map_poll() callback guarantees that an epoll-waiting user-space producer will receive at least one event notification whenever at least one sample is drained in an invocation of bpf_user_ringbuf_drain(), provided that the function is not invoked with the BPF_RB_NO_WAKEUP flag. If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup notification is sent even if no sample was drained. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-3-void@manifault.com	2022-09-21 16:24:58 -07:00
David Vernet	583c1f4201	bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type We want to support a ringbuf map type where samples are published from user-space, to be consumed by BPF programs. BPF currently supports a kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF map type. We'll need to define a new map type for user-space -> kernel, as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply to a user-space producer ring buffer, and we'll want to add one or more helper functions that would not apply for a kernel-producer ring buffer. This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type definition. The map type is useless in its current form, as there is no way to access or use it for anything until we one or more BPF helpers. A follow-on patch will therefore add a new helper function that allows BPF programs to run callbacks on samples that are published to the ring buffer. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com	2022-09-21 16:24:17 -07:00
William Dean	3a74904cef	bpf: simplify code in btf_parse_hdr It could directly return 'btf_check_sec_info' to simplify code. Signed-off-by: William Dean <williamsukatube@163.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220917084248.3649-1-williamsukatube@163.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-21 10:28:46 -07:00
Xin Liu	7620bffbf7	libbpf: Fix NULL pointer exception in API btf_dump__dump_type_data We found that function btf_dump__dump_type_data can be called by the user as an API, but in this function, the `opts` parameter may be used as a null pointer.This causes `opts->indent_str` to trigger a NULL pointer exception. Fixes: `2ce8450ef5` ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts") Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Weibin Kong <kongweibin2@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220917084809.30770-1-liuxin350@huawei.com	2022-09-20 17:34:09 -07:00
Rong Tao	bc069da65e	samples/bpf: Replace blk_account_io_done() with __blk_account_io_done() Since commit `be6bfe36db` ("block: inline hot paths of blk_account_io_()") blk_account_io_() become inline functions. Signed-off-by: Rong Tao <rtoax@foxmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/tencent_1CC476835C219FACD84B6715F0D785517E07@qq.com	2022-09-20 17:25:59 -07:00
Martin KaFai Lau	bfa8fe95ff	Merge branch 'bpf: Small nf_conn cleanups' Daniel Xu says: ==================== This patchset cleans up a few small things: * Delete unused stub * Rename variable to be more descriptive * Fix some `extern` declaration warnings Past discussion: - v2: https://lore.kernel.org/bpf/cover.1663616584.git.dxu@dxuuu.xyz/ Changes since v2: - Remove unused #include's - Move #include <linux/filter.h> to .c ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-20 14:41:38 -07:00
Daniel Xu	fdf214978a	bpf: Move nf_conn extern declarations to filter.h We're seeing the following new warnings on netdev/build_32bit and netdev/build_allmodconfig_warn CI jobs: ../net/core/filter.c:8608:1: warning: symbol 'nf_conn_btf_access_lock' was not declared. Should it be static? ../net/core/filter.c:8611:5: warning: symbol 'nfct_bsa' was not declared. Should it be static? Fix by ensuring extern declaration is present while compiling filter.o. Fixes: `864b656f82` ("bpf: Add support for writing to nf_conn:mark") Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/2bd2e0283df36d8a4119605878edb1838d144174.1663683114.git.dxu@dxuuu.xyz Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-20 14:41:35 -07:00
Daniel Xu	5a090aa350	bpf: Rename nfct_bsa to nfct_btf_struct_access The former name was a little hard to guess. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/73adc72385c8b162391fbfb404f0b6d4c5cc55d7.1663683114.git.dxu@dxuuu.xyz Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-20 14:30:34 -07:00
Daniel Xu	52bdae37c9	bpf: Remove unused btf_struct_access stub This stub was not being used anywhere. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/590e7bd6172ffe0f3d7b51cd40e8ded941aaf7e8.1663683114.git.dxu@dxuuu.xyz Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-20 14:30:34 -07:00
Hou Tao	c31b38cb94	bpf: Check whether or not node is NULL before free it in free_bulk llnode could be NULL if there are new allocations after the checking of c-free_cnt > c->high_watermark in bpf_mem_refill() and before the calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel or allocation in NMI context). And it will incur oops as shown below: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT_RT SMP CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G W 6.0.0-rc6-rt9+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:bpf_mem_refill+0x66/0x130 ...... Call Trace: <TASK> irq_work_single+0x24/0x60 irq_work_run_list+0x24/0x30 run_irq_workd+0x18/0x20 smpboot_thread_fn+0x13f/0x2c0 kthread+0x121/0x140 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Simply fixing it by checking whether or not llnode is NULL in free_bulk(). Fixes: `8d5a8011b3` ("bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220919144811.3570825-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-20 08:06:27 -07:00
Hou Tao	a7e85406bd	selftests/bpf: Add test result messages for test_task_storage_map_stress_lookup Add test result message when test_task_storage_map_stress_lookup() succeeds or is skipped. The test case can be skipped due to the choose of preemption model in kernel config, so export skips in test_maps.c and increase it when needed. The following is the output of test_maps when the test case succeeds or is skipped: test_task_storage_map_stress_lookup:PASS test_maps: OK, 0 SKIPPED test_task_storage_map_stress_lookup SKIP (no CONFIG_PREEMPT) test_maps: OK, 1 SKIPPED Fixes: `73b97bc78b` ("selftests/bpf: Test concurrent updates on bpf_task_storage_busy") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220919035714.2195144-1-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-19 11:17:38 -07:00
Peilin Ye	571f9738bf	bpf/btf: Use btf_type_str() whenever possible We have btf_type_str(). Use it whenever possible in btf.c, instead of "btf_kind_str[BTF_INFO_KIND(t->info)]". Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Link: https://lore.kernel.org/r/20220916202800.31421-1-yepeilin.cs@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-16 17:09:38 -07:00
Xin Liu	dc567045f1	libbpf: Clean up legacy bpf maps declaration in bpf_helpers Legacy BPF map declarations are no longer supported in libbpf v1.0 [0]. Only BTF-defined maps are supported starting from v1.0, so it is time to remove the definition of bpf_map_def in bpf_helpers.h. [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0 Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20220913073643.19960-1-liuxin350@huawei.com	2022-09-16 22:56:09 +02:00
Andrii Nakryiko	c8bc5e0509	selftests/bpf: Add veristat tool for mass-verifying BPF object files Add a small tool, veristat, that allows mass-verification of a set of libbpf-compatible BPF ELF object files. For each such object file, veristat will attempt to verify each BPF program individually. Regardless of success or failure, it parses BPF verifier stats and outputs them in human-readable table format. In the future we can also add CSV and JSON output for more scriptable post-processing, if necessary. veristat allows to specify a set of stats that should be output and ordering between multiple objects and files (e.g., so that one can easily order by total instructions processed, instead of default file name, prog name, verdict, total instructions order). This tool should be useful for validating various BPF verifier changes or even validating different kernel versions for regressions. Here's an example for some of the heaviest selftests/bpf BPF object files: $ sudo ./veristat -s insns,file,prog {pyperf,loop,test_verif_scale,strobemeta,test_cls_redirect,profiler}*.linked3.o File Program Verdict Duration, us Total insns Total states Peak states ------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ ----------- loop3.linked3.o while_true failure 350990 1000001 9663 9663 test_verif_scale3.linked3.o balancer_ingress success 115244 845499 8636 2141 test_verif_scale2.linked3.o balancer_ingress success 77688 773445 3048 788 pyperf600.linked3.o on_event success 2079872 624585 30335 30241 pyperf600_nounroll.linked3.o on_event success 353972 568128 37101 2115 strobemeta.linked3.o on_event success 455230 557149 15915 13537 test_verif_scale1.linked3.o balancer_ingress success 89880 554754 8636 2141 strobemeta_nounroll2.linked3.o on_event success 433906 501725 17087 1912 loop6.linked3.o trace_virtqueue_add_sgs success 282205 398057 8717 919 loop1.linked3.o nested_loops success 125630 361349 5504 5504 pyperf180.linked3.o on_event success 2511740 160398 11470 11446 pyperf100.linked3.o on_event success 744329 87681 6213 6191 test_cls_redirect.linked3.o cls_redirect success 54087 78925 4782 903 strobemeta_subprogs.linked3.o on_event success 57898 65420 1954 403 test_cls_redirect_subprogs.linked3.o cls_redirect success 54522 64965 4619 958 strobemeta_nounroll1.linked3.o on_event success 43313 57240 1757 382 pyperf50.linked3.o on_event success 194355 46378 3263 3241 profiler2.linked3.o tracepoint__syscalls__sys_enter_kill success 23869 43372 1423 542 pyperf_subprogs.linked3.o on_event success 29179 36358 2499 2499 profiler1.linked3.o tracepoint__syscalls__sys_enter_kill success 13052 27036 1946 936 profiler3.linked3.o tracepoint__syscalls__sys_enter_kill success 21023 26016 2186 915 profiler2.linked3.o kprobe__vfs_link success 5255 13896 303 271 profiler1.linked3.o kprobe__vfs_link success 7792 12687 1042 1041 profiler3.linked3.o kprobe__vfs_link success 7332 10601 865 865 profiler2.linked3.o kprobe_ret__do_filp_open success 3417 8900 216 199 profiler2.linked3.o kprobe__vfs_symlink success 3548 8775 203 186 pyperf_global.linked3.o on_event success 10007 7563 520 520 profiler3.linked3.o kprobe_ret__do_filp_open success 4708 6464 532 532 profiler1.linked3.o kprobe_ret__do_filp_open success 3090 6445 508 508 profiler3.linked3.o kprobe__vfs_symlink success 4477 6358 521 521 profiler1.linked3.o kprobe__vfs_symlink success 3381 6347 507 507 profiler2.linked3.o raw_tracepoint__sched_process_exec success 2464 5874 292 189 profiler3.linked3.o raw_tracepoint__sched_process_exec success 2677 4363 397 283 profiler2.linked3.o kprobe__proc_sys_write success 1800 4355 143 138 profiler1.linked3.o raw_tracepoint__sched_process_exec success 1649 4019 333 240 pyperf600_bpf_loop.linked3.o on_event success 2711 3966 306 306 profiler2.linked3.o raw_tracepoint__sched_process_exit success 1234 3138 83 66 profiler3.linked3.o kprobe__proc_sys_write success 1755 2623 223 223 profiler1.linked3.o kprobe__proc_sys_write success 1222 2456 193 193 loop2.linked3.o while_true success 608 1783 57 30 profiler3.linked3.o raw_tracepoint__sched_process_exit success 789 1680 146 146 profiler1.linked3.o raw_tracepoint__sched_process_exit success 592 1526 133 133 strobemeta_bpf_loop.linked3.o on_event success 1015 1512 106 106 loop4.linked3.o combinations success 165 524 18 17 profiler3.linked3.o raw_tracepoint__sched_process_fork success 196 299 25 25 profiler1.linked3.o raw_tracepoint__sched_process_fork success 109 265 19 19 profiler2.linked3.o raw_tracepoint__sched_process_fork success 111 265 19 19 loop5.linked3.o while_true success 47 84 9 9 ------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ ----------- Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-4-andrii@kernel.org	2022-09-16 22:41:41 +02:00
Andrii Nakryiko	749c202cb6	libbpf: Fix crash if SEC("freplace") programs don't have attach_prog_fd set Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF for freplace programs. It's wrong to search in vmlinux BTF and libbpf doesn't even mark vmlinux BTF as required for freplace programs. So trying to search anything in obj->vmlinux_btf might cause NULL dereference if nothing else in BPF object requires vmlinux BTF. Instead, error out if freplace (EXT) program doesn't specify attach_prog_fd during at the load time. Fixes: `91abb4a6d7` ("libbpf: Support attachment of BPF tracing programs to kernel modules") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org	2022-09-16 22:39:37 +02:00
Andrii Nakryiko	cf060c2c39	selftests/bpf: Fix test_verif_scale{1,3} SEC() annotations Use proper SEC("tc") for test_verif_scale{1,3} programs. It's not a problem for selftests right now because we manually set type programmatically, but not having correct SEC() definitions makes it harded to generically load BPF object files. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-2-andrii@kernel.org	2022-09-16 22:39:36 +02:00
Jiri Olsa	ceea991a01	bpf: Move bpf_dispatcher function out of ftrace locations The dispatcher function is attached/detached to trampoline by dispatcher update function. At the same time it's available as ftrace attachable function. After discussion [1] the proposed solution is to use compiler attributes to alter bpf_dispatcher_##name##_func function: - remove it from being instrumented with __no_instrument_function__ attribute, so ftrace has no track of it - but still generate 5 nop instructions with patchable_function_entry(5) attribute, which are expected by bpf_arch_text_poke used by dispatcher update function Enabling HAVE_DYNAMIC_FTRACE_NO_PATCHABLE option for x86, so __patchable_function_entries functions are not part of ftrace/mcount locations. Adding attributes to bpf_dispatcher_XXX function on x86_64 so it's kept out of ftrace locations and has 5 byte nop generated at entry. These attributes need to be arch specific as pointed out by Ilya Leoshkevic in here [2]. The dispatcher image is generated only for x86_64 arch, so the code can stay as is for other archs. [1] https://lore.kernel.org/bpf/20220722110811.124515-1-jolsa@kernel.org/ [2] https://lore.kernel.org/bpf/969a14281a7791c334d476825863ee449964dd0c.camel@linux.ibm.com/ Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20220903131154.420467-3-jolsa@kernel.org	2022-09-16 22:23:20 +02:00
Peter Zijlstra (Intel)	9440155ccb	ftrace: Add HAVE_DYNAMIC_FTRACE_NO_PATCHABLE x86 will shortly start using -fpatchable-function-entry for purposes other than ftrace, make sure the __patchable_function_entry section isn't merged in the mcount_loc section. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220903131154.420467-2-jolsa@kernel.org	2022-09-16 22:16:48 +02:00
Yauheni Kaliuta	bfeb7e399b	bpf: Use bpf_capable() instead of CAP_SYS_ADMIN for blinding decision The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays. These days given unprivileged BPF is disabled by default, the main users for constant blinding coming from unprivileged in particular via cBPF -> eBPF migration (e.g. old-style socket filters). Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com	2022-09-16 22:11:57 +02:00
Wang Yufen	a02c118ee9	bpf: use kvmemdup_bpfptr helper Use kvmemdup_bpfptr helper instead of open-coding to simplify the code. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/1663058433-14089-1-git-send-email-wangyufen@huawei.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-16 11:58:18 -07:00
Dave Marchevsky	47e34cb74d	bpf: Add verifier check for BPF_PTR_POISON retval and arg BPF_PTR_POISON was added in commit `c0a5a21c25` ("bpf: Allow storing referenced kptr in map") to denote a bpf_func_proto btf_id which the verifier will replace with a dynamically-determined btf_id at verification time. This patch adds verifier 'poison' functionality to BPF_PTR_POISON in order to prepare for expanded use of the value to poison ret- and arg-btf_id in ongoing work, namely rbtree and linked list patchsets [0, 1]. Specifically, when the verifier checks helper calls, it assumes that BPF_PTR_POISON'ed ret type will be replaced with a valid type before - or in lieu of - the default ret_btf_id logic. Similarly for arg btf_id. If poisoned btf_id reaches default handling block for either, consider this a verifier internal error and fail verification. Otherwise a helper w/ poisoned btf_id but no verifier logic replacing the type will cause a crash as the invalid pointer is dereferenced. Also move BPF_PTR_POISON to existing include/linux/posion.h header and remove unnecessary shift. [0]: lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com [1]: lore.kernel.org/bpf/20220904204145.3089-1-memxor@gmail.com Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220912154544.1398199-1-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-15 02:44:07 -07:00
Dave Marchevsky	1bfe26fb08	bpf: Add verifier support for custom callback return range Verifier logic to confirm that a callback function returns 0 or 1 was added in commit `69c087ba62` ("bpf: Add bpf_for_each_map_elem() helper"). At the time, callback return value was only used to continue or stop iteration. In order to support callbacks with a broader return value range, such as those added in rbtree series[0] and others, add a callback_ret_range to bpf_func_state. Verifier's helpers which set in_callback_fn will also set the new field, which the verifier will later use to check return value bounds. Default to tnum_range(0, 0) instead of using tnum_unknown as a sentinel value as the latter would prevent the valid range (0, U64_MAX) being used. Previous global default tnum_range(0, 1) is explicitly set for extant callback helpers. The change to global default was made after discussion around this patch in rbtree series [1], goal here is to make it more obvious that callback_ret_range should be explicitly set. [0]: lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com/ [1]: lore.kernel.org/bpf/20220830172759.4069786-2-davemarchevsky@fb.com/ Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220908230716.2751723-1-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 18:14:50 -07:00
Lorenzo Bianconi	f7c946f288	selftests/bpf: fix ct status check in bpf_nf selftests Check properly the connection tracking entry status configured running bpf_ct_change_status kfunc. Remove unnecessary IPS_CONFIRMED status configuration since it is already done during entry allocation. Fixes: `6eb7fba007` ("selftests/bpf: Add tests for new nf_conntrack kfuncs") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/813a5161a71911378dfac8770ec890428e4998aa.1662623574.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:40:46 -07:00
Alexei Starovoitov	b8c62fe202	Merge branch 'Support direct writes to nf_conn:mark' Daniel Xu says: ==================== Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code. Past discussion: - v4: https://lore.kernel.org/bpf/cover.1661192455.git.dxu@dxuuu.xyz/ - v3: https://lore.kernel.org/bpf/cover.1660951028.git.dxu@dxuuu.xyz/ - v2: https://lore.kernel.org/bpf/CAP01T74Sgn354dXGiFWFryu4vg+o8b9s9La1d9zEbC4LGvH4qg@mail.gmail.com/T/ - v1: https://lore.kernel.org/bpf/cover.1660592020.git.dxu@dxuuu.xyz/ Changes since v4: - Use exported function pointer + mutex to handle CONFIG_NF_CONNTRACK=m case Changes since v3: - Use a mutex to protect module load/unload critical section Changes since v2: - Remove use of NOT_INIT for btf_struct_access write path - Disallow nf_conn writing when nf_conntrack module not loaded - Support writing to nf_conn___init:mark Changes since v1: - Add unimplemented stub for when !CONFIG_BPF_SYSCALL ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:33 -07:00
Daniel Xu	e2d75e954c	selftests/bpf: Add tests for writing to nf_conn:mark Add a simple extension to the existing selftest to write to nf_conn:mark. Also add a failure test for writing to unsupported field. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/f78966b81b9349d2b8ebb4cee2caf15cb6b38ee2.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Daniel Xu	864b656f82	bpf: Add support for writing to nf_conn:mark Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/ebca06dea366e3e7e861c12f375a548cc4c61108.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Daniel Xu	84c6ac417c	bpf: Export btf_type_by_id() and bpf_log() These symbols will be used in nf_conntrack.ko to support direct writes to `nf_conn`. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/3c98c19dc50d3b18ea5eca135b4fc3a5db036060.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Daniel Xu	896f07c07d	bpf: Use 0 instead of NOT_INIT for btf_struct_access() writes Returning a bpf_reg_type only makes sense in the context of a BPF_READ. For writes, prefer to explicitly return 0 for clarity. Note that is non-functional change as it just so happened that NOT_INIT == 0. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/01772bc1455ae16600796ac78c6cc9fff34f95ff.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Daniel Xu	d4f7bdb2ed	bpf: Add stub for btf_struct_access() Add corresponding unimplemented stub for when CONFIG_BPF_SYSCALL=n Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/4021398e884433b1fef57a4d28361bb9fcf1bd05.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Daniel Xu	65269888c6	bpf: Remove duplicate PTR_TO_BTF_ID RO check Since commit `27ae7997a6` ("bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS") there has existed bpf_verifier_ops:btf_struct_access. When btf_struct_access is _unset_ for a prog type, the verifier runs the default implementation, which is to enforce read only: if (env->ops->btf_struct_access) { [...] } else { if (atype != BPF_READ) { verbose(env, "only read is supported\n"); return -EACCES; } [...] } When btf_struct_access is _set_, the expectation is that btf_struct_access has full control over accesses, including if writes are allowed. Rather than carve out an exception for each prog type that may write to BTF ptrs, delete the redundant check and give full control to btf_struct_access. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/962da2bff1238746589e332ff1aecc49403cd7ce.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 17:27:32 -07:00
Punit Agrawal	57c92f11a2	bpf: Simplify code by using for_each_cpu_wrap() In the percpu freelist code, it is a common pattern to iterate over the possible CPUs mask starting with the current CPU. The pattern is implemented using a hand rolled while loop with the loop variable increment being open-coded. Simplify the code by using for_each_cpu_wrap() helper to iterate over the possible cpus starting with the current CPU. As a result, some of the special-casing in the loop also gets simplified. No functional change intended. Signed-off-by: Punit Agrawal <punit.agrawal@bytedance.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20220907155746.1750329-1-punit.agrawal@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 16:18:55 -07:00
Tetsuo Handa	cf7de6a536	bpf: add missing percpu_counter_destroy() in htab_map_alloc() syzbot is reporting ODEBUG bug in htab_map_alloc() [1], for commit `86fe28f769` ("bpf: Optimize element count in non-preallocated hash map.") added percpu_counter_init() to htab_map_alloc() but forgot to add percpu_counter_destroy() to the error path. Link: https://syzkaller.appspot.com/bug?extid=5d1da78b375c3b5e6c2b [1] Reported-by: syzbot <syzbot+5d1da78b375c3b5e6c2b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `86fe28f769` ("bpf: Optimize element count in non-preallocated hash map.") Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/e2e4cc0e-9d36-4ca1-9bfa-ce23e6f8310b@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-10 16:04:07 -07:00
Martin KaFai Lau	2fae67716b	Merge branch 'cgroup/connect{4,6} programs for unprivileged ICMP ping' YiFei Zhu says: ==================== Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This series adds the invocation for cgroup/connect{4,6} programs to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW). This also adds a test to verify that the hooks are being called and invoking bpf_bind() from within the hook actually binds the socket. Patch 1 adds the invocation of the hook. Patch 2 deduplicates write_sysctl in BPF test_progs. Patch 3 adds the tests for this hook. v1 -> v2: * Added static to bindaddr_v6 in prog_tests/connect_ping.c * Deduplicated much of the test logic in prog_tests/connect_ping.c * Deduplicated write_sysctl() to test_progs.c v2 -> v3: * Renamed variable "obj" to "skel" for the BPF skeleton object in prog_tests/connect_ping.c v3 -> v4: * Fixed error path to destroy skel in prog_tests/connect_ping.c ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-09 10:40:46 -07:00
YiFei Zhu	58c449a969	selftests/bpf: Ensure cgroup/connect{4,6} programs can bind unpriv ICMP ping This tests that when an unprivileged ICMP ping socket connects, the hooks are actually invoked. We also ensure that if the hook does not call bpf_bind(), the bound address is unmodified, and if the hook calls bpf_bind(), the bound address is exactly what we provided to the helper. A new netns is used to enable ping_group_range in the test without affecting ouside of the test, because by default, not even root is permitted to use unprivileged ICMP ping... Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/086b227c1b97f4e94193e58aae7576d0261b68a4.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-09 10:40:45 -07:00
YiFei Zhu	e42921c3c3	selftests/bpf: Deduplicate write_sysctl() to test_progs.c This helper is needed in multiple tests. Instead of copying it over and over, better to deduplicate this helper to test_progs.c. test_progs.c is chosen over testing_helpers.c because of this helper's use of CHECK / ASSERT_, and the CHECK was modified to use ASSERT_ so it does not rely on a duration variable. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/9b4fc9a27bd52f771b657b4c4090fc8d61f3a6b5.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-09 10:40:45 -07:00
YiFei Zhu	0ffe241253	bpf: Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This adds the hook invocation to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW. Logic is mirrored from UDP sockets where the hook is invoked during pre_connect, after a check for suficiently sized addr_len. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/5764914c252fad4cd134fb6664c6ede95f409412.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-09 10:40:45 -07:00

1 2 3 4 5 ...

1123125 Commits