iproute2/include/bpf_elf.h

54 lines
1.2 KiB
C
Raw Normal View History

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __BPF_ELF__
#define __BPF_ELF__
#include <asm/types.h>
/* Note:
*
* Below ELF section names and bpf_elf_map structure definition
* are not (!) kernel ABI. It's rather a "contract" between the
* application and the BPF loader in tc. For compatibility, the
* section names should stay as-is. Introduction of aliases, if
* needed, are a possibility, though.
*/
/* ELF section names, etc */
#define ELF_SECTION_LICENSE "license"
#define ELF_SECTION_MAPS "maps"
#define ELF_SECTION_PROG "prog"
#define ELF_SECTION_CLASSIFIER "classifier"
#define ELF_SECTION_ACTION "action"
#define ELF_MAX_MAPS 64
#define ELF_MAX_LICENSE_LEN 128
{f,m}_bpf: allow for sharing maps This larger work addresses one of the bigger remaining issues on tc's eBPF frontend, that is, to allow for persistent file descriptors. Whenever tc parses the ELF object, extracts and loads maps into the kernel, these file descriptors will be out of reach after the tc instance exits. Meaning, for simple (unnested) programs which contain one or multiple maps, the kernel holds a reference, and they will live on inside the kernel until the program holding them is unloaded, but they will be out of reach for user space, even worse with (also multiple nested) tail calls. For this issue, we introduced the concept of an agent that can receive the set of file descriptors from the tc instance creating them, in order to be able to further inspect/update map data for a specific use case. However, while that is more tied towards specific applications, it still doesn't easily allow for sharing maps accross multiple tc instances and would require a daemon to be running in the background. F.e. when a map should be shared by two eBPF programs, one attached to ingress, one to egress, this currently doesn't work with the tc frontend. This work solves exactly that, i.e. if requested, maps can now be _arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within a single object (but various program sections, PIN_OBJECT_NS) without "loosing" the file descriptor set. To make that happen, we use eBPF object pinning introduced in kernel commit b2197755b263 ("bpf: add support for persistent maps/progs") for exactly this purpose. The shipped examples/bpf/bpf_shared.c code from this patch can be easily applied, for instance, as: - classifier-classifier shared: tc filter add dev foo parent 1: bpf obj shared.o sec egress tc filter add dev foo parent ffff: bpf obj shared.o sec ingress - classifier-action shared (here: late binding to a dummy classifier): tc actions add action bpf obj shared.o sec egress pass index 42 tc filter add dev foo parent ffff: bpf obj shared.o sec ingress tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \ action bpf index 42 The toy example increments a shared counter on egress and dumps its value on ingress (if no sharing (PIN_NONE) would have been chosen, map value is 0, of course, due to the two map instances being created): [...] <idle>-0 [002] ..s. 38264.788234: : map val: 4 <idle>-0 [002] ..s. 38264.788919: : map val: 4 <idle>-0 [002] ..s. 38264.789599: : map val: 5 [...] ... thus if both sections reference the pinned map(s) in question, tc will take care of fetching the appropriate file descriptor. The patch has been tested extensively on both, classifier and action sides. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2015-11-13 07:39:29 +08:00
/* Object pinning settings */
#define PIN_NONE 0
#define PIN_OBJECT_NS 1
#define PIN_GLOBAL_NS 2
/* ELF map definition */
struct bpf_elf_map {
__u32 type;
__u32 size_key;
__u32 size_value;
__u32 max_elem;
__u32 flags;
__u32 id;
__u32 pinning;
__u32 inner_id;
__u32 inner_idx;
};
bpf: implement btf handling and map annotation Implement loading of .BTF section from object file and build up internal table for retrieving key/value id related to maps in the BPF program. Latter is done by setting up struct btf_type table. One of the issues is that there's a disconnect between the data types used in the map and struct bpf_elf_map, meaning the underlying types are unknown from the map description. One way to overcome this is to add a annotation such that the loader will recognize the relation to both. BPF_ANNOTATE_KV_PAIR(map_foo, struct key, struct val); has been added to the API that programs can use. The loader will then pick the corresponding key/value type ids and attach it to the maps for creation. This can later on be dumped via bpftool for introspection. Example with test_xdp_noinline.o from kernel selftests: [...] struct ctl_value { union { __u64 value; __u32 ifindex; __u8 mac[6]; }; }; struct bpf_map_def __attribute__ ((section("maps"), used)) ctl_array = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(__u32), .value_size = sizeof(struct ctl_value), .max_entries = 16, .map_flags = 0, }; BPF_ANNOTATE_KV_PAIR(ctl_array, __u32, struct ctl_value); [...] Above could also further be wrapped in a macro. Compiling through LLVM and converting to BTF: # llc --version LLVM (http://llvm.org/): LLVM version 7.0.0svn Optimized build. Default target: x86_64-unknown-linux-gnu Host CPU: skylake Registered Targets: bpf - BPF (host endian) bpfeb - BPF (big endian) bpfel - BPF (little endian) [...] # clang [...] -O2 -target bpf -g -emit-llvm -c test_xdp_noinline.c -o - | llc -march=bpf -mcpu=probe -mattr=dwarfris -filetype=obj -o test_xdp_noinline.o # pahole -J test_xdp_noinline.o Checking pahole dump of BPF object file: # file test_xdp_noinline.o test_xdp_noinline.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), with debug_info, not stripped # pahole test_xdp_noinline.o [...] struct ctl_value { union { __u64 value; /* 0 8 */ __u32 ifindex; /* 0 4 */ __u8 mac[0]; /* 0 0 */ }; /* 0 8 */ /* size: 8, cachelines: 1, members: 1 */ /* last cacheline: 8 bytes */ }; Now loading into kernel and dumping the map via bpftool: # ip -force link set dev lo xdp obj test_xdp_noinline.o sec xdp-test # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric/id:227 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever [...] # bpftool prog show id 227 227: xdp tag a85e060c275c5616 gpl loaded_at 2018-07-17T14:41:29+0000 uid 0 xlated 8152B not jited memlock 12288B map_ids 381,385,386,382,384,383 # bpftool map dump id 386 [{ "key": 0, "value": { "": { "value": 0, "ifindex": 0, "mac": [] } } },{ "key": 1, "value": { "": { "value": 0, "ifindex": 0, "mac": [] } } },{ [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>
2018-07-18 07:31:22 +08:00
#define BPF_ANNOTATE_KV_PAIR(name, type_key, type_val) \
struct ____btf_map_##name { \
type_key key; \
type_val value; \
}; \
struct ____btf_map_##name \
__attribute__ ((section(".maps." #name), used)) \
____btf_map_##name = { }
#endif /* __BPF_ELF__ */