Merge branch 'Improve XDP samples usability and output'

Kumar Kartikeya says:

====================

This set revamps XDP samples related to redirection to show better output and
implement missing features consolidating all their differences and giving them a
consistent look and feel, by implementing common features and command line
options.  Some of the TODO items like reporting redirect error numbers
(ENETDOWN, EINVAL, ENOSPC, etc.) have also been implemented.

Some of the features are:
* Received packet statistics
* xdp_redirect/xdp_redirect_map tracepoint statistics
* xdp_redirect_err/xdp_redirect_map_err tracepoint statistics (with support for
  showing exact errno)
* xdp_cpumap_enqueue/xdp_cpumap_kthread tracepoint statistics
* xdp_devmap_xmit tracepoint statistics
* xdp_exception tracepoint statistics
* Per ifindex pair devmap_xmit stats shown dynamically (for xdp_monitor) to
  decompose the total.
* Use of BPF skeleton and BPF static linking to share BPF programs.
* Use of vmlinux.h and tp_btf for raw_tracepoint support.
* Removal of redundant -N/--native-mode option (enforced by default now)
* ... and massive cleanups all over the place.

All tracepoints also use raw_tp now, and tracepoints like xdp_redirect
are only enabled when requested explicitly to capture successful redirection
statistics.

The set of programs converted as part of this series are:
 * xdp_redirect_cpu
 * xdp_redirect_map_multi
 * xdp_redirect_map
 * xdp_redirect
 * xdp_monitor

 Explanation of the output:

There is now a concise output mode by default that shows primarily four fields:
  rx/s        Number of packets received per second
  redir/s     Number of packets successfully redirected per second
  err,drop/s  Aggregated count of errors per second (including dropped packets)
  xmit/s      Number of packets transmitted on the output device per second

Some examples:
 ; sudo ./xdp_redirect_map veth0 veth1 -s
Redirecting from veth0 (ifindex 15; driver veth) to veth1 (ifindex 14; driver veth)
veth0->veth1                    0 rx/s                  0 redir/s		0 err,drop/s               0 xmit/s
veth0->veth1            9,998,660 rx/s          9,998,658 redir/s		0 err,drop/s       9,998,654 xmit/s
...

There is also a verbose mode, that can also be enabled by default using -v (--verbose).
The output mode can be switched dynamically at runtime using Ctrl + \ (SIGQUIT).

To make the concise output more useful, the errors that occur are expanded inline
(as if verbose mode was enabled) to let the user pin down the source of the
problem without having to clutter output (or possibly miss it) or always use verbose mode.

For instance, let's consider a case where the output device link state is set to
down while redirection is happening:

[...]
veth0->veth1           24,503,376 rx/s                  0 err,drop/s      24,503,372 xmit/s
veth0->veth1           25,044,775 rx/s                  0 err,drop/s      25,044,783 xmit/s
veth0->veth1           25,263,046 rx/s                  4 err,drop/s      25,263,028 xmit/s
  redirect_err                  4 error/s
    ENETDOWN                    4 error/s
[...]

The same holds for xdp_exception actions.

An example of how a complete xdp_redirect_map session would look:

 ; sudo ./xdp_redirect_map veth0 veth1
Redirecting from veth0 (ifindex 5; driver veth) to veth1 (ifindex 4; driver veth)
veth0->veth1            7,411,506 rx/s                  0 err,drop/s    7,411,470 xmit/s
veth0->veth1            8,931,770 rx/s                  0 err,drop/s    8,931,771 xmit/s
^\
veth0->veth1            8,787,295 rx/s                  0 err,drop/s    8,787,325 xmit/s
  receive total         8,787,295 pkt/s                 0 drop/s                0 error/s
    cpu:7               8,787,295 pkt/s                 0 drop/s                0 error/s
  redirect_err                  0 error/s
  xdp_exception                 0 hit/s
  xmit veth0->veth1     8,787,325 xmit/s                0 drop/s                0 drv_err/s          2.00 bulk-avg
     cpu:7              8,787,325 xmit/s                0 drop/s                0 drv_err/s          2.00 bulk-avg

veth0->veth1            8,842,610 rx/s                  0 err,drop/s    8,842,606 xmit/s
  receive total         8,842,610 pkt/s                 0 drop/s                0 error/s
    cpu:7               8,842,610 pkt/s                 0 drop/s                0 error/s
  redirect_err                  0 error/s
  xdp_exception                 0 hit/s
  xmit veth0->veth1     8,842,606 xmit/s                0 drop/s                0 drv_err/s          2.00 bulk-avg
     cpu:7              8,842,606 xmit/s                0 drop/s                0 drv_err/s          2.00 bulk-avg

^C
  Packets received    : 33,973,181
  Average packets/s   : 4,246,648
  Packets transmitted : 33,973,172
  Average transmit/s  : 4,246,647

The xdp_redirect tracepoint (for success stats) needs to be enabled explicitly
using --stats/-s. Documentation for entire output and options is provided when
user specifies --help/-h with a sample.

Changelog:
----------
v3 -> v4:
v3: https://lore.kernel.org/bpf/20210728165552.435050-1-memxor@gmail.com

 * Address all feedback from Daniel
  * Use READ_ONCE/WRITE_ONCE from linux/compiler.h (cannot directly include
    due to conflicts with vmlinux.h)
  * Fix MAX_CPUS hardcoding by switching to mmapable array maps, that are
    resized based on the value of libbpf_num_possible_cpus
  * s/ELEMENTS_OF/ARRAY_SIZE/g
  * Use tools/include/linux/hashtable.h
  * Coding style fixes
  * Remove hyperlinks for tracepoints
  * Split into smaller reviewable changes
 * Restore support for specifying custom xdp_redirect_cpu cpumap prog with some
   enhancements, including built-in programs for common actions (pass, drop,
   redirect). By default, cpumap prog is now disabled.
 * Misc bug fixes all over the place

  The printing stuff is a lot more basic without hyperlink support, hence it
  has not been exported into a more general facility.

v2 -> v3
v2: https://lore.kernel.org/bpf/20210721212833.701342-1-memxor@gmail.com

 * Address all feedback from Andrii
  * Replace usage of libbpf hashmap (internal API) with custom one
  * Rename ATOMIC_* macros to NO_TEAR_* to better reflect their use
  * Use size_t as a portable word sized data type
  * Set libbpf_set_strict_mode
  * Invert conditions in BPF programs to exit early and reduce nesting
  * Use canonical SEC("xdp") naming for all XDP BPF progams
 * Add missing help description for cpumap enqueue and kthread tracepoints
 * Move private struct declarations from xdp_sample_user.h to .c file
 * Improve help output for cpumap enqueue and cpumap kthread tracepoints
 * Fix a bug where keys array for BPF_MAP_LOOKUP_BATCH is overallocated
 * Fix some conditions for printing stats (earlier only checked pps, now pps,
   drop, err and print if any is greater than zero)
 * Fix alloc_stats_record to properly return and cleanup allocated memory on
   allocation failure instead of calling exit(3)
 * Bump bpf_map_lookup_batch count to 32 to reduce lookup time with multiple
   devices in map
 * Fix a bug where devmap_xmit_multi stats are not printed when previous record
   is missing (i.e. when the first time stats are printed), by simply using a
   dummy record that is zeroed out
 * Also print per-CPU counts for devmap_xmit_multi which we collect already
 * Change mac_map to be BPF_MAP_TYPE_HASH instead of array to prevent resizing
   to a large size when max_ifindex is high, in xdp_redirect_map_multi
 * Fix instance of strerror(errno) in sample_install_xdp to use saved errno
 * Provide a usage function from samples helper
 * Provide a fix where incorrect stats are shown for parallel sessions of
   xdp_redirect_* samples by introducing matching support for input device(s),
   output device(s) and cpumap map id for enqueue and kthread stats.
   Only xdp_monitor doesn't filter stats, all others do.

RFC (v1) -> v2
RFC (v1): https://lore.kernel.org/bpf/20210528235250.2635167-1-memxor@gmail.com

 * Address all feedback from Andrii
   * Use BPF static linking
   * Use vmlinux.h
   * Use BPF_PROG macro
   * Use global variables instead of maps
 * Use of tp_btf for raw_tracepoint progs
 * Switch to timerfd for polling
 * Use libbpf hashmap for maintaing device sets for per ifindex pair
   devmap_xmit stats
 * Fix Makefile to specify object dependencies properly
 * Use in-tree bpftool
 * ... misc fixes and cleanups all over the place
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit is contained in:
Alexei Starovoitov 2021-08-24 14:48:43 -07:00
commit 3bbc8ee7c3
22 changed files with 3445 additions and 2899 deletions

View File

@ -39,11 +39,6 @@ tprogs-y += lwt_len_hist
tprogs-y += xdp_tx_iptunnel
tprogs-y += test_map_in_map
tprogs-y += per_socket_stats_example
tprogs-y += xdp_redirect
tprogs-y += xdp_redirect_map
tprogs-y += xdp_redirect_map_multi
tprogs-y += xdp_redirect_cpu
tprogs-y += xdp_monitor
tprogs-y += xdp_rxq_info
tprogs-y += syscall_tp
tprogs-y += cpustat
@ -57,11 +52,18 @@ tprogs-y += xdp_sample_pkts
tprogs-y += ibumad
tprogs-y += hbm
tprogs-y += xdp_redirect_cpu
tprogs-y += xdp_redirect_map_multi
tprogs-y += xdp_redirect_map
tprogs-y += xdp_redirect
tprogs-y += xdp_monitor
# Libbpf dependencies
LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
CGROUP_HELPERS := ../../tools/testing/selftests/bpf/cgroup_helpers.o
TRACE_HELPERS := ../../tools/testing/selftests/bpf/trace_helpers.o
XDP_SAMPLE := xdp_sample_user.o
fds_example-objs := fds_example.o
sockex1-objs := sockex1_user.o
@ -98,11 +100,6 @@ lwt_len_hist-objs := lwt_len_hist_user.o
xdp_tx_iptunnel-objs := xdp_tx_iptunnel_user.o
test_map_in_map-objs := test_map_in_map_user.o
per_socket_stats_example-objs := cookie_uid_helper_example.o
xdp_redirect-objs := xdp_redirect_user.o
xdp_redirect_map-objs := xdp_redirect_map_user.o
xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o
xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o
xdp_monitor-objs := xdp_monitor_user.o
xdp_rxq_info-objs := xdp_rxq_info_user.o
syscall_tp-objs := syscall_tp_user.o
cpustat-objs := cpustat_user.o
@ -116,6 +113,12 @@ xdp_sample_pkts-objs := xdp_sample_pkts_user.o
ibumad-objs := ibumad_user.o
hbm-objs := hbm.o $(CGROUP_HELPERS)
xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o $(XDP_SAMPLE)
xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o $(XDP_SAMPLE)
xdp_redirect_map-objs := xdp_redirect_map_user.o $(XDP_SAMPLE)
xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE)
xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE)
# Tell kbuild to always build the programs
always-y := $(tprogs-y)
always-y += sockex1_kern.o
@ -160,11 +163,6 @@ always-y += tcp_clamp_kern.o
always-y += tcp_basertt_kern.o
always-y += tcp_tos_reflect_kern.o
always-y += tcp_dumpstats_kern.o
always-y += xdp_redirect_kern.o
always-y += xdp_redirect_map_kern.o
always-y += xdp_redirect_map_multi_kern.o
always-y += xdp_redirect_cpu_kern.o
always-y += xdp_monitor_kern.o
always-y += xdp_rxq_info_kern.o
always-y += xdp2skb_meta_kern.o
always-y += syscall_tp_kern.o
@ -276,6 +274,11 @@ $(LIBBPF): FORCE
$(MAKE) -C $(dir $@) RM='rm -rf' EXTRA_CFLAGS="$(TPROGS_CFLAGS)" \
LDFLAGS=$(TPROGS_LDFLAGS) srctree=$(BPF_SAMPLES_PATH)/../../ O=
BPFTOOLDIR := $(TOOLS_PATH)/bpf/bpftool
BPFTOOL := $(BPFTOOLDIR)/bpftool
$(BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile)
$(MAKE) -C $(BPFTOOLDIR) srctree=$(BPF_SAMPLES_PATH)/../../
$(obj)/syscall_nrs.h: $(obj)/syscall_nrs.s FORCE
$(call filechk,offsets,__SYSCALL_NRS_H__)
@ -306,6 +309,12 @@ verify_target_bpf: verify_cmds
$(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF)
$(src)/*.c: verify_target_bpf $(LIBBPF)
$(obj)/xdp_redirect_cpu_user.o: $(obj)/xdp_redirect_cpu.skel.h
$(obj)/xdp_redirect_map_multi_user.o: $(obj)/xdp_redirect_map_multi.skel.h
$(obj)/xdp_redirect_map_user.o: $(obj)/xdp_redirect_map.skel.h
$(obj)/xdp_redirect_user.o: $(obj)/xdp_redirect.skel.h
$(obj)/xdp_monitor_user.o: $(obj)/xdp_monitor.skel.h
$(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h
$(obj)/hbm_out_kern.o: $(src)/hbm.h $(src)/hbm_kern.h
$(obj)/hbm.o: $(src)/hbm.h
@ -313,6 +322,76 @@ $(obj)/hbm_edt_kern.o: $(src)/hbm.h $(src)/hbm_kern.h
-include $(BPF_SAMPLES_PATH)/Makefile.target
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \
$(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \
../../../../vmlinux \
/sys/kernel/btf/vmlinux \
/boot/vmlinux-$(shell uname -r)
VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
ifeq ($(VMLINUX_BTF),)
$(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)")
endif
$(obj)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
ifeq ($(VMLINUX_H),)
$(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
else
$(Q)cp "$(VMLINUX_H)" $@
endif
clean-files += vmlinux.h
# Get Clang's default includes on this system, as opposed to those seen by
# '-target bpf'. This fixes "missing" files on some architectures/distros,
# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc.
#
# Use '-idirafter': Don't interfere with include mechanics except where the
# build would have failed anyways.
define get_sys_includes
$(shell $(1) -v -E - </dev/null 2>&1 \
| sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \
$(shell $(1) -dM -E - </dev/null | grep '#define __riscv_xlen ' | sed 's/#define /-D/' | sed 's/ /=/')
endef
CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG))
$(obj)/xdp_redirect_cpu.bpf.o: $(obj)/xdp_sample.bpf.o
$(obj)/xdp_redirect_map_multi.bpf.o: $(obj)/xdp_sample.bpf.o
$(obj)/xdp_redirect_map.bpf.o: $(obj)/xdp_sample.bpf.o
$(obj)/xdp_redirect.bpf.o: $(obj)/xdp_sample.bpf.o
$(obj)/xdp_monitor.bpf.o: $(obj)/xdp_sample.bpf.o
$(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/xdp_sample_shared.h
@echo " CLANG-BPF " $@
$(Q)$(CLANG) -g -O2 -target bpf -D__TARGET_ARCH_$(SRCARCH) \
-Wno-compare-distinct-pointer-types -I$(srctree)/include \
-I$(srctree)/samples/bpf -I$(srctree)/tools/include \
-I$(srctree)/tools/lib $(CLANG_SYS_INCLUDES) \
-c $(filter %.bpf.c,$^) -o $@
LINKED_SKELS := xdp_redirect_cpu.skel.h xdp_redirect_map_multi.skel.h \
xdp_redirect_map.skel.h xdp_redirect.skel.h xdp_monitor.skel.h
clean-files += $(LINKED_SKELS)
xdp_redirect_cpu.skel.h-deps := xdp_redirect_cpu.bpf.o xdp_sample.bpf.o
xdp_redirect_map_multi.skel.h-deps := xdp_redirect_map_multi.bpf.o xdp_sample.bpf.o
xdp_redirect_map.skel.h-deps := xdp_redirect_map.bpf.o xdp_sample.bpf.o
xdp_redirect.skel.h-deps := xdp_redirect.bpf.o xdp_sample.bpf.o
xdp_monitor.skel.h-deps := xdp_monitor.bpf.o xdp_sample.bpf.o
LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.bpf.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps)))
BPF_SRCS_LINKED := $(notdir $(wildcard $(src)/*.bpf.c))
BPF_OBJS_LINKED := $(patsubst %.bpf.c,$(obj)/%.bpf.o, $(BPF_SRCS_LINKED))
BPF_SKELS_LINKED := $(addprefix $(obj)/,$(LINKED_SKELS))
$(BPF_SKELS_LINKED): $(BPF_OBJS_LINKED) $(BPFTOOL)
@echo " BPF GEN-OBJ " $(@:.skel.h=)
$(Q)$(BPFTOOL) gen object $(@:.skel.h=.lbpf.o) $(addprefix $(obj)/,$($(@F)-deps))
@echo " BPF GEN-SKEL" $(@:.skel.h=)
$(Q)$(BPFTOOL) gen skeleton $(@:.skel.h=.lbpf.o) name $(notdir $(@:.skel.h=)) > $@
# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
# But, there is no easy way to fix it, so just exclude it since it is
# useless for BPF samples.

View File

@ -73,3 +73,14 @@ quiet_cmd_tprog-cobjs = CC $@
cmd_tprog-cobjs = $(CC) $(tprogc_flags) -c -o $@ $<
$(tprog-cobjs): $(obj)/%.o: $(src)/%.c FORCE
$(call if_changed_dep,tprog-cobjs)
# Override includes for xdp_sample_user.o because $(srctree)/usr/include in
# TPROGS_CFLAGS causes conflicts
XDP_SAMPLE_CFLAGS += -Wall -O2 -lm \
-I./tools/include \
-I./tools/include/uapi \
-I./tools/lib \
-I./tools/testing/selftests/bpf
$(obj)/xdp_sample_user.o: $(src)/xdp_sample_user.c \
$(src)/xdp_sample_user.h $(src)/xdp_sample_shared.h
$(CC) $(XDP_SAMPLE_CFLAGS) -c -o $@ $<

View File

@ -167,7 +167,7 @@ static void prog_load(void)
static void prog_attach_iptables(char *file)
{
int ret;
char rules[100];
char rules[256];
if (bpf_obj_pin(prog_fd, file))
error(1, errno, "bpf_obj_pin");
@ -175,8 +175,13 @@ static void prog_attach_iptables(char *file)
printf("file path too long: %s\n", file);
exit(1);
}
sprintf(rules, "iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT",
file);
ret = snprintf(rules, sizeof(rules),
"iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT",
file);
if (ret < 0 || ret >= sizeof(rules)) {
printf("error constructing iptables command\n");
exit(1);
}
ret = system(rules);
if (ret < 0) {
printf("iptables rule update failed: %d/n", WEXITSTATUS(ret));

View File

@ -32,7 +32,7 @@ static void print_old_objects(int fd)
__u64 key, next_key;
struct pair v;
key = write(1, "\e[1;1H\e[2J", 12); /* clear screen */
key = write(1, "\e[1;1H\e[2J", 11); /* clear screen */
key = -1;
while (bpf_map_get_next_key(fd, &key, &next_key) == 0) {

View File

@ -0,0 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright(c) 2017-2018 Jesper Dangaard Brouer, Red Hat Inc.
*
* XDP monitor tool, based on tracepoints
*/
#include "xdp_sample.bpf.h"
char _license[] SEC("license") = "GPL";

View File

@ -1,257 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0
* Copyright(c) 2017-2018 Jesper Dangaard Brouer, Red Hat Inc.
*
* XDP monitor tool, based on tracepoints
*/
#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, u64);
__uint(max_entries, 2);
/* TODO: have entries for all possible errno's */
} redirect_err_cnt SEC(".maps");
#define XDP_UNKNOWN XDP_REDIRECT + 1
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, u64);
__uint(max_entries, XDP_UNKNOWN + 1);
} exception_cnt SEC(".maps");
/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_redirect/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct xdp_redirect_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int prog_id; // offset:8; size:4; signed:1;
u32 act; // offset:12 size:4; signed:0;
int ifindex; // offset:16 size:4; signed:1;
int err; // offset:20 size:4; signed:1;
int to_ifindex; // offset:24 size:4; signed:1;
u32 map_id; // offset:28 size:4; signed:0;
int map_index; // offset:32 size:4; signed:1;
}; // offset:36
enum {
XDP_REDIRECT_SUCCESS = 0,
XDP_REDIRECT_ERROR = 1
};
static __always_inline
int xdp_redirect_collect_stat(struct xdp_redirect_ctx *ctx)
{
u32 key = XDP_REDIRECT_ERROR;
int err = ctx->err;
u64 *cnt;
if (!err)
key = XDP_REDIRECT_SUCCESS;
cnt = bpf_map_lookup_elem(&redirect_err_cnt, &key);
if (!cnt)
return 1;
*cnt += 1;
return 0; /* Indicate event was filtered (no further processing)*/
/*
* Returning 1 here would allow e.g. a perf-record tracepoint
* to see and record these events, but it doesn't work well
* in-practice as stopping perf-record also unload this
* bpf_prog. Plus, there is additional overhead of doing so.
*/
}
SEC("tracepoint/xdp/xdp_redirect_err")
int trace_xdp_redirect_err(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
SEC("tracepoint/xdp/xdp_redirect_map_err")
int trace_xdp_redirect_map_err(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
/* Likely unloaded when prog starts */
SEC("tracepoint/xdp/xdp_redirect")
int trace_xdp_redirect(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
/* Likely unloaded when prog starts */
SEC("tracepoint/xdp/xdp_redirect_map")
int trace_xdp_redirect_map(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_exception/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct xdp_exception_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int prog_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int ifindex; // offset:16; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_exception")
int trace_xdp_exception(struct xdp_exception_ctx *ctx)
{
u64 *cnt;
u32 key;
key = ctx->act;
if (key > XDP_REDIRECT)
key = XDP_UNKNOWN;
cnt = bpf_map_lookup_elem(&exception_cnt, &key);
if (!cnt)
return 1;
*cnt += 1;
return 0;
}
/* Common stats data record shared with _user.c */
struct datarec {
u64 processed;
u64 dropped;
u64 info;
u64 err;
};
#define MAX_CPUS 64
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, MAX_CPUS);
} cpumap_enqueue_cnt SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, 1);
} cpumap_kthread_cnt SEC(".maps");
/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_enqueue/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct cpumap_enqueue_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int map_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int cpu; // offset:16; size:4; signed:1;
unsigned int drops; // offset:20; size:4; signed:0;
unsigned int processed; // offset:24; size:4; signed:0;
int to_cpu; // offset:28; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_cpumap_enqueue")
int trace_xdp_cpumap_enqueue(struct cpumap_enqueue_ctx *ctx)
{
u32 to_cpu = ctx->to_cpu;
struct datarec *rec;
if (to_cpu >= MAX_CPUS)
return 1;
rec = bpf_map_lookup_elem(&cpumap_enqueue_cnt, &to_cpu);
if (!rec)
return 0;
rec->processed += ctx->processed;
rec->dropped += ctx->drops;
/* Record bulk events, then userspace can calc average bulk size */
if (ctx->processed > 0)
rec->info += 1;
return 0;
}
/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_kthread/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct cpumap_kthread_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int map_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int cpu; // offset:16; size:4; signed:1;
unsigned int drops; // offset:20; size:4; signed:0;
unsigned int processed; // offset:24; size:4; signed:0;
int sched; // offset:28; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_cpumap_kthread")
int trace_xdp_cpumap_kthread(struct cpumap_kthread_ctx *ctx)
{
struct datarec *rec;
u32 key = 0;
rec = bpf_map_lookup_elem(&cpumap_kthread_cnt, &key);
if (!rec)
return 0;
rec->processed += ctx->processed;
rec->dropped += ctx->drops;
/* Count times kthread yielded CPU via schedule call */
if (ctx->sched)
rec->info++;
return 0;
}
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, 1);
} devmap_xmit_cnt SEC(".maps");
/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_devmap_xmit/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct devmap_xmit_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int from_ifindex; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int to_ifindex; // offset:16; size:4; signed:1;
int drops; // offset:20; size:4; signed:1;
int sent; // offset:24; size:4; signed:1;
int err; // offset:28; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_devmap_xmit")
int trace_xdp_devmap_xmit(struct devmap_xmit_ctx *ctx)
{
struct datarec *rec;
u32 key = 0;
rec = bpf_map_lookup_elem(&devmap_xmit_cnt, &key);
if (!rec)
return 0;
rec->processed += ctx->sent;
rec->dropped += ctx->drops;
/* Record bulk events, then userspace can calc average bulk size */
rec->info += 1;
/* Record error cases, where no frame were sent */
if (ctx->err)
rec->err++;
/* Catch API error of drv ndo_xdp_xmit sent more than count */
if (ctx->drops < 0)
rec->err++;
return 1;
}

View File

@ -1,15 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0
* Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc.
*/
// SPDX-License-Identifier: GPL-2.0
/* Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. */
static const char *__doc__=
"XDP monitor tool, based on tracepoints\n"
;
"XDP monitor tool, based on tracepoints\n";
static const char *__doc_err_only__=
" NOTICE: Only tracking XDP redirect errors\n"
" Enable TX success stats via '--stats'\n"
" (which comes with a per packet processing overhead)\n"
;
" NOTICE: Only tracking XDP redirect errors\n"
" Enable redirect success stats via '-s/--stats'\n"
" (which comes with a per packet processing overhead)\n";
#include <errno.h>
#include <stdio.h>
@ -20,768 +17,103 @@ static const char *__doc_err_only__=
#include <ctype.h>
#include <unistd.h>
#include <locale.h>
#include <sys/resource.h>
#include <getopt.h>
#include <net/if.h>
#include <time.h>
#include <signal.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "bpf_util.h"
#include "xdp_sample_user.h"
#include "xdp_monitor.skel.h"
enum map_type {
REDIRECT_ERR_CNT,
EXCEPTION_CNT,
CPUMAP_ENQUEUE_CNT,
CPUMAP_KTHREAD_CNT,
DEVMAP_XMIT_CNT,
};
static int mask = SAMPLE_REDIRECT_ERR_CNT | SAMPLE_CPUMAP_ENQUEUE_CNT |
SAMPLE_CPUMAP_KTHREAD_CNT | SAMPLE_EXCEPTION_CNT |
SAMPLE_DEVMAP_XMIT_CNT | SAMPLE_DEVMAP_XMIT_CNT_MULTI;
static const char *const map_type_strings[] = {
[REDIRECT_ERR_CNT] = "redirect_err_cnt",
[EXCEPTION_CNT] = "exception_cnt",
[CPUMAP_ENQUEUE_CNT] = "cpumap_enqueue_cnt",
[CPUMAP_KTHREAD_CNT] = "cpumap_kthread_cnt",
[DEVMAP_XMIT_CNT] = "devmap_xmit_cnt",
};
#define NUM_MAP 5
#define NUM_TP 8
static int tp_cnt;
static int map_cnt;
static int verbose = 1;
static bool debug = false;
struct bpf_map *map_data[NUM_MAP] = {};
struct bpf_link *tp_links[NUM_TP] = {};
struct bpf_object *obj;
DEFINE_SAMPLE_INIT(xdp_monitor);
static const struct option long_options[] = {
{"help", no_argument, NULL, 'h' },
{"debug", no_argument, NULL, 'D' },
{"stats", no_argument, NULL, 'S' },
{"sec", required_argument, NULL, 's' },
{0, 0, NULL, 0 }
{ "help", no_argument, NULL, 'h' },
{ "stats", no_argument, NULL, 's' },
{ "interval", required_argument, NULL, 'i' },
{ "verbose", no_argument, NULL, 'v' },
{}
};
static void int_exit(int sig)
{
/* Detach tracepoints */
while (tp_cnt)
bpf_link__destroy(tp_links[--tp_cnt]);
bpf_object__close(obj);
exit(0);
}
/* C standard specifies two constants, EXIT_SUCCESS(0) and EXIT_FAILURE(1) */
#define EXIT_FAIL_MEM 5
static void usage(char *argv[])
{
int i;
printf("\nDOCUMENTATION:\n%s\n", __doc__);
printf("\n");
printf(" Usage: %s (options-see-below)\n",
argv[0]);
printf(" Listing options:\n");
for (i = 0; long_options[i].name != 0; i++) {
printf(" --%-15s", long_options[i].name);
if (long_options[i].flag != NULL)
printf(" flag (internal value:%d)",
*long_options[i].flag);
else
printf("short-option: -%c",
long_options[i].val);
printf("\n");
}
printf("\n");
}
#define NANOSEC_PER_SEC 1000000000 /* 10^9 */
static __u64 gettime(void)
{
struct timespec t;
int res;
res = clock_gettime(CLOCK_MONOTONIC, &t);
if (res < 0) {
fprintf(stderr, "Error with gettimeofday! (%i)\n", res);
exit(EXIT_FAILURE);
}
return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec;
}
enum {
REDIR_SUCCESS = 0,
REDIR_ERROR = 1,
};
#define REDIR_RES_MAX 2
static const char *redir_names[REDIR_RES_MAX] = {
[REDIR_SUCCESS] = "Success",
[REDIR_ERROR] = "Error",
};
static const char *err2str(int err)
{
if (err < REDIR_RES_MAX)
return redir_names[err];
return NULL;
}
/* enum xdp_action */
#define XDP_UNKNOWN XDP_REDIRECT + 1
#define XDP_ACTION_MAX (XDP_UNKNOWN + 1)
static const char *xdp_action_names[XDP_ACTION_MAX] = {
[XDP_ABORTED] = "XDP_ABORTED",
[XDP_DROP] = "XDP_DROP",
[XDP_PASS] = "XDP_PASS",
[XDP_TX] = "XDP_TX",
[XDP_REDIRECT] = "XDP_REDIRECT",
[XDP_UNKNOWN] = "XDP_UNKNOWN",
};
static const char *action2str(int action)
{
if (action < XDP_ACTION_MAX)
return xdp_action_names[action];
return NULL;
}
/* Common stats data record shared with _kern.c */
struct datarec {
__u64 processed;
__u64 dropped;
__u64 info;
__u64 err;
};
#define MAX_CPUS 64
/* Userspace structs for collection of stats from maps */
struct record {
__u64 timestamp;
struct datarec total;
struct datarec *cpu;
};
struct u64rec {
__u64 processed;
};
struct record_u64 {
/* record for _kern side __u64 values */
__u64 timestamp;
struct u64rec total;
struct u64rec *cpu;
};
struct stats_record {
struct record_u64 xdp_redirect[REDIR_RES_MAX];
struct record_u64 xdp_exception[XDP_ACTION_MAX];
struct record xdp_cpumap_kthread;
struct record xdp_cpumap_enqueue[MAX_CPUS];
struct record xdp_devmap_xmit;
};
static bool map_collect_record(int fd, __u32 key, struct record *rec)
{
/* For percpu maps, userspace gets a value per possible CPU */
unsigned int nr_cpus = bpf_num_possible_cpus();
struct datarec values[nr_cpus];
__u64 sum_processed = 0;
__u64 sum_dropped = 0;
__u64 sum_info = 0;
__u64 sum_err = 0;
int i;
if ((bpf_map_lookup_elem(fd, &key, values)) != 0) {
fprintf(stderr,
"ERR: bpf_map_lookup_elem failed key:0x%X\n", key);
return false;
}
/* Get time as close as possible to reading map contents */
rec->timestamp = gettime();
/* Record and sum values from each CPU */
for (i = 0; i < nr_cpus; i++) {
rec->cpu[i].processed = values[i].processed;
sum_processed += values[i].processed;
rec->cpu[i].dropped = values[i].dropped;
sum_dropped += values[i].dropped;
rec->cpu[i].info = values[i].info;
sum_info += values[i].info;
rec->cpu[i].err = values[i].err;
sum_err += values[i].err;
}
rec->total.processed = sum_processed;
rec->total.dropped = sum_dropped;
rec->total.info = sum_info;
rec->total.err = sum_err;
return true;
}
static bool map_collect_record_u64(int fd, __u32 key, struct record_u64 *rec)
{
/* For percpu maps, userspace gets a value per possible CPU */
unsigned int nr_cpus = bpf_num_possible_cpus();
struct u64rec values[nr_cpus];
__u64 sum_total = 0;
int i;
if ((bpf_map_lookup_elem(fd, &key, values)) != 0) {
fprintf(stderr,
"ERR: bpf_map_lookup_elem failed key:0x%X\n", key);
return false;
}
/* Get time as close as possible to reading map contents */
rec->timestamp = gettime();
/* Record and sum values from each CPU */
for (i = 0; i < nr_cpus; i++) {
rec->cpu[i].processed = values[i].processed;
sum_total += values[i].processed;
}
rec->total.processed = sum_total;
return true;
}
static double calc_period(struct record *r, struct record *p)
{
double period_ = 0;
__u64 period = 0;
period = r->timestamp - p->timestamp;
if (period > 0)
period_ = ((double) period / NANOSEC_PER_SEC);
return period_;
}
static double calc_period_u64(struct record_u64 *r, struct record_u64 *p)
{
double period_ = 0;
__u64 period = 0;
period = r->timestamp - p->timestamp;
if (period > 0)
period_ = ((double) period / NANOSEC_PER_SEC);
return period_;
}
static double calc_pps(struct datarec *r, struct datarec *p, double period)
{
__u64 packets = 0;
double pps = 0;
if (period > 0) {
packets = r->processed - p->processed;
pps = packets / period;
}
return pps;
}
static double calc_pps_u64(struct u64rec *r, struct u64rec *p, double period)
{
__u64 packets = 0;
double pps = 0;
if (period > 0) {
packets = r->processed - p->processed;
pps = packets / period;
}
return pps;
}
static double calc_drop(struct datarec *r, struct datarec *p, double period)
{
__u64 packets = 0;
double pps = 0;
if (period > 0) {
packets = r->dropped - p->dropped;
pps = packets / period;
}
return pps;
}
static double calc_info(struct datarec *r, struct datarec *p, double period)
{
__u64 packets = 0;
double pps = 0;
if (period > 0) {
packets = r->info - p->info;
pps = packets / period;
}
return pps;
}
static double calc_err(struct datarec *r, struct datarec *p, double period)
{
__u64 packets = 0;
double pps = 0;
if (period > 0) {
packets = r->err - p->err;
pps = packets / period;
}
return pps;
}
static void stats_print(struct stats_record *stats_rec,
struct stats_record *stats_prev,
bool err_only)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
int rec_i = 0, i, to_cpu;
double t = 0, pps = 0;
/* Header */
printf("%-15s %-7s %-12s %-12s %-9s\n",
"XDP-event", "CPU:to", "pps", "drop-pps", "extra-info");
/* tracepoint: xdp:xdp_redirect_* */
if (err_only)
rec_i = REDIR_ERROR;
for (; rec_i < REDIR_RES_MAX; rec_i++) {
struct record_u64 *rec, *prev;
char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %s\n";
char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %s\n";
rec = &stats_rec->xdp_redirect[rec_i];
prev = &stats_prev->xdp_redirect[rec_i];
t = calc_period_u64(rec, prev);
for (i = 0; i < nr_cpus; i++) {
struct u64rec *r = &rec->cpu[i];
struct u64rec *p = &prev->cpu[i];
pps = calc_pps_u64(r, p, t);
if (pps > 0)
printf(fmt1, "XDP_REDIRECT", i,
rec_i ? 0.0: pps, rec_i ? pps : 0.0,
err2str(rec_i));
}
pps = calc_pps_u64(&rec->total, &prev->total, t);
printf(fmt2, "XDP_REDIRECT", "total",
rec_i ? 0.0: pps, rec_i ? pps : 0.0, err2str(rec_i));
}
/* tracepoint: xdp:xdp_exception */
for (rec_i = 0; rec_i < XDP_ACTION_MAX; rec_i++) {
struct record_u64 *rec, *prev;
char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %s\n";
char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %s\n";
rec = &stats_rec->xdp_exception[rec_i];
prev = &stats_prev->xdp_exception[rec_i];
t = calc_period_u64(rec, prev);
for (i = 0; i < nr_cpus; i++) {
struct u64rec *r = &rec->cpu[i];
struct u64rec *p = &prev->cpu[i];
pps = calc_pps_u64(r, p, t);
if (pps > 0)
printf(fmt1, "Exception", i,
0.0, pps, action2str(rec_i));
}
pps = calc_pps_u64(&rec->total, &prev->total, t);
if (pps > 0)
printf(fmt2, "Exception", "total",
0.0, pps, action2str(rec_i));
}
/* cpumap enqueue stats */
for (to_cpu = 0; to_cpu < MAX_CPUS; to_cpu++) {
char *fmt1 = "%-15s %3d:%-3d %'-12.0f %'-12.0f %'-10.2f %s\n";
char *fmt2 = "%-15s %3s:%-3d %'-12.0f %'-12.0f %'-10.2f %s\n";
struct record *rec, *prev;
char *info_str = "";
double drop, info;
rec = &stats_rec->xdp_cpumap_enqueue[to_cpu];
prev = &stats_prev->xdp_cpumap_enqueue[to_cpu];
t = calc_period(rec, prev);
for (i = 0; i < nr_cpus; i++) {
struct datarec *r = &rec->cpu[i];
struct datarec *p = &prev->cpu[i];
pps = calc_pps(r, p, t);
drop = calc_drop(r, p, t);
info = calc_info(r, p, t);
if (info > 0) {
info_str = "bulk-average";
info = pps / info; /* calc average bulk size */
}
if (pps > 0)
printf(fmt1, "cpumap-enqueue",
i, to_cpu, pps, drop, info, info_str);
}
pps = calc_pps(&rec->total, &prev->total, t);
if (pps > 0) {
drop = calc_drop(&rec->total, &prev->total, t);
info = calc_info(&rec->total, &prev->total, t);
if (info > 0) {
info_str = "bulk-average";
info = pps / info; /* calc average bulk size */
}
printf(fmt2, "cpumap-enqueue",
"sum", to_cpu, pps, drop, info, info_str);
}
}
/* cpumap kthread stats */
{
char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.0f %s\n";
char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.0f %s\n";
struct record *rec, *prev;
double drop, info;
char *i_str = "";
rec = &stats_rec->xdp_cpumap_kthread;
prev = &stats_prev->xdp_cpumap_kthread;
t = calc_period(rec, prev);
for (i = 0; i < nr_cpus; i++) {
struct datarec *r = &rec->cpu[i];
struct datarec *p = &prev->cpu[i];
pps = calc_pps(r, p, t);
drop = calc_drop(r, p, t);
info = calc_info(r, p, t);
if (info > 0)
i_str = "sched";
if (pps > 0 || drop > 0)
printf(fmt1, "cpumap-kthread",
i, pps, drop, info, i_str);
}
pps = calc_pps(&rec->total, &prev->total, t);
drop = calc_drop(&rec->total, &prev->total, t);
info = calc_info(&rec->total, &prev->total, t);
if (info > 0)
i_str = "sched-sum";
printf(fmt2, "cpumap-kthread", "total", pps, drop, info, i_str);
}
/* devmap ndo_xdp_xmit stats */
{
char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.2f %s %s\n";
char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.2f %s %s\n";
struct record *rec, *prev;
double drop, info, err;
char *i_str = "";
char *err_str = "";
rec = &stats_rec->xdp_devmap_xmit;
prev = &stats_prev->xdp_devmap_xmit;
t = calc_period(rec, prev);
for (i = 0; i < nr_cpus; i++) {
struct datarec *r = &rec->cpu[i];
struct datarec *p = &prev->cpu[i];
pps = calc_pps(r, p, t);
drop = calc_drop(r, p, t);
info = calc_info(r, p, t);
err = calc_err(r, p, t);
if (info > 0) {
i_str = "bulk-average";
info = (pps+drop) / info; /* calc avg bulk */
}
if (err > 0)
err_str = "drv-err";
if (pps > 0 || drop > 0)
printf(fmt1, "devmap-xmit",
i, pps, drop, info, i_str, err_str);
}
pps = calc_pps(&rec->total, &prev->total, t);
drop = calc_drop(&rec->total, &prev->total, t);
info = calc_info(&rec->total, &prev->total, t);
err = calc_err(&rec->total, &prev->total, t);
if (info > 0) {
i_str = "bulk-average";
info = (pps+drop) / info; /* calc avg bulk */
}
if (err > 0)
err_str = "drv-err";
printf(fmt2, "devmap-xmit", "total", pps, drop,
info, i_str, err_str);
}
printf("\n");
}
static bool stats_collect(struct stats_record *rec)
{
int fd;
int i;
/* TODO: Detect if someone unloaded the perf event_fd's, as
* this can happen by someone running perf-record -e
*/
fd = bpf_map__fd(map_data[REDIRECT_ERR_CNT]);
for (i = 0; i < REDIR_RES_MAX; i++)
map_collect_record_u64(fd, i, &rec->xdp_redirect[i]);
fd = bpf_map__fd(map_data[EXCEPTION_CNT]);
for (i = 0; i < XDP_ACTION_MAX; i++) {
map_collect_record_u64(fd, i, &rec->xdp_exception[i]);
}
fd = bpf_map__fd(map_data[CPUMAP_ENQUEUE_CNT]);
for (i = 0; i < MAX_CPUS; i++)
map_collect_record(fd, i, &rec->xdp_cpumap_enqueue[i]);
fd = bpf_map__fd(map_data[CPUMAP_KTHREAD_CNT]);
map_collect_record(fd, 0, &rec->xdp_cpumap_kthread);
fd = bpf_map__fd(map_data[DEVMAP_XMIT_CNT]);
map_collect_record(fd, 0, &rec->xdp_devmap_xmit);
return true;
}
static void *alloc_rec_per_cpu(int record_size)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
void *array;
array = calloc(nr_cpus, record_size);
if (!array) {
fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus);
exit(EXIT_FAIL_MEM);
}
return array;
}
static struct stats_record *alloc_stats_record(void)
{
struct stats_record *rec;
int rec_sz;
int i;
/* Alloc main stats_record structure */
rec = calloc(1, sizeof(*rec));
if (!rec) {
fprintf(stderr, "Mem alloc error\n");
exit(EXIT_FAIL_MEM);
}
/* Alloc stats stored per CPU for each record */
rec_sz = sizeof(struct u64rec);
for (i = 0; i < REDIR_RES_MAX; i++)
rec->xdp_redirect[i].cpu = alloc_rec_per_cpu(rec_sz);
for (i = 0; i < XDP_ACTION_MAX; i++)
rec->xdp_exception[i].cpu = alloc_rec_per_cpu(rec_sz);
rec_sz = sizeof(struct datarec);
rec->xdp_cpumap_kthread.cpu = alloc_rec_per_cpu(rec_sz);
rec->xdp_devmap_xmit.cpu = alloc_rec_per_cpu(rec_sz);
for (i = 0; i < MAX_CPUS; i++)
rec->xdp_cpumap_enqueue[i].cpu = alloc_rec_per_cpu(rec_sz);
return rec;
}
static void free_stats_record(struct stats_record *r)
{
int i;
for (i = 0; i < REDIR_RES_MAX; i++)
free(r->xdp_redirect[i].cpu);
for (i = 0; i < XDP_ACTION_MAX; i++)
free(r->xdp_exception[i].cpu);
free(r->xdp_cpumap_kthread.cpu);
free(r->xdp_devmap_xmit.cpu);
for (i = 0; i < MAX_CPUS; i++)
free(r->xdp_cpumap_enqueue[i].cpu);
free(r);
}
/* Pointer swap trick */
static inline void swap(struct stats_record **a, struct stats_record **b)
{
struct stats_record *tmp;
tmp = *a;
*a = *b;
*b = tmp;
}
static void stats_poll(int interval, bool err_only)
{
struct stats_record *rec, *prev;
rec = alloc_stats_record();
prev = alloc_stats_record();
stats_collect(rec);
if (err_only)
printf("\n%s\n", __doc_err_only__);
/* Trick to pretty printf with thousands separators use %' */
setlocale(LC_NUMERIC, "en_US");
/* Header */
if (verbose)
printf("\n%s", __doc__);
/* TODO Need more advanced stats on error types */
if (verbose) {
printf(" - Stats map0: %s\n", bpf_map__name(map_data[0]));
printf(" - Stats map1: %s\n", bpf_map__name(map_data[1]));
printf("\n");
}
fflush(stdout);
while (1) {
swap(&prev, &rec);
stats_collect(rec);
stats_print(rec, prev, err_only);
fflush(stdout);
sleep(interval);
}
free_stats_record(rec);
free_stats_record(prev);
}
static void print_bpf_prog_info(void)
{
struct bpf_program *prog;
struct bpf_map *map;
int i = 0;
/* Prog info */
printf("Loaded BPF prog have %d bpf program(s)\n", tp_cnt);
bpf_object__for_each_program(prog, obj) {
printf(" - prog_fd[%d] = fd(%d)\n", i, bpf_program__fd(prog));
i++;
}
i = 0;
/* Maps info */
printf("Loaded BPF prog have %d map(s)\n", map_cnt);
bpf_object__for_each_map(map, obj) {
const char *name = bpf_map__name(map);
int fd = bpf_map__fd(map);
printf(" - map_data[%d] = fd(%d) name:%s\n", i, fd, name);
i++;
}
/* Event info */
printf("Searching for (max:%d) event file descriptor(s)\n", tp_cnt);
for (i = 0; i < tp_cnt; i++) {
int fd = bpf_link__fd(tp_links[i]);
if (fd != -1)
printf(" - event_fd[%d] = fd(%d)\n", i, fd);
}
}
int main(int argc, char **argv)
{
struct bpf_program *prog;
int longindex = 0, opt;
int ret = EXIT_FAILURE;
enum map_type type;
char filename[256];
/* Default settings: */
unsigned long interval = 2;
int ret = EXIT_FAIL_OPTION;
struct xdp_monitor *skel;
bool errors_only = true;
int interval = 2;
int longindex = 0, opt;
bool error = true;
/* Parse commands line args */
while ((opt = getopt_long(argc, argv, "hDSs:",
while ((opt = getopt_long(argc, argv, "si:vh",
long_options, &longindex)) != -1) {
switch (opt) {
case 'D':
debug = true;
break;
case 'S':
errors_only = false;
break;
case 's':
interval = atoi(optarg);
errors_only = false;
mask |= SAMPLE_REDIRECT_CNT;
break;
case 'i':
interval = strtoul(optarg, NULL, 0);
break;
case 'v':
sample_switch_mode();
break;
case 'h':
error = false;
default:
usage(argv);
sample_usage(argv, long_options, __doc__, mask, error);
return ret;
}
}
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
/* Remove tracepoint program when program is interrupted or killed */
signal(SIGINT, int_exit);
signal(SIGTERM, int_exit);
obj = bpf_object__open_file(filename, NULL);
if (libbpf_get_error(obj)) {
printf("ERROR: opening BPF object file failed\n");
obj = NULL;
goto cleanup;
skel = xdp_monitor__open();
if (!skel) {
fprintf(stderr, "Failed to xdp_monitor__open: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end;
}
/* load BPF program */
if (bpf_object__load(obj)) {
printf("ERROR: loading BPF object file failed\n");
goto cleanup;
ret = sample_init_pre_load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to sample_init_pre_load: %s\n", strerror(-ret));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
for (type = 0; type < NUM_MAP; type++) {
map_data[type] =
bpf_object__find_map_by_name(obj, map_type_strings[type]);
if (libbpf_get_error(map_data[type])) {
printf("ERROR: finding a map in obj file failed\n");
goto cleanup;
}
map_cnt++;
ret = xdp_monitor__load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to xdp_monitor__load: %s\n", strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
bpf_object__for_each_program(prog, obj) {
tp_links[tp_cnt] = bpf_program__attach(prog);
if (libbpf_get_error(tp_links[tp_cnt])) {
printf("ERROR: bpf_program__attach failed\n");
tp_links[tp_cnt] = NULL;
goto cleanup;
}
tp_cnt++;
ret = sample_init(skel, mask);
if (ret < 0) {
fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
if (debug) {
print_bpf_prog_info();
if (errors_only)
printf("%s", __doc_err_only__);
ret = sample_run(interval, NULL, NULL);
if (ret < 0) {
fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
/* Unload/stop tracepoint event by closing bpf_link's */
if (errors_only) {
/* The bpf_link[i] depend on the order of
* the functions was defined in _kern.c
*/
bpf_link__destroy(tp_links[2]); /* tracepoint/xdp/xdp_redirect */
tp_links[2] = NULL;
bpf_link__destroy(tp_links[3]); /* tracepoint/xdp/xdp_redirect_map */
tp_links[3] = NULL;
}
stats_poll(interval, errors_only);
ret = EXIT_SUCCESS;
cleanup:
/* Detach tracepoints */
while (tp_cnt)
bpf_link__destroy(tp_links[--tp_cnt]);
bpf_object__close(obj);
return ret;
ret = EXIT_OK;
end_destroy:
xdp_monitor__destroy(skel);
end:
sample_exit(ret);
}

View File

@ -0,0 +1,49 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2016 John Fastabend <john.r.fastabend@intel.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of version 2 of the GNU General Public
* License as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*/
#include "vmlinux.h"
#include "xdp_sample.bpf.h"
#include "xdp_sample_shared.h"
const volatile int ifindex_out;
SEC("xdp")
int xdp_redirect_prog(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
struct datarec *rec;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_PASS;
NO_TEAR_INC(rec->processed);
swap_src_dst_mac(data);
return bpf_redirect(ifindex_out, 0);
}
/* Redirect require an XDP bpf_prog loaded on the TX device */
SEC("xdp")
int xdp_redirect_dummy_prog(struct xdp_md *ctx)
{
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";

View File

@ -2,74 +2,18 @@
*
* GPLv2, Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc.
*/
#include <uapi/linux/if_ether.h>
#include <uapi/linux/if_packet.h>
#include <uapi/linux/if_vlan.h>
#include <uapi/linux/ip.h>
#include <uapi/linux/ipv6.h>
#include <uapi/linux/in.h>
#include <uapi/linux/tcp.h>
#include <uapi/linux/udp.h>
#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "vmlinux.h"
#include "xdp_sample.bpf.h"
#include "xdp_sample_shared.h"
#include "hash_func01.h"
#define MAX_CPUS NR_CPUS
/* Special map type that can XDP_REDIRECT frames to another CPU */
struct {
__uint(type, BPF_MAP_TYPE_CPUMAP);
__uint(key_size, sizeof(u32));
__uint(value_size, sizeof(struct bpf_cpumap_val));
__uint(max_entries, MAX_CPUS);
} cpu_map SEC(".maps");
/* Common stats data record to keep userspace more simple */
struct datarec {
__u64 processed;
__u64 dropped;
__u64 issue;
__u64 xdp_pass;
__u64 xdp_drop;
__u64 xdp_redirect;
};
/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success
* feedback. Redirect TX errors can be caught via a tracepoint.
*/
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, 1);
} rx_cnt SEC(".maps");
/* Used by trace point */
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, 2);
/* TODO: have entries for all possible errno's */
} redirect_err_cnt SEC(".maps");
/* Used by trace point */
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, MAX_CPUS);
} cpumap_enqueue_cnt SEC(".maps");
/* Used by trace point */
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(max_entries, 1);
} cpumap_kthread_cnt SEC(".maps");
/* Set of maps controlling available CPU, and for iterating through
* selectable redirect CPUs.
*/
@ -77,14 +21,15 @@ struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, u32);
__type(value, u32);
__uint(max_entries, MAX_CPUS);
} cpus_available SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, u32);
__type(value, u32);
__uint(max_entries, 1);
} cpus_count SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
@ -92,25 +37,17 @@ struct {
__uint(max_entries, 1);
} cpus_iterator SEC(".maps");
/* Used by trace point */
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, struct datarec);
__uint(type, BPF_MAP_TYPE_DEVMAP);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(struct bpf_devmap_val));
__uint(max_entries, 1);
} exception_cnt SEC(".maps");
} tx_port SEC(".maps");
char tx_mac_addr[ETH_ALEN];
/* Helper parse functions */
/* Parse Ethernet layer 2, extract network layer 3 offset and protocol
*
* Returns false on error and non-supported ether-type
*/
struct vlan_hdr {
__be16 h_vlan_TCI;
__be16 h_vlan_encapsulated_proto;
};
static __always_inline
bool parse_eth(struct ethhdr *eth, void *data_end,
u16 *eth_proto, u64 *l3_offset)
@ -125,11 +62,12 @@ bool parse_eth(struct ethhdr *eth, void *data_end,
eth_type = eth->h_proto;
/* Skip non 802.3 Ethertypes */
if (unlikely(ntohs(eth_type) < ETH_P_802_3_MIN))
if (__builtin_expect(bpf_ntohs(eth_type) < ETH_P_802_3_MIN, 0))
return false;
/* Handle VLAN tagged packet */
if (eth_type == htons(ETH_P_8021Q) || eth_type == htons(ETH_P_8021AD)) {
if (eth_type == bpf_htons(ETH_P_8021Q) ||
eth_type == bpf_htons(ETH_P_8021AD)) {
struct vlan_hdr *vlan_hdr;
vlan_hdr = (void *)eth + offset;
@ -139,7 +77,8 @@ bool parse_eth(struct ethhdr *eth, void *data_end,
eth_type = vlan_hdr->h_vlan_encapsulated_proto;
}
/* Handle double VLAN tagged packet */
if (eth_type == htons(ETH_P_8021Q) || eth_type == htons(ETH_P_8021AD)) {
if (eth_type == bpf_htons(ETH_P_8021Q) ||
eth_type == bpf_htons(ETH_P_8021AD)) {
struct vlan_hdr *vlan_hdr;
vlan_hdr = (void *)eth + offset;
@ -149,7 +88,7 @@ bool parse_eth(struct ethhdr *eth, void *data_end,
eth_type = vlan_hdr->h_vlan_encapsulated_proto;
}
*eth_proto = ntohs(eth_type);
*eth_proto = bpf_ntohs(eth_type);
*l3_offset = offset;
return true;
}
@ -172,7 +111,7 @@ u16 get_dest_port_ipv4_udp(struct xdp_md *ctx, u64 nh_off)
if (udph + 1 > data_end)
return 0;
dport = ntohs(udph->dest);
dport = bpf_ntohs(udph->dest);
return dport;
}
@ -200,50 +139,48 @@ int get_proto_ipv6(struct xdp_md *ctx, u64 nh_off)
return ip6h->nexthdr;
}
SEC("xdp_cpu_map0")
SEC("xdp")
int xdp_prognum0_no_touch(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct datarec *rec;
u32 *cpu_selected;
u32 cpu_dest;
u32 key = 0;
u32 cpu_dest = 0;
u32 key0 = 0;
/* Only use first entry in cpus_available */
cpu_selected = bpf_map_lookup_elem(&cpus_available, &key);
cpu_selected = bpf_map_lookup_elem(&cpus_available, &key0);
if (!cpu_selected)
return XDP_ABORTED;
cpu_dest = *cpu_selected;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
SEC("xdp_cpu_map1_touch_data")
SEC("xdp")
int xdp_prognum1_touch_data(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
struct datarec *rec;
u32 *cpu_selected;
u32 cpu_dest;
u32 cpu_dest = 0;
u32 key0 = 0;
u16 eth_type;
u32 key = 0;
/* Only use first entry in cpus_available */
cpu_selected = bpf_map_lookup_elem(&cpus_available, &key);
cpu_selected = bpf_map_lookup_elem(&cpus_available, &key0);
if (!cpu_selected)
return XDP_ABORTED;
cpu_dest = *cpu_selected;
@ -252,36 +189,33 @@ int xdp_prognum1_touch_data(struct xdp_md *ctx)
if (eth + 1 > data_end)
return XDP_ABORTED;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
/* Read packet data, and use it (drop non 802.3 Ethertypes) */
eth_type = eth->h_proto;
if (ntohs(eth_type) < ETH_P_802_3_MIN) {
rec->dropped++;
if (bpf_ntohs(eth_type) < ETH_P_802_3_MIN) {
NO_TEAR_INC(rec->dropped);
return XDP_DROP;
}
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
SEC("xdp_cpu_map2_round_robin")
SEC("xdp")
int xdp_prognum2_round_robin(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
u32 key = bpf_get_smp_processor_id();
struct datarec *rec;
u32 cpu_dest;
u32 *cpu_lookup;
u32 cpu_dest = 0;
u32 key0 = 0;
u32 *cpu_selected;
@ -307,40 +241,37 @@ int xdp_prognum2_round_robin(struct xdp_md *ctx)
return XDP_ABORTED;
cpu_dest = *cpu_selected;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key0);
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
SEC("xdp_cpu_map3_proto_separate")
SEC("xdp")
int xdp_prognum3_proto_separate(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
u8 ip_proto = IPPROTO_UDP;
struct datarec *rec;
u16 eth_proto = 0;
u64 l3_offset = 0;
u32 cpu_dest = 0;
u32 cpu_idx = 0;
u32 *cpu_lookup;
u32 key = 0;
u32 cpu_idx = 0;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
if (!(parse_eth(eth, data_end, &eth_proto, &l3_offset)))
return XDP_PASS; /* Just skip */
@ -381,35 +312,33 @@ int xdp_prognum3_proto_separate(struct xdp_md *ctx)
return XDP_ABORTED;
cpu_dest = *cpu_lookup;
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
SEC("xdp_cpu_map4_ddos_filter_pktgen")
SEC("xdp")
int xdp_prognum4_ddos_filter_pktgen(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
u8 ip_proto = IPPROTO_UDP;
struct datarec *rec;
u16 eth_proto = 0;
u64 l3_offset = 0;
u32 cpu_dest = 0;
u32 *cpu_lookup;
u32 cpu_idx = 0;
u16 dest_port;
u32 *cpu_lookup;
u32 key = 0;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
if (!(parse_eth(eth, data_end, &eth_proto, &l3_offset)))
return XDP_PASS; /* Just skip */
@ -443,8 +372,7 @@ int xdp_prognum4_ddos_filter_pktgen(struct xdp_md *ctx)
/* DDoS filter UDP port 9 (pktgen) */
dest_port = get_dest_port_ipv4_udp(ctx, l3_offset);
if (dest_port == 9) {
if (rec)
rec->dropped++;
NO_TEAR_INC(rec->dropped);
return XDP_DROP;
}
break;
@ -457,11 +385,10 @@ int xdp_prognum4_ddos_filter_pktgen(struct xdp_md *ctx)
return XDP_ABORTED;
cpu_dest = *cpu_lookup;
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
@ -496,10 +423,10 @@ u32 get_ipv6_hash_ip_pair(struct xdp_md *ctx, u64 nh_off)
if (ip6h + 1 > data_end)
return 0;
cpu_hash = ip6h->saddr.s6_addr32[0] + ip6h->daddr.s6_addr32[0];
cpu_hash += ip6h->saddr.s6_addr32[1] + ip6h->daddr.s6_addr32[1];
cpu_hash += ip6h->saddr.s6_addr32[2] + ip6h->daddr.s6_addr32[2];
cpu_hash += ip6h->saddr.s6_addr32[3] + ip6h->daddr.s6_addr32[3];
cpu_hash = ip6h->saddr.in6_u.u6_addr32[0] + ip6h->daddr.in6_u.u6_addr32[0];
cpu_hash += ip6h->saddr.in6_u.u6_addr32[1] + ip6h->daddr.in6_u.u6_addr32[1];
cpu_hash += ip6h->saddr.in6_u.u6_addr32[2] + ip6h->daddr.in6_u.u6_addr32[2];
cpu_hash += ip6h->saddr.in6_u.u6_addr32[3] + ip6h->daddr.in6_u.u6_addr32[3];
cpu_hash = SuperFastHash((char *)&cpu_hash, 4, INITVAL + ip6h->nexthdr);
return cpu_hash;
@ -509,30 +436,29 @@ u32 get_ipv6_hash_ip_pair(struct xdp_md *ctx, u64 nh_off)
* hashing scheme is symmetric, meaning swapping IP src/dest still hit
* same CPU.
*/
SEC("xdp_cpu_map5_lb_hash_ip_pairs")
SEC("xdp")
int xdp_prognum5_lb_hash_ip_pairs(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
u8 ip_proto = IPPROTO_UDP;
struct datarec *rec;
u16 eth_proto = 0;
u64 l3_offset = 0;
u32 cpu_dest = 0;
u32 cpu_idx = 0;
u32 *cpu_lookup;
u32 key0 = 0;
u32 *cpu_max;
u32 cpu_hash;
u32 key = 0;
/* Count RX packet in map */
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_ABORTED;
rec->processed++;
return XDP_PASS;
NO_TEAR_INC(rec->processed);
cpu_max = bpf_map_lookup_elem(&cpus_count, &key);
cpu_max = bpf_map_lookup_elem(&cpus_count, &key0);
if (!cpu_max)
return XDP_ABORTED;
@ -560,171 +486,56 @@ int xdp_prognum5_lb_hash_ip_pairs(struct xdp_md *ctx)
return XDP_ABORTED;
cpu_dest = *cpu_lookup;
if (cpu_dest >= MAX_CPUS) {
rec->issue++;
if (cpu_dest >= nr_cpus) {
NO_TEAR_INC(rec->issue);
return XDP_ABORTED;
}
return bpf_redirect_map(&cpu_map, cpu_dest, 0);
}
SEC("xdp_cpumap/redirect")
int xdp_redirect_cpu_devmap(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
swap_src_dst_mac(data);
return bpf_redirect_map(&tx_port, 0, 0);
}
SEC("xdp_cpumap/pass")
int xdp_redirect_cpu_pass(struct xdp_md *ctx)
{
return XDP_PASS;
}
SEC("xdp_cpumap/drop")
int xdp_redirect_cpu_drop(struct xdp_md *ctx)
{
return XDP_DROP;
}
SEC("xdp_devmap/egress")
int xdp_redirect_egress_prog(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
__builtin_memcpy(eth->h_source, (const char *)tx_mac_addr, ETH_ALEN);
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
/*** Trace point code ***/
/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_redirect/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct xdp_redirect_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int prog_id; // offset:8; size:4; signed:1;
u32 act; // offset:12 size:4; signed:0;
int ifindex; // offset:16 size:4; signed:1;
int err; // offset:20 size:4; signed:1;
int to_ifindex; // offset:24 size:4; signed:1;
u32 map_id; // offset:28 size:4; signed:0;
int map_index; // offset:32 size:4; signed:1;
}; // offset:36
enum {
XDP_REDIRECT_SUCCESS = 0,
XDP_REDIRECT_ERROR = 1
};
static __always_inline
int xdp_redirect_collect_stat(struct xdp_redirect_ctx *ctx)
{
u32 key = XDP_REDIRECT_ERROR;
struct datarec *rec;
int err = ctx->err;
if (!err)
key = XDP_REDIRECT_SUCCESS;
rec = bpf_map_lookup_elem(&redirect_err_cnt, &key);
if (!rec)
return 0;
rec->dropped += 1;
return 0; /* Indicate event was filtered (no further processing)*/
/*
* Returning 1 here would allow e.g. a perf-record tracepoint
* to see and record these events, but it doesn't work well
* in-practice as stopping perf-record also unload this
* bpf_prog. Plus, there is additional overhead of doing so.
*/
}
SEC("tracepoint/xdp/xdp_redirect_err")
int trace_xdp_redirect_err(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
SEC("tracepoint/xdp/xdp_redirect_map_err")
int trace_xdp_redirect_map_err(struct xdp_redirect_ctx *ctx)
{
return xdp_redirect_collect_stat(ctx);
}
/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_exception/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct xdp_exception_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int prog_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int ifindex; // offset:16; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_exception")
int trace_xdp_exception(struct xdp_exception_ctx *ctx)
{
struct datarec *rec;
u32 key = 0;
rec = bpf_map_lookup_elem(&exception_cnt, &key);
if (!rec)
return 1;
rec->dropped += 1;
return 0;
}
/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_enqueue/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct cpumap_enqueue_ctx {
u64 __pad; // First 8 bytes are not accessible by bpf code
int map_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int cpu; // offset:16; size:4; signed:1;
unsigned int drops; // offset:20; size:4; signed:0;
unsigned int processed; // offset:24; size:4; signed:0;
int to_cpu; // offset:28; size:4; signed:1;
};
SEC("tracepoint/xdp/xdp_cpumap_enqueue")
int trace_xdp_cpumap_enqueue(struct cpumap_enqueue_ctx *ctx)
{
u32 to_cpu = ctx->to_cpu;
struct datarec *rec;
if (to_cpu >= MAX_CPUS)
return 1;
rec = bpf_map_lookup_elem(&cpumap_enqueue_cnt, &to_cpu);
if (!rec)
return 0;
rec->processed += ctx->processed;
rec->dropped += ctx->drops;
/* Record bulk events, then userspace can calc average bulk size */
if (ctx->processed > 0)
rec->issue += 1;
/* Inception: It's possible to detect overload situations, via
* this tracepoint. This can be used for creating a feedback
* loop to XDP, which can take appropriate actions to mitigate
* this overload situation.
*/
return 0;
}
/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_kthread/format
* Code in: kernel/include/trace/events/xdp.h
*/
struct cpumap_kthread_ctx {
u64 __pad; // First 8 bytes are not accessible
int map_id; // offset:8; size:4; signed:1;
u32 act; // offset:12; size:4; signed:0;
int cpu; // offset:16; size:4; signed:1;
unsigned int drops; // offset:20; size:4; signed:0;
unsigned int processed; // offset:24; size:4; signed:0;
int sched; // offset:28; size:4; signed:1;
unsigned int xdp_pass; // offset:32; size:4; signed:0;
unsigned int xdp_drop; // offset:36; size:4; signed:0;
unsigned int xdp_redirect; // offset:40; size:4; signed:0;
};
SEC("tracepoint/xdp/xdp_cpumap_kthread")
int trace_xdp_cpumap_kthread(struct cpumap_kthread_ctx *ctx)
{
struct datarec *rec;
u32 key = 0;
rec = bpf_map_lookup_elem(&cpumap_kthread_cnt, &key);
if (!rec)
return 0;
rec->processed += ctx->processed;
rec->dropped += ctx->drops;
rec->xdp_pass += ctx->xdp_pass;
rec->xdp_drop += ctx->xdp_drop;
rec->xdp_redirect += ctx->xdp_redirect;
/* Count times kthread yielded CPU via schedule call */
if (ctx->sched)
rec->issue++;
return 0;
}

File diff suppressed because it is too large Load Diff

View File

@ -1,90 +0,0 @@
/* Copyright (c) 2016 John Fastabend <john.r.fastabend@intel.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of version 2 of the GNU General Public
* License as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*/
#define KBUILD_MODNAME "foo"
#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <linux/if_vlan.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, int);
__type(value, int);
__uint(max_entries, 1);
} tx_port SEC(".maps");
/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success
* feedback. Redirect TX errors can be caught via a tracepoint.
*/
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, long);
__uint(max_entries, 1);
} rxcnt SEC(".maps");
static void swap_src_dst_mac(void *data)
{
unsigned short *p = data;
unsigned short dst[3];
dst[0] = p[0];
dst[1] = p[1];
dst[2] = p[2];
p[0] = p[3];
p[1] = p[4];
p[2] = p[5];
p[3] = dst[0];
p[4] = dst[1];
p[5] = dst[2];
}
SEC("xdp_redirect")
int xdp_redirect_prog(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
int rc = XDP_DROP;
int *ifindex, port = 0;
long *value;
u32 key = 0;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return rc;
ifindex = bpf_map_lookup_elem(&tx_port, &port);
if (!ifindex)
return rc;
value = bpf_map_lookup_elem(&rxcnt, &key);
if (value)
*value += 1;
swap_src_dst_mac(data);
return bpf_redirect(*ifindex, 0);
}
/* Redirect require an XDP bpf_prog loaded on the TX device */
SEC("xdp_redirect_dummy")
int xdp_redirect_dummy_prog(struct xdp_md *ctx)
{
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";

View File

@ -10,14 +10,10 @@
* General Public License for more details.
*/
#define KBUILD_MODNAME "foo"
#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <linux/if_vlan.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <bpf/bpf_helpers.h>
#include "vmlinux.h"
#include "xdp_sample.bpf.h"
#include "xdp_sample_shared.h"
/* The 2nd xdp prog on egress does not support skb mode, so we define two
* maps, tx_port_general and tx_port_native.
@ -26,114 +22,71 @@ struct {
__uint(type, BPF_MAP_TYPE_DEVMAP);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(int));
__uint(max_entries, 100);
__uint(max_entries, 1);
} tx_port_general SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_DEVMAP);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(struct bpf_devmap_val));
__uint(max_entries, 100);
__uint(max_entries, 1);
} tx_port_native SEC(".maps");
/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success
* feedback. Redirect TX errors can be caught via a tracepoint.
*/
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, long);
__uint(max_entries, 1);
} rxcnt SEC(".maps");
/* map to store egress interface mac address */
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, u32);
__type(value, __be64);
__uint(max_entries, 1);
} tx_mac SEC(".maps");
static void swap_src_dst_mac(void *data)
{
unsigned short *p = data;
unsigned short dst[3];
dst[0] = p[0];
dst[1] = p[1];
dst[2] = p[2];
p[0] = p[3];
p[1] = p[4];
p[2] = p[5];
p[3] = dst[0];
p[4] = dst[1];
p[5] = dst[2];
}
/* store egress interface mac address */
const volatile char tx_mac_addr[ETH_ALEN];
static __always_inline int xdp_redirect_map(struct xdp_md *ctx, void *redirect_map)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
u32 key = bpf_get_smp_processor_id();
struct ethhdr *eth = data;
int rc = XDP_DROP;
long *value;
u32 key = 0;
u64 nh_off;
int vport;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return rc;
/* constant virtual port */
vport = 0;
/* count packet in global counter */
value = bpf_map_lookup_elem(&rxcnt, &key);
if (value)
*value += 1;
swap_src_dst_mac(data);
/* send packet out physical port */
return bpf_redirect_map(redirect_map, vport, 0);
}
SEC("xdp_redirect_general")
int xdp_redirect_map_general(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &tx_port_general);
}
SEC("xdp_redirect_native")
int xdp_redirect_map_native(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &tx_port_native);
}
SEC("xdp_devmap/map_prog")
int xdp_redirect_map_egress(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
__be64 *mac;
u32 key = 0;
struct datarec *rec;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
mac = bpf_map_lookup_elem(&tx_mac, &key);
if (mac)
__builtin_memcpy(eth->h_source, mac, ETH_ALEN);
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_PASS;
NO_TEAR_INC(rec->processed);
swap_src_dst_mac(data);
return bpf_redirect_map(redirect_map, 0, 0);
}
SEC("xdp")
int xdp_redirect_map_general(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &tx_port_general);
}
SEC("xdp")
int xdp_redirect_map_native(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &tx_port_native);
}
SEC("xdp_devmap/egress")
int xdp_redirect_map_egress(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
u64 nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
__builtin_memcpy(eth->h_source, (const char *)tx_mac_addr, ETH_ALEN);
return XDP_PASS;
}
/* Redirect require an XDP bpf_prog loaded on the TX device */
SEC("xdp_redirect_dummy")
SEC("xdp")
int xdp_redirect_dummy_prog(struct xdp_md *ctx)
{
return XDP_PASS;

View File

@ -1,11 +1,14 @@
// SPDX-License-Identifier: GPL-2.0
#define KBUILD_MODNAME "foo"
#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <bpf/bpf_helpers.h>
#include "vmlinux.h"
#include "xdp_sample.bpf.h"
#include "xdp_sample_shared.h"
enum {
BPF_F_BROADCAST = (1ULL << 3),
BPF_F_EXCLUDE_INGRESS = (1ULL << 4),
};
struct {
__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
@ -21,50 +24,41 @@ struct {
__uint(max_entries, 32);
} forward_map_native SEC(".maps");
/* map to store egress interfaces mac addresses */
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, u32);
__type(value, long);
__uint(max_entries, 1);
} rxcnt SEC(".maps");
/* map to store egress interfaces mac addresses, set the
* max_entries to 1 and extend it in user sapce prog.
*/
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, u32);
__type(value, __be64);
__uint(max_entries, 1);
__uint(max_entries, 32);
} mac_map SEC(".maps");
static int xdp_redirect_map(struct xdp_md *ctx, void *forward_map)
{
long *value;
u32 key = 0;
u32 key = bpf_get_smp_processor_id();
struct datarec *rec;
/* count packet in global counter */
value = bpf_map_lookup_elem(&rxcnt, &key);
if (value)
*value += 1;
rec = bpf_map_lookup_elem(&rx_cnt, &key);
if (!rec)
return XDP_PASS;
NO_TEAR_INC(rec->processed);
return bpf_redirect_map(forward_map, key,
return bpf_redirect_map(forward_map, 0,
BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
}
SEC("xdp_redirect_general")
SEC("xdp")
int xdp_redirect_map_general(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &forward_map_general);
}
SEC("xdp_redirect_native")
SEC("xdp")
int xdp_redirect_map_native(struct xdp_md *ctx)
{
return xdp_redirect_map(ctx, &forward_map_native);
}
SEC("xdp_devmap/map_prog")
SEC("xdp_devmap/egress")
int xdp_devmap_prog(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;

View File

@ -1,7 +1,12 @@
// SPDX-License-Identifier: GPL-2.0
static const char *__doc__ =
"XDP multi redirect tool, using BPF_MAP_TYPE_DEVMAP and BPF_F_BROADCAST flag for bpf_redirect_map\n"
"Usage: xdp_redirect_map_multi <IFINDEX|IFNAME> <IFINDEX|IFNAME> ... <IFINDEX|IFNAME>\n";
#include <linux/bpf.h>
#include <linux/if_link.h>
#include <assert.h>
#include <getopt.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
@ -15,106 +20,54 @@
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include "bpf_util.h"
#include <linux/if_ether.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "bpf_util.h"
#include "xdp_sample_user.h"
#include "xdp_redirect_map_multi.skel.h"
#define MAX_IFACE_NUM 32
static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
static int ifaces[MAX_IFACE_NUM] = {};
static int rxcnt_map_fd;
static void int_exit(int sig)
static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_MAP_CNT |
SAMPLE_EXCEPTION_CNT | SAMPLE_DEVMAP_XMIT_CNT |
SAMPLE_DEVMAP_XMIT_CNT_MULTI | SAMPLE_SKIP_HEADING;
DEFINE_SAMPLE_INIT(xdp_redirect_map_multi);
static const struct option long_options[] = {
{ "help", no_argument, NULL, 'h' },
{ "skb-mode", no_argument, NULL, 'S' },
{ "force", no_argument, NULL, 'F' },
{ "load-egress", no_argument, NULL, 'X' },
{ "stats", no_argument, NULL, 's' },
{ "interval", required_argument, NULL, 'i' },
{ "verbose", no_argument, NULL, 'v' },
{}
};
static int update_mac_map(struct bpf_map *map)
{
__u32 prog_id = 0;
int i;
for (i = 0; ifaces[i] > 0; i++) {
if (bpf_get_link_xdp_id(ifaces[i], &prog_id, xdp_flags)) {
printf("bpf_get_link_xdp_id failed\n");
exit(1);
}
if (prog_id)
bpf_set_link_xdp_fd(ifaces[i], -1, xdp_flags);
}
exit(0);
}
static void poll_stats(int interval)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
__u64 values[nr_cpus], prev[nr_cpus];
memset(prev, 0, sizeof(prev));
while (1) {
__u64 sum = 0;
__u32 key = 0;
int i;
sleep(interval);
assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0);
for (i = 0; i < nr_cpus; i++)
sum += (values[i] - prev[i]);
if (sum)
printf("Forwarding %10llu pkt/s\n", sum / interval);
memcpy(prev, values, sizeof(values));
}
}
static int get_mac_addr(unsigned int ifindex, void *mac_addr)
{
char ifname[IF_NAMESIZE];
struct ifreq ifr;
int fd, ret = -1;
fd = socket(AF_INET, SOCK_DGRAM, 0);
if (fd < 0)
return ret;
if (!if_indextoname(ifindex, ifname))
goto err_out;
strcpy(ifr.ifr_name, ifname);
if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0)
goto err_out;
memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char));
ret = 0;
err_out:
close(fd);
return ret;
}
static int update_mac_map(struct bpf_object *obj)
{
int i, ret = -1, mac_map_fd;
int mac_map_fd = bpf_map__fd(map);
unsigned char mac_addr[6];
unsigned int ifindex;
mac_map_fd = bpf_object__find_map_fd_by_name(obj, "mac_map");
if (mac_map_fd < 0) {
printf("find mac map fd failed\n");
return ret;
}
int i, ret = -1;
for (i = 0; ifaces[i] > 0; i++) {
ifindex = ifaces[i];
ret = get_mac_addr(ifindex, mac_addr);
if (ret < 0) {
printf("get interface %d mac failed\n", ifindex);
fprintf(stderr, "get interface %d mac failed\n",
ifindex);
return ret;
}
ret = bpf_map_update_elem(mac_map_fd, &ifindex, mac_addr, 0);
if (ret) {
perror("bpf_update_elem mac_map_fd");
if (ret < 0) {
fprintf(stderr, "Failed to update mac address for ifindex %d\n",
ifindex);
return ret;
}
}
@ -122,181 +75,159 @@ static int update_mac_map(struct bpf_object *obj)
return 0;
}
static void usage(const char *prog)
{
fprintf(stderr,
"usage: %s [OPTS] <IFNAME|IFINDEX> <IFNAME|IFINDEX> ...\n"
"OPTS:\n"
" -S use skb-mode\n"
" -N enforce native mode\n"
" -F force loading prog\n"
" -X load xdp program on egress\n",
prog);
}
int main(int argc, char **argv)
{
int i, ret, opt, forward_map_fd, max_ifindex = 0;
struct bpf_program *ingress_prog, *egress_prog;
int ingress_prog_fd, egress_prog_fd = 0;
struct bpf_devmap_val devmap_val;
bool attach_egress_prog = false;
struct bpf_devmap_val devmap_val = {};
struct xdp_redirect_map_multi *skel;
struct bpf_program *ingress_prog;
bool xdp_devmap_attached = false;
struct bpf_map *forward_map;
int ret = EXIT_FAIL_OPTION;
unsigned long interval = 2;
char ifname[IF_NAMESIZE];
struct bpf_map *mac_map;
struct bpf_object *obj;
unsigned int ifindex;
char filename[256];
bool generic = false;
bool force = false;
bool tried = false;
bool error = true;
int i, opt;
while ((opt = getopt(argc, argv, "SNFX")) != -1) {
while ((opt = getopt_long(argc, argv, "hSFXi:vs",
long_options, NULL)) != -1) {
switch (opt) {
case 'S':
xdp_flags |= XDP_FLAGS_SKB_MODE;
break;
case 'N':
/* default, set below */
generic = true;
/* devmap_xmit tracepoint not available */
mask &= ~(SAMPLE_DEVMAP_XMIT_CNT |
SAMPLE_DEVMAP_XMIT_CNT_MULTI);
break;
case 'F':
xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
force = true;
break;
case 'X':
attach_egress_prog = true;
xdp_devmap_attached = true;
break;
case 'i':
interval = strtoul(optarg, NULL, 0);
break;
case 'v':
sample_switch_mode();
break;
case 's':
mask |= SAMPLE_REDIRECT_MAP_CNT;
break;
case 'h':
error = false;
default:
usage(basename(argv[0]));
return 1;
sample_usage(argv, long_options, __doc__, mask, error);
return ret;
}
}
if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) {
xdp_flags |= XDP_FLAGS_DRV_MODE;
} else if (attach_egress_prog) {
printf("Load xdp program on egress with SKB mode not supported yet\n");
return 1;
if (argc <= optind + 1) {
sample_usage(argv, long_options, __doc__, mask, error);
return ret;
}
if (optind == argc) {
printf("usage: %s <IFNAME|IFINDEX> <IFNAME|IFINDEX> ...\n", argv[0]);
return 1;
skel = xdp_redirect_map_multi__open();
if (!skel) {
fprintf(stderr, "Failed to xdp_redirect_map_multi__open: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end;
}
printf("Get interfaces");
ret = sample_init_pre_load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to sample_init_pre_load: %s\n", strerror(-ret));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
ret = EXIT_FAIL_OPTION;
for (i = 0; i < MAX_IFACE_NUM && argv[optind + i]; i++) {
ifaces[i] = if_nametoindex(argv[optind + i]);
if (!ifaces[i])
ifaces[i] = strtoul(argv[optind + i], NULL, 0);
if (!if_indextoname(ifaces[i], ifname)) {
perror("Invalid interface name or i");
return 1;
fprintf(stderr, "Bad interface index or name\n");
sample_usage(argv, long_options, __doc__, mask, true);
goto end_destroy;
}
/* Find the largest index number */
if (ifaces[i] > max_ifindex)
max_ifindex = ifaces[i];
printf(" %d", ifaces[i]);
}
printf("\n");
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
obj = bpf_object__open(filename);
if (libbpf_get_error(obj)) {
printf("ERROR: opening BPF object file failed\n");
obj = NULL;
goto err_out;
skel->rodata->from_match[i] = ifaces[i];
skel->rodata->to_match[i] = ifaces[i];
}
/* Reset the map size to max ifindex + 1 */
if (attach_egress_prog) {
mac_map = bpf_object__find_map_by_name(obj, "mac_map");
ret = bpf_map__resize(mac_map, max_ifindex + 1);
if (ret < 0) {
printf("ERROR: reset mac map size failed\n");
goto err_out;
}
ret = xdp_redirect_map_multi__load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to xdp_redirect_map_multi__load: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
/* load BPF program */
if (bpf_object__load(obj)) {
printf("ERROR: loading BPF object file failed\n");
goto err_out;
}
if (xdp_flags & XDP_FLAGS_SKB_MODE) {
ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_general");
forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_general");
} else {
ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_native");
forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_native");
}
if (!ingress_prog || forward_map_fd < 0) {
printf("finding ingress_prog/forward_map in obj file failed\n");
goto err_out;
}
ingress_prog_fd = bpf_program__fd(ingress_prog);
if (ingress_prog_fd < 0) {
printf("find ingress_prog fd failed\n");
goto err_out;
}
rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt");
if (rxcnt_map_fd < 0) {
printf("bpf_object__find_map_fd_by_name failed\n");
goto err_out;
}
if (attach_egress_prog) {
if (xdp_devmap_attached) {
/* Update mac_map with all egress interfaces' mac addr */
if (update_mac_map(obj) < 0) {
printf("Error: update mac map failed");
goto err_out;
}
/* Find egress prog fd */
egress_prog = bpf_object__find_program_by_name(obj, "xdp_devmap_prog");
if (!egress_prog) {
printf("finding egress_prog in obj file failed\n");
goto err_out;
}
egress_prog_fd = bpf_program__fd(egress_prog);
if (egress_prog_fd < 0) {
printf("find egress_prog fd failed\n");
goto err_out;
if (update_mac_map(skel->maps.mac_map) < 0) {
fprintf(stderr, "Updating mac address failed\n");
ret = EXIT_FAIL;
goto end_destroy;
}
}
/* Remove attached program when program is interrupted or killed */
signal(SIGINT, int_exit);
signal(SIGTERM, int_exit);
ret = sample_init(skel, mask);
if (ret < 0) {
fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
ingress_prog = skel->progs.xdp_redirect_map_native;
forward_map = skel->maps.forward_map_native;
/* Init forward multicast groups */
for (i = 0; ifaces[i] > 0; i++) {
ifindex = ifaces[i];
ret = EXIT_FAIL_XDP;
restart:
/* bind prog_fd to each interface */
ret = bpf_set_link_xdp_fd(ifindex, ingress_prog_fd, xdp_flags);
if (ret) {
printf("Set xdp fd failed on %d\n", ifindex);
goto err_out;
if (sample_install_xdp(ingress_prog, ifindex, generic, force) < 0) {
if (generic && !tried) {
fprintf(stderr,
"Trying fallback to sizeof(int) as value_size for devmap in generic mode\n");
ingress_prog = skel->progs.xdp_redirect_map_general;
forward_map = skel->maps.forward_map_general;
tried = true;
goto restart;
}
goto end_destroy;
}
/* Add all the interfaces to forward group and attach
* egress devmap programe if exist
* egress devmap program if exist
*/
devmap_val.ifindex = ifindex;
devmap_val.bpf_prog.fd = egress_prog_fd;
ret = bpf_map_update_elem(forward_map_fd, &ifindex, &devmap_val, 0);
if (ret) {
perror("bpf_map_update_elem forward_map");
goto err_out;
if (xdp_devmap_attached)
devmap_val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_devmap_prog);
ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0);
if (ret < 0) {
fprintf(stderr, "Failed to update devmap value: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
}
poll_stats(2);
return 0;
err_out:
return 1;
ret = sample_run(interval, NULL, NULL);
if (ret < 0) {
fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
ret = EXIT_OK;
end_destroy:
xdp_redirect_map_multi__destroy(skel);
end:
sample_exit(ret);
}

View File

@ -1,6 +1,10 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2017 Covalent IO, Inc. http://covalent.io
*/
static const char *__doc__ =
"XDP redirect tool, using BPF_MAP_TYPE_DEVMAP\n"
"Usage: xdp_redirect_map <IFINDEX|IFNAME>_IN <IFINDEX|IFNAME>_OUT\n";
#include <linux/bpf.h>
#include <linux/if_link.h>
#include <assert.h>
@ -13,165 +17,83 @@
#include <net/if.h>
#include <unistd.h>
#include <libgen.h>
#include <sys/resource.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include "bpf_util.h"
#include <getopt.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "bpf_util.h"
#include "xdp_sample_user.h"
#include "xdp_redirect_map.skel.h"
static int ifindex_in;
static int ifindex_out;
static bool ifindex_out_xdp_dummy_attached = true;
static bool xdp_devmap_attached;
static __u32 prog_id;
static __u32 dummy_prog_id;
static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_MAP_CNT |
SAMPLE_EXCEPTION_CNT | SAMPLE_DEVMAP_XMIT_CNT_MULTI;
static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
static int rxcnt_map_fd;
DEFINE_SAMPLE_INIT(xdp_redirect_map);
static void int_exit(int sig)
{
__u32 curr_prog_id = 0;
if (bpf_get_link_xdp_id(ifindex_in, &curr_prog_id, xdp_flags)) {
printf("bpf_get_link_xdp_id failed\n");
exit(1);
}
if (prog_id == curr_prog_id)
bpf_set_link_xdp_fd(ifindex_in, -1, xdp_flags);
else if (!curr_prog_id)
printf("couldn't find a prog id on iface IN\n");
else
printf("program on iface IN changed, not removing\n");
if (ifindex_out_xdp_dummy_attached) {
curr_prog_id = 0;
if (bpf_get_link_xdp_id(ifindex_out, &curr_prog_id,
xdp_flags)) {
printf("bpf_get_link_xdp_id failed\n");
exit(1);
}
if (dummy_prog_id == curr_prog_id)
bpf_set_link_xdp_fd(ifindex_out, -1, xdp_flags);
else if (!curr_prog_id)
printf("couldn't find a prog id on iface OUT\n");
else
printf("program on iface OUT changed, not removing\n");
}
exit(0);
}
static void poll_stats(int interval, int ifindex)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
__u64 values[nr_cpus], prev[nr_cpus];
memset(prev, 0, sizeof(prev));
while (1) {
__u64 sum = 0;
__u32 key = 0;
int i;
sleep(interval);
assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0);
for (i = 0; i < nr_cpus; i++)
sum += (values[i] - prev[i]);
if (sum)
printf("ifindex %i: %10llu pkt/s\n",
ifindex, sum / interval);
memcpy(prev, values, sizeof(values));
}
}
static int get_mac_addr(unsigned int ifindex_out, void *mac_addr)
{
char ifname[IF_NAMESIZE];
struct ifreq ifr;
int fd, ret = -1;
fd = socket(AF_INET, SOCK_DGRAM, 0);
if (fd < 0)
return ret;
if (!if_indextoname(ifindex_out, ifname))
goto err_out;
strcpy(ifr.ifr_name, ifname);
if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0)
goto err_out;
memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char));
ret = 0;
err_out:
close(fd);
return ret;
}
static void usage(const char *prog)
{
fprintf(stderr,
"usage: %s [OPTS] <IFNAME|IFINDEX>_IN <IFNAME|IFINDEX>_OUT\n\n"
"OPTS:\n"
" -S use skb-mode\n"
" -N enforce native mode\n"
" -F force loading prog\n"
" -X load xdp program on egress\n",
prog);
}
static const struct option long_options[] = {
{ "help", no_argument, NULL, 'h' },
{ "skb-mode", no_argument, NULL, 'S' },
{ "force", no_argument, NULL, 'F' },
{ "load-egress", no_argument, NULL, 'X' },
{ "stats", no_argument, NULL, 's' },
{ "interval", required_argument, NULL, 'i' },
{ "verbose", no_argument, NULL, 'v' },
{}
};
int main(int argc, char **argv)
{
struct bpf_prog_load_attr prog_load_attr = {
.prog_type = BPF_PROG_TYPE_UNSPEC,
};
struct bpf_program *prog, *dummy_prog, *devmap_prog;
int prog_fd, dummy_prog_fd, devmap_prog_fd = 0;
int tx_port_map_fd, tx_mac_map_fd;
struct bpf_devmap_val devmap_val;
struct bpf_prog_info info = {};
__u32 info_len = sizeof(info);
const char *optstr = "FSNX";
struct bpf_object *obj;
int ret, opt, key = 0;
char filename[256];
struct bpf_devmap_val devmap_val = {};
bool xdp_devmap_attached = false;
struct xdp_redirect_map *skel;
char str[2 * IF_NAMESIZE + 1];
char ifname_out[IF_NAMESIZE];
struct bpf_map *tx_port_map;
char ifname_in[IF_NAMESIZE];
int ifindex_in, ifindex_out;
unsigned long interval = 2;
int ret = EXIT_FAIL_OPTION;
struct bpf_program *prog;
bool generic = false;
bool force = false;
bool tried = false;
bool error = true;
int opt, key = 0;
while ((opt = getopt(argc, argv, optstr)) != -1) {
while ((opt = getopt_long(argc, argv, "hSFXi:vs",
long_options, NULL)) != -1) {
switch (opt) {
case 'S':
xdp_flags |= XDP_FLAGS_SKB_MODE;
break;
case 'N':
/* default, set below */
generic = true;
/* devmap_xmit tracepoint not available */
mask &= ~(SAMPLE_DEVMAP_XMIT_CNT |
SAMPLE_DEVMAP_XMIT_CNT_MULTI);
break;
case 'F':
xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
force = true;
break;
case 'X':
xdp_devmap_attached = true;
break;
case 'i':
interval = strtoul(optarg, NULL, 0);
break;
case 'v':
sample_switch_mode();
break;
case 's':
mask |= SAMPLE_REDIRECT_MAP_CNT;
break;
case 'h':
error = false;
default:
usage(basename(argv[0]));
return 1;
sample_usage(argv, long_options, __doc__, mask, error);
return ret;
}
}
if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) {
xdp_flags |= XDP_FLAGS_DRV_MODE;
} else if (xdp_devmap_attached) {
printf("Load xdp program on egress with SKB mode not supported yet\n");
return 1;
}
if (optind == argc) {
printf("usage: %s <IFNAME|IFINDEX>_IN <IFNAME|IFINDEX>_OUT\n", argv[0]);
return 1;
if (argc <= optind + 1) {
sample_usage(argv, long_options, __doc__, mask, true);
goto end;
}
ifindex_in = if_nametoindex(argv[optind]);
@ -182,107 +104,116 @@ int main(int argc, char **argv)
if (!ifindex_out)
ifindex_out = strtoul(argv[optind + 1], NULL, 0);
printf("input: %d output: %d\n", ifindex_in, ifindex_out);
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
prog_load_attr.file = filename;
if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
return 1;
if (xdp_flags & XDP_FLAGS_SKB_MODE) {
prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_general");
tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port_general");
} else {
prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_native");
tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port_native");
}
dummy_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_dummy_prog");
if (!prog || dummy_prog < 0 || tx_port_map_fd < 0) {
printf("finding prog/dummy_prog/tx_port_map in obj file failed\n");
goto out;
}
prog_fd = bpf_program__fd(prog);
dummy_prog_fd = bpf_program__fd(dummy_prog);
if (prog_fd < 0 || dummy_prog_fd < 0 || tx_port_map_fd < 0) {
printf("bpf_prog_load_xattr: %s\n", strerror(errno));
return 1;
if (!ifindex_in || !ifindex_out) {
fprintf(stderr, "Bad interface index or name\n");
sample_usage(argv, long_options, __doc__, mask, true);
goto end;
}
tx_mac_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_mac");
rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt");
if (tx_mac_map_fd < 0 || rxcnt_map_fd < 0) {
printf("bpf_object__find_map_fd_by_name failed\n");
return 1;
skel = xdp_redirect_map__open();
if (!skel) {
fprintf(stderr, "Failed to xdp_redirect_map__open: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end;
}
if (bpf_set_link_xdp_fd(ifindex_in, prog_fd, xdp_flags) < 0) {
printf("ERROR: link set xdp fd failed on %d\n", ifindex_in);
return 1;
ret = sample_init_pre_load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to sample_init_pre_load: %s\n", strerror(-ret));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
ret = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
if (ret) {
printf("can't get prog info - %s\n", strerror(errno));
return ret;
}
prog_id = info.id;
/* Loading dummy XDP prog on out-device */
if (bpf_set_link_xdp_fd(ifindex_out, dummy_prog_fd,
(xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST)) < 0) {
printf("WARN: link set xdp fd failed on %d\n", ifindex_out);
ifindex_out_xdp_dummy_attached = false;
}
memset(&info, 0, sizeof(info));
ret = bpf_obj_get_info_by_fd(dummy_prog_fd, &info, &info_len);
if (ret) {
printf("can't get prog info - %s\n", strerror(errno));
return ret;
}
dummy_prog_id = info.id;
/* Load 2nd xdp prog on egress. */
if (xdp_devmap_attached) {
unsigned char mac_addr[6];
devmap_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_egress");
if (!devmap_prog) {
printf("finding devmap_prog in obj file failed\n");
goto out;
}
devmap_prog_fd = bpf_program__fd(devmap_prog);
if (devmap_prog_fd < 0) {
printf("finding devmap_prog fd failed\n");
goto out;
}
if (get_mac_addr(ifindex_out, mac_addr) < 0) {
printf("get interface %d mac failed\n", ifindex_out);
goto out;
}
ret = bpf_map_update_elem(tx_mac_map_fd, &key, mac_addr, 0);
if (ret) {
perror("bpf_update_elem tx_mac_map_fd");
goto out;
ret = get_mac_addr(ifindex_out, skel->rodata->tx_mac_addr);
if (ret < 0) {
fprintf(stderr, "Failed to get interface %d mac address: %s\n",
ifindex_out, strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
}
signal(SIGINT, int_exit);
signal(SIGTERM, int_exit);
skel->rodata->from_match[0] = ifindex_in;
skel->rodata->to_match[0] = ifindex_out;
ret = xdp_redirect_map__load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to xdp_redirect_map__load: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
ret = sample_init(skel, mask);
if (ret < 0) {
fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
prog = skel->progs.xdp_redirect_map_native;
tx_port_map = skel->maps.tx_port_native;
restart:
if (sample_install_xdp(prog, ifindex_in, generic, force) < 0) {
/* First try with struct bpf_devmap_val as value for generic
* mode, then fallback to sizeof(int) for older kernels.
*/
fprintf(stderr,
"Trying fallback to sizeof(int) as value_size for devmap in generic mode\n");
if (generic && !tried) {
prog = skel->progs.xdp_redirect_map_general;
tx_port_map = skel->maps.tx_port_general;
tried = true;
goto restart;
}
ret = EXIT_FAIL_XDP;
goto end_destroy;
}
/* Loading dummy XDP prog on out-device */
sample_install_xdp(skel->progs.xdp_redirect_dummy_prog, ifindex_out, generic, force);
devmap_val.ifindex = ifindex_out;
devmap_val.bpf_prog.fd = devmap_prog_fd;
ret = bpf_map_update_elem(tx_port_map_fd, &key, &devmap_val, 0);
if (ret) {
perror("bpf_update_elem");
goto out;
if (xdp_devmap_attached)
devmap_val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_redirect_map_egress);
ret = bpf_map_update_elem(bpf_map__fd(tx_port_map), &key, &devmap_val, 0);
if (ret < 0) {
fprintf(stderr, "Failed to update devmap value: %s\n",
strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
poll_stats(2, ifindex_out);
ret = EXIT_FAIL;
if (!if_indextoname(ifindex_in, ifname_in)) {
fprintf(stderr, "Failed to if_indextoname for %d: %s\n", ifindex_in,
strerror(errno));
goto end_destroy;
}
out:
return 0;
if (!if_indextoname(ifindex_out, ifname_out)) {
fprintf(stderr, "Failed to if_indextoname for %d: %s\n", ifindex_out,
strerror(errno));
goto end_destroy;
}
safe_strncpy(str, get_driver_name(ifindex_in), sizeof(str));
printf("Redirecting from %s (ifindex %d; driver %s) to %s (ifindex %d; driver %s)\n",
ifname_in, ifindex_in, str, ifname_out, ifindex_out, get_driver_name(ifindex_out));
snprintf(str, sizeof(str), "%s->%s", ifname_in, ifname_out);
ret = sample_run(interval, NULL, NULL);
if (ret < 0) {
fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
ret = EXIT_OK;
end_destroy:
xdp_redirect_map__destroy(skel);
end:
sample_exit(ret);
}

View File

@ -1,6 +1,10 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2016 John Fastabend <john.r.fastabend@intel.com>
*/
static const char *__doc__ =
"XDP redirect tool, using bpf_redirect helper\n"
"Usage: xdp_redirect <IFINDEX|IFNAME>_IN <IFINDEX|IFNAME>_OUT\n";
#include <linux/bpf.h>
#include <linux/if_link.h>
#include <assert.h>
@ -13,126 +17,73 @@
#include <net/if.h>
#include <unistd.h>
#include <libgen.h>
#include <getopt.h>
#include <sys/resource.h>
#include "bpf_util.h"
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "bpf_util.h"
#include "xdp_sample_user.h"
#include "xdp_redirect.skel.h"
static int ifindex_in;
static int ifindex_out;
static bool ifindex_out_xdp_dummy_attached = true;
static __u32 prog_id;
static __u32 dummy_prog_id;
static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_CNT |
SAMPLE_EXCEPTION_CNT | SAMPLE_DEVMAP_XMIT_CNT_MULTI;
static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
static int rxcnt_map_fd;
static void int_exit(int sig)
{
__u32 curr_prog_id = 0;
if (bpf_get_link_xdp_id(ifindex_in, &curr_prog_id, xdp_flags)) {
printf("bpf_get_link_xdp_id failed\n");
exit(1);
}
if (prog_id == curr_prog_id)
bpf_set_link_xdp_fd(ifindex_in, -1, xdp_flags);
else if (!curr_prog_id)
printf("couldn't find a prog id on iface IN\n");
else
printf("program on iface IN changed, not removing\n");
if (ifindex_out_xdp_dummy_attached) {
curr_prog_id = 0;
if (bpf_get_link_xdp_id(ifindex_out, &curr_prog_id,
xdp_flags)) {
printf("bpf_get_link_xdp_id failed\n");
exit(1);
}
if (dummy_prog_id == curr_prog_id)
bpf_set_link_xdp_fd(ifindex_out, -1, xdp_flags);
else if (!curr_prog_id)
printf("couldn't find a prog id on iface OUT\n");
else
printf("program on iface OUT changed, not removing\n");
}
exit(0);
}
static void poll_stats(int interval, int ifindex)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
__u64 values[nr_cpus], prev[nr_cpus];
memset(prev, 0, sizeof(prev));
while (1) {
__u64 sum = 0;
__u32 key = 0;
int i;
sleep(interval);
assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0);
for (i = 0; i < nr_cpus; i++)
sum += (values[i] - prev[i]);
if (sum)
printf("ifindex %i: %10llu pkt/s\n",
ifindex, sum / interval);
memcpy(prev, values, sizeof(values));
}
}
static void usage(const char *prog)
{
fprintf(stderr,
"usage: %s [OPTS] <IFNAME|IFINDEX>_IN <IFNAME|IFINDEX>_OUT\n\n"
"OPTS:\n"
" -S use skb-mode\n"
" -N enforce native mode\n"
" -F force loading prog\n",
prog);
}
DEFINE_SAMPLE_INIT(xdp_redirect);
static const struct option long_options[] = {
{"help", no_argument, NULL, 'h' },
{"skb-mode", no_argument, NULL, 'S' },
{"force", no_argument, NULL, 'F' },
{"stats", no_argument, NULL, 's' },
{"interval", required_argument, NULL, 'i' },
{"verbose", no_argument, NULL, 'v' },
{}
};
int main(int argc, char **argv)
{
struct bpf_prog_load_attr prog_load_attr = {
.prog_type = BPF_PROG_TYPE_XDP,
};
struct bpf_program *prog, *dummy_prog;
int prog_fd, tx_port_map_fd, opt;
struct bpf_prog_info info = {};
__u32 info_len = sizeof(info);
const char *optstr = "FSN";
struct bpf_object *obj;
char filename[256];
int dummy_prog_fd;
int ret, key = 0;
int ifindex_in, ifindex_out, opt;
char str[2 * IF_NAMESIZE + 1];
char ifname_out[IF_NAMESIZE];
char ifname_in[IF_NAMESIZE];
int ret = EXIT_FAIL_OPTION;
unsigned long interval = 2;
struct xdp_redirect *skel;
bool generic = false;
bool force = false;
bool error = true;
while ((opt = getopt(argc, argv, optstr)) != -1) {
while ((opt = getopt_long(argc, argv, "hSFi:vs",
long_options, NULL)) != -1) {
switch (opt) {
case 'S':
xdp_flags |= XDP_FLAGS_SKB_MODE;
break;
case 'N':
/* default, set below */
generic = true;
mask &= ~(SAMPLE_DEVMAP_XMIT_CNT |
SAMPLE_DEVMAP_XMIT_CNT_MULTI);
break;
case 'F':
xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
force = true;
break;
case 'i':
interval = strtoul(optarg, NULL, 0);
break;
case 'v':
sample_switch_mode();
break;
case 's':
mask |= SAMPLE_REDIRECT_CNT;
break;
case 'h':
error = false;
default:
usage(basename(argv[0]));
return 1;
sample_usage(argv, long_options, __doc__, mask, error);
return ret;
}
}
if (!(xdp_flags & XDP_FLAGS_SKB_MODE))
xdp_flags |= XDP_FLAGS_DRV_MODE;
if (optind + 2 != argc) {
printf("usage: %s <IFNAME|IFINDEX>_IN <IFNAME|IFINDEX>_OUT\n", argv[0]);
return 1;
if (argc <= optind + 1) {
sample_usage(argv, long_options, __doc__, mask, true);
return ret;
}
ifindex_in = if_nametoindex(argv[optind]);
@ -143,75 +94,80 @@ int main(int argc, char **argv)
if (!ifindex_out)
ifindex_out = strtoul(argv[optind + 1], NULL, 0);
printf("input: %d output: %d\n", ifindex_in, ifindex_out);
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
prog_load_attr.file = filename;
if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
return 1;
prog = bpf_program__next(NULL, obj);
dummy_prog = bpf_program__next(prog, obj);
if (!prog || !dummy_prog) {
printf("finding a prog in obj file failed\n");
return 1;
}
/* bpf_prog_load_xattr gives us the pointer to first prog's fd,
* so we're missing only the fd for dummy prog
*/
dummy_prog_fd = bpf_program__fd(dummy_prog);
if (prog_fd < 0 || dummy_prog_fd < 0) {
printf("bpf_prog_load_xattr: %s\n", strerror(errno));
return 1;
if (!ifindex_in || !ifindex_out) {
fprintf(stderr, "Bad interface index or name\n");
sample_usage(argv, long_options, __doc__, mask, true);
goto end;
}
tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port");
rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt");
if (tx_port_map_fd < 0 || rxcnt_map_fd < 0) {
printf("bpf_object__find_map_fd_by_name failed\n");
return 1;
skel = xdp_redirect__open();
if (!skel) {
fprintf(stderr, "Failed to xdp_redirect__open: %s\n", strerror(errno));
ret = EXIT_FAIL_BPF;
goto end;
}
if (bpf_set_link_xdp_fd(ifindex_in, prog_fd, xdp_flags) < 0) {
printf("ERROR: link set xdp fd failed on %d\n", ifindex_in);
return 1;
ret = sample_init_pre_load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to sample_init_pre_load: %s\n", strerror(-ret));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
ret = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
if (ret) {
printf("can't get prog info - %s\n", strerror(errno));
return ret;
skel->rodata->from_match[0] = ifindex_in;
skel->rodata->to_match[0] = ifindex_out;
skel->rodata->ifindex_out = ifindex_out;
ret = xdp_redirect__load(skel);
if (ret < 0) {
fprintf(stderr, "Failed to xdp_redirect__load: %s\n", strerror(errno));
ret = EXIT_FAIL_BPF;
goto end_destroy;
}
prog_id = info.id;
ret = sample_init(skel, mask);
if (ret < 0) {
fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
ret = EXIT_FAIL_XDP;
if (sample_install_xdp(skel->progs.xdp_redirect_prog, ifindex_in,
generic, force) < 0)
goto end_destroy;
/* Loading dummy XDP prog on out-device */
if (bpf_set_link_xdp_fd(ifindex_out, dummy_prog_fd,
(xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST)) < 0) {
printf("WARN: link set xdp fd failed on %d\n", ifindex_out);
ifindex_out_xdp_dummy_attached = false;
sample_install_xdp(skel->progs.xdp_redirect_dummy_prog, ifindex_out,
generic, force);
ret = EXIT_FAIL;
if (!if_indextoname(ifindex_in, ifname_in)) {
fprintf(stderr, "Failed to if_indextoname for %d: %s\n", ifindex_in,
strerror(errno));
goto end_destroy;
}
memset(&info, 0, sizeof(info));
ret = bpf_obj_get_info_by_fd(dummy_prog_fd, &info, &info_len);
if (ret) {
printf("can't get prog info - %s\n", strerror(errno));
return ret;
}
dummy_prog_id = info.id;
signal(SIGINT, int_exit);
signal(SIGTERM, int_exit);
/* bpf redirect port */
ret = bpf_map_update_elem(tx_port_map_fd, &key, &ifindex_out, 0);
if (ret) {
perror("bpf_update_elem");
goto out;
if (!if_indextoname(ifindex_out, ifname_out)) {
fprintf(stderr, "Failed to if_indextoname for %d: %s\n", ifindex_out,
strerror(errno));
goto end_destroy;
}
poll_stats(2, ifindex_out);
safe_strncpy(str, get_driver_name(ifindex_in), sizeof(str));
printf("Redirecting from %s (ifindex %d; driver %s) to %s (ifindex %d; driver %s)\n",
ifname_in, ifindex_in, str, ifname_out, ifindex_out, get_driver_name(ifindex_out));
snprintf(str, sizeof(str), "%s->%s", ifname_in, ifname_out);
out:
return ret;
ret = sample_run(interval, NULL, NULL);
if (ret < 0) {
fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret));
ret = EXIT_FAIL;
goto end_destroy;
}
ret = EXIT_OK;
end_destroy:
xdp_redirect__destroy(skel);
end:
sample_exit(ret);
}

View File

@ -0,0 +1,266 @@
// SPDX-License-Identifier: GPL-2.0
/* GPLv2, Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. */
#include "xdp_sample.bpf.h"
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include <bpf/bpf_helpers.h>
array_map rx_cnt SEC(".maps");
array_map redir_err_cnt SEC(".maps");
array_map cpumap_enqueue_cnt SEC(".maps");
array_map cpumap_kthread_cnt SEC(".maps");
array_map exception_cnt SEC(".maps");
array_map devmap_xmit_cnt SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
__uint(max_entries, 32 * 32);
__type(key, u64);
__type(value, struct datarec);
} devmap_xmit_cnt_multi SEC(".maps");
const volatile int nr_cpus = 0;
/* These can be set before loading so that redundant comparisons can be DCE'd by
* the verifier, and only actual matches are tried after loading tp_btf program.
* This allows sample to filter tracepoint stats based on net_device.
*/
const volatile int from_match[32] = {};
const volatile int to_match[32] = {};
int cpumap_map_id = 0;
/* Find if b is part of set a, but if a is empty set then evaluate to true */
#define IN_SET(a, b) \
({ \
bool __res = !(a)[0]; \
for (int i = 0; i < ARRAY_SIZE(a) && (a)[i]; i++) { \
__res = (a)[i] == (b); \
if (__res) \
break; \
} \
__res; \
})
static __always_inline __u32 xdp_get_err_key(int err)
{
switch (err) {
case 0:
return 0;
case -EINVAL:
return 2;
case -ENETDOWN:
return 3;
case -EMSGSIZE:
return 4;
case -EOPNOTSUPP:
return 5;
case -ENOSPC:
return 6;
default:
return 1;
}
}
static __always_inline int xdp_redirect_collect_stat(int from, int err)
{
u32 cpu = bpf_get_smp_processor_id();
u32 key = XDP_REDIRECT_ERROR;
struct datarec *rec;
u32 idx;
if (!IN_SET(from_match, from))
return 0;
key = xdp_get_err_key(err);
idx = key * nr_cpus + cpu;
rec = bpf_map_lookup_elem(&redir_err_cnt, &idx);
if (!rec)
return 0;
if (key)
NO_TEAR_INC(rec->dropped);
else
NO_TEAR_INC(rec->processed);
return 0; /* Indicate event was filtered (no further processing)*/
/*
* Returning 1 here would allow e.g. a perf-record tracepoint
* to see and record these events, but it doesn't work well
* in-practice as stopping perf-record also unload this
* bpf_prog. Plus, there is additional overhead of doing so.
*/
}
SEC("tp_btf/xdp_redirect_err")
int BPF_PROG(tp_xdp_redirect_err, const struct net_device *dev,
const struct bpf_prog *xdp, const void *tgt, int err,
const struct bpf_map *map, u32 index)
{
return xdp_redirect_collect_stat(dev->ifindex, err);
}
SEC("tp_btf/xdp_redirect_map_err")
int BPF_PROG(tp_xdp_redirect_map_err, const struct net_device *dev,
const struct bpf_prog *xdp, const void *tgt, int err,
const struct bpf_map *map, u32 index)
{
return xdp_redirect_collect_stat(dev->ifindex, err);
}
SEC("tp_btf/xdp_redirect")
int BPF_PROG(tp_xdp_redirect, const struct net_device *dev,
const struct bpf_prog *xdp, const void *tgt, int err,
const struct bpf_map *map, u32 index)
{
return xdp_redirect_collect_stat(dev->ifindex, err);
}
SEC("tp_btf/xdp_redirect_map")
int BPF_PROG(tp_xdp_redirect_map, const struct net_device *dev,
const struct bpf_prog *xdp, const void *tgt, int err,
const struct bpf_map *map, u32 index)
{
return xdp_redirect_collect_stat(dev->ifindex, err);
}
SEC("tp_btf/xdp_cpumap_enqueue")
int BPF_PROG(tp_xdp_cpumap_enqueue, int map_id, unsigned int processed,
unsigned int drops, int to_cpu)
{
u32 cpu = bpf_get_smp_processor_id();
struct datarec *rec;
u32 idx;
if (cpumap_map_id && cpumap_map_id != map_id)
return 0;
idx = to_cpu * nr_cpus + cpu;
rec = bpf_map_lookup_elem(&cpumap_enqueue_cnt, &idx);
if (!rec)
return 0;
NO_TEAR_ADD(rec->processed, processed);
NO_TEAR_ADD(rec->dropped, drops);
/* Record bulk events, then userspace can calc average bulk size */
if (processed > 0)
NO_TEAR_INC(rec->issue);
/* Inception: It's possible to detect overload situations, via
* this tracepoint. This can be used for creating a feedback
* loop to XDP, which can take appropriate actions to mitigate
* this overload situation.
*/
return 0;
}
SEC("tp_btf/xdp_cpumap_kthread")
int BPF_PROG(tp_xdp_cpumap_kthread, int map_id, unsigned int processed,
unsigned int drops, int sched, struct xdp_cpumap_stats *xdp_stats)
{
struct datarec *rec;
u32 cpu;
if (cpumap_map_id && cpumap_map_id != map_id)
return 0;
cpu = bpf_get_smp_processor_id();
rec = bpf_map_lookup_elem(&cpumap_kthread_cnt, &cpu);
if (!rec)
return 0;
NO_TEAR_ADD(rec->processed, processed);
NO_TEAR_ADD(rec->dropped, drops);
NO_TEAR_ADD(rec->xdp_pass, xdp_stats->pass);
NO_TEAR_ADD(rec->xdp_drop, xdp_stats->drop);
NO_TEAR_ADD(rec->xdp_redirect, xdp_stats->redirect);
/* Count times kthread yielded CPU via schedule call */
if (sched)
NO_TEAR_INC(rec->issue);
return 0;
}
SEC("tp_btf/xdp_exception")
int BPF_PROG(tp_xdp_exception, const struct net_device *dev,
const struct bpf_prog *xdp, u32 act)
{
u32 cpu = bpf_get_smp_processor_id();
struct datarec *rec;
u32 key = act, idx;
if (!IN_SET(from_match, dev->ifindex))
return 0;
if (!IN_SET(to_match, dev->ifindex))
return 0;
if (key > XDP_REDIRECT)
key = XDP_REDIRECT + 1;
idx = key * nr_cpus + cpu;
rec = bpf_map_lookup_elem(&exception_cnt, &idx);
if (!rec)
return 0;
NO_TEAR_INC(rec->dropped);
return 0;
}
SEC("tp_btf/xdp_devmap_xmit")
int BPF_PROG(tp_xdp_devmap_xmit, const struct net_device *from_dev,
const struct net_device *to_dev, int sent, int drops, int err)
{
struct datarec *rec;
int idx_in, idx_out;
u32 cpu;
idx_in = from_dev->ifindex;
idx_out = to_dev->ifindex;
if (!IN_SET(from_match, idx_in))
return 0;
if (!IN_SET(to_match, idx_out))
return 0;
cpu = bpf_get_smp_processor_id();
rec = bpf_map_lookup_elem(&devmap_xmit_cnt, &cpu);
if (!rec)
return 0;
NO_TEAR_ADD(rec->processed, sent);
NO_TEAR_ADD(rec->dropped, drops);
/* Record bulk events, then userspace can calc average bulk size */
NO_TEAR_INC(rec->info);
/* Record error cases, where no frame were sent */
/* Catch API error of drv ndo_xdp_xmit sent more than count */
if (err || drops < 0)
NO_TEAR_INC(rec->issue);
return 0;
}
SEC("tp_btf/xdp_devmap_xmit")
int BPF_PROG(tp_xdp_devmap_xmit_multi, const struct net_device *from_dev,
const struct net_device *to_dev, int sent, int drops, int err)
{
struct datarec empty = {};
struct datarec *rec;
int idx_in, idx_out;
u64 idx;
idx_in = from_dev->ifindex;
idx_out = to_dev->ifindex;
idx = idx_in;
idx = idx << 32 | idx_out;
if (!IN_SET(from_match, idx_in))
return 0;
if (!IN_SET(to_match, idx_out))
return 0;
bpf_map_update_elem(&devmap_xmit_cnt_multi, &idx, &empty, BPF_NOEXIST);
rec = bpf_map_lookup_elem(&devmap_xmit_cnt_multi, &idx);
if (!rec)
return 0;
NO_TEAR_ADD(rec->processed, sent);
NO_TEAR_ADD(rec->dropped, drops);
NO_TEAR_INC(rec->info);
if (err || drops < 0)
NO_TEAR_INC(rec->issue);
return 0;
}

View File

@ -0,0 +1,141 @@
// SPDX-License-Identifier: GPL-2.0
#ifndef _XDP_SAMPLE_BPF_H
#define _XDP_SAMPLE_BPF_H
#include "vmlinux.h"
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include <bpf/bpf_helpers.h>
#include "xdp_sample_shared.h"
#define ETH_ALEN 6
#define ETH_P_802_3_MIN 0x0600
#define ETH_P_8021Q 0x8100
#define ETH_P_8021AD 0x88A8
#define ETH_P_IP 0x0800
#define ETH_P_IPV6 0x86DD
#define ETH_P_ARP 0x0806
#define IPPROTO_ICMPV6 58
#define EINVAL 22
#define ENETDOWN 100
#define EMSGSIZE 90
#define EOPNOTSUPP 95
#define ENOSPC 28
typedef struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(map_flags, BPF_F_MMAPABLE);
__type(key, unsigned int);
__type(value, struct datarec);
} array_map;
extern array_map rx_cnt;
extern const volatile int nr_cpus;
enum {
XDP_REDIRECT_SUCCESS = 0,
XDP_REDIRECT_ERROR = 1
};
static __always_inline void swap_src_dst_mac(void *data)
{
unsigned short *p = data;
unsigned short dst[3];
dst[0] = p[0];
dst[1] = p[1];
dst[2] = p[2];
p[0] = p[3];
p[1] = p[4];
p[2] = p[5];
p[3] = dst[0];
p[4] = dst[1];
p[5] = dst[2];
}
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define bpf_ntohs(x) __builtin_bswap16(x)
#define bpf_htons(x) __builtin_bswap16(x)
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define bpf_ntohs(x) (x)
#define bpf_htons(x) (x)
#else
# error "Endianness detection needs to be set up for your compiler?!"
#endif
/*
* Note: including linux/compiler.h or linux/kernel.h for the macros below
* conflicts with vmlinux.h include in BPF files, so we define them here.
*
* Following functions are taken from kernel sources and
* break aliasing rules in their original form.
*
* While kernel is compiled with -fno-strict-aliasing,
* perf uses -Wstrict-aliasing=3 which makes build fail
* under gcc 4.4.
*
* Using extra __may_alias__ type to allow aliasing
* in this case.
*/
typedef __u8 __attribute__((__may_alias__)) __u8_alias_t;
typedef __u16 __attribute__((__may_alias__)) __u16_alias_t;
typedef __u32 __attribute__((__may_alias__)) __u32_alias_t;
typedef __u64 __attribute__((__may_alias__)) __u64_alias_t;
static __always_inline void __read_once_size(const volatile void *p, void *res, int size)
{
switch (size) {
case 1: *(__u8_alias_t *) res = *(volatile __u8_alias_t *) p; break;
case 2: *(__u16_alias_t *) res = *(volatile __u16_alias_t *) p; break;
case 4: *(__u32_alias_t *) res = *(volatile __u32_alias_t *) p; break;
case 8: *(__u64_alias_t *) res = *(volatile __u64_alias_t *) p; break;
default:
asm volatile ("" : : : "memory");
__builtin_memcpy((void *)res, (const void *)p, size);
asm volatile ("" : : : "memory");
}
}
static __always_inline void __write_once_size(volatile void *p, void *res, int size)
{
switch (size) {
case 1: *(volatile __u8_alias_t *) p = *(__u8_alias_t *) res; break;
case 2: *(volatile __u16_alias_t *) p = *(__u16_alias_t *) res; break;
case 4: *(volatile __u32_alias_t *) p = *(__u32_alias_t *) res; break;
case 8: *(volatile __u64_alias_t *) p = *(__u64_alias_t *) res; break;
default:
asm volatile ("" : : : "memory");
__builtin_memcpy((void *)p, (const void *)res, size);
asm volatile ("" : : : "memory");
}
}
#define READ_ONCE(x) \
({ \
union { typeof(x) __val; char __c[1]; } __u = \
{ .__c = { 0 } }; \
__read_once_size(&(x), __u.__c, sizeof(x)); \
__u.__val; \
})
#define WRITE_ONCE(x, val) \
({ \
union { typeof(x) __val; char __c[1]; } __u = \
{ .__val = (val) }; \
__write_once_size(&(x), __u.__c, sizeof(x)); \
__u.__val; \
})
/* Add a value using relaxed read and relaxed write. Less expensive than
* fetch_add when there is no write concurrency.
*/
#define NO_TEAR_ADD(x, val) WRITE_ONCE((x), READ_ONCE(x) + (val))
#define NO_TEAR_INC(x) NO_TEAR_ADD((x), 1)
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

View File

@ -0,0 +1,17 @@
// SPDX-License-Identifier: GPL-2.0-only
#ifndef _XDP_SAMPLE_SHARED_H
#define _XDP_SAMPLE_SHARED_H
struct datarec {
size_t processed;
size_t dropped;
size_t issue;
union {
size_t xdp_pass;
size_t info;
};
size_t xdp_drop;
size_t xdp_redirect;
} __attribute__((aligned(64)));
#endif

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,108 @@
// SPDX-License-Identifier: GPL-2.0-only
#ifndef XDP_SAMPLE_USER_H
#define XDP_SAMPLE_USER_H
#include <bpf/libbpf.h>
#include <linux/compiler.h>
#include "xdp_sample_shared.h"
enum stats_mask {
_SAMPLE_REDIRECT_MAP = 1U << 0,
SAMPLE_RX_CNT = 1U << 1,
SAMPLE_REDIRECT_ERR_CNT = 1U << 2,
SAMPLE_CPUMAP_ENQUEUE_CNT = 1U << 3,
SAMPLE_CPUMAP_KTHREAD_CNT = 1U << 4,
SAMPLE_EXCEPTION_CNT = 1U << 5,
SAMPLE_DEVMAP_XMIT_CNT = 1U << 6,
SAMPLE_REDIRECT_CNT = 1U << 7,
SAMPLE_REDIRECT_MAP_CNT = SAMPLE_REDIRECT_CNT | _SAMPLE_REDIRECT_MAP,
SAMPLE_REDIRECT_ERR_MAP_CNT = SAMPLE_REDIRECT_ERR_CNT | _SAMPLE_REDIRECT_MAP,
SAMPLE_DEVMAP_XMIT_CNT_MULTI = 1U << 8,
SAMPLE_SKIP_HEADING = 1U << 9,
};
/* Exit return codes */
#define EXIT_OK 0
#define EXIT_FAIL 1
#define EXIT_FAIL_OPTION 2
#define EXIT_FAIL_XDP 3
#define EXIT_FAIL_BPF 4
#define EXIT_FAIL_MEM 5
int sample_setup_maps(struct bpf_map **maps);
int __sample_init(int mask);
void sample_exit(int status);
int sample_run(int interval, void (*post_cb)(void *), void *ctx);
void sample_switch_mode(void);
int sample_install_xdp(struct bpf_program *xdp_prog, int ifindex, bool generic,
bool force);
void sample_usage(char *argv[], const struct option *long_options,
const char *doc, int mask, bool error);
const char *get_driver_name(int ifindex);
int get_mac_addr(int ifindex, void *mac_addr);
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-truncation"
__attribute__((unused))
static inline char *safe_strncpy(char *dst, const char *src, size_t size)
{
if (!size)
return dst;
strncpy(dst, src, size - 1);
dst[size - 1] = '\0';
return dst;
}
#pragma GCC diagnostic pop
#define __attach_tp(name) \
({ \
if (!bpf_program__is_tracing(skel->progs.name)) \
return -EINVAL; \
skel->links.name = bpf_program__attach(skel->progs.name); \
if (!skel->links.name) \
return -errno; \
})
#define sample_init_pre_load(skel) \
({ \
skel->rodata->nr_cpus = libbpf_num_possible_cpus(); \
sample_setup_maps((struct bpf_map *[]){ \
skel->maps.rx_cnt, skel->maps.redir_err_cnt, \
skel->maps.cpumap_enqueue_cnt, \
skel->maps.cpumap_kthread_cnt, \
skel->maps.exception_cnt, skel->maps.devmap_xmit_cnt, \
skel->maps.devmap_xmit_cnt_multi }); \
})
#define DEFINE_SAMPLE_INIT(name) \
static int sample_init(struct name *skel, int mask) \
{ \
int ret; \
ret = __sample_init(mask); \
if (ret < 0) \
return ret; \
if (mask & SAMPLE_REDIRECT_MAP_CNT) \
__attach_tp(tp_xdp_redirect_map); \
if (mask & SAMPLE_REDIRECT_CNT) \
__attach_tp(tp_xdp_redirect); \
if (mask & SAMPLE_REDIRECT_ERR_MAP_CNT) \
__attach_tp(tp_xdp_redirect_map_err); \
if (mask & SAMPLE_REDIRECT_ERR_CNT) \
__attach_tp(tp_xdp_redirect_err); \
if (mask & SAMPLE_CPUMAP_ENQUEUE_CNT) \
__attach_tp(tp_xdp_cpumap_enqueue); \
if (mask & SAMPLE_CPUMAP_KTHREAD_CNT) \
__attach_tp(tp_xdp_cpumap_kthread); \
if (mask & SAMPLE_EXCEPTION_CNT) \
__attach_tp(tp_xdp_exception); \
if (mask & SAMPLE_DEVMAP_XMIT_CNT) \
__attach_tp(tp_xdp_devmap_xmit); \
if (mask & SAMPLE_DEVMAP_XMIT_CNT_MULTI) \
__attach_tp(tp_xdp_devmap_xmit_multi); \
return 0; \
}
#endif

View File

@ -48,4 +48,57 @@ struct ethtool_channels {
__u32 combined_count;
};
#define ETHTOOL_FWVERS_LEN 32
#define ETHTOOL_BUSINFO_LEN 32
#define ETHTOOL_EROMVERS_LEN 32
/**
* struct ethtool_drvinfo - general driver and device information
* @cmd: Command number = %ETHTOOL_GDRVINFO
* @driver: Driver short name. This should normally match the name
* in its bus driver structure (e.g. pci_driver::name). Must
* not be an empty string.
* @version: Driver version string; may be an empty string
* @fw_version: Firmware version string; may be an empty string
* @erom_version: Expansion ROM version string; may be an empty string
* @bus_info: Device bus address. This should match the dev_name()
* string for the underlying bus device, if there is one. May be
* an empty string.
* @reserved2: Reserved for future use; see the note on reserved space.
* @n_priv_flags: Number of flags valid for %ETHTOOL_GPFLAGS and
* %ETHTOOL_SPFLAGS commands; also the number of strings in the
* %ETH_SS_PRIV_FLAGS set
* @n_stats: Number of u64 statistics returned by the %ETHTOOL_GSTATS
* command; also the number of strings in the %ETH_SS_STATS set
* @testinfo_len: Number of results returned by the %ETHTOOL_TEST
* command; also the number of strings in the %ETH_SS_TEST set
* @eedump_len: Size of EEPROM accessible through the %ETHTOOL_GEEPROM
* and %ETHTOOL_SEEPROM commands, in bytes
* @regdump_len: Size of register dump returned by the %ETHTOOL_GREGS
* command, in bytes
*
* Users can use the %ETHTOOL_GSSET_INFO command to get the number of
* strings in any string set (from Linux 2.6.34).
*
* Drivers should set at most @driver, @version, @fw_version and
* @bus_info in their get_drvinfo() implementation. The ethtool
* core fills in the other fields using other driver operations.
*/
struct ethtool_drvinfo {
__u32 cmd;
char driver[32];
char version[32];
char fw_version[ETHTOOL_FWVERS_LEN];
char bus_info[ETHTOOL_BUSINFO_LEN];
char erom_version[ETHTOOL_EROMVERS_LEN];
char reserved2[12];
__u32 n_priv_flags;
__u32 n_stats;
__u32 testinfo_len;
__u32 eedump_len;
__u32 regdump_len;
};
#define ETHTOOL_GDRVINFO 0x00000003
#endif /* _UAPI_LINUX_ETHTOOL_H */