Mirror of https://mirrors.bfsu.edu.cn/git/linux.git
Synced 2024-12-28 13:34:38 +08:00
Latest commit d41bc48bfa:

Add benchmark to measure overhead of uprobes and uretprobes. Also have a baseline (no uprobe attached) benchmark.

On my dev machine, baseline benchmark can trigger 130M user_target() invocations. When uprobe is attached, this falls to just 700K. With uretprobe, we get down to 520K:

  $ sudo ./bench trig-uprobe-base -a
  Summary: hits 131.289 ± 2.872M/s

  # UPROBE
  $ sudo ./bench -a trig-uprobe-without-nop
  Summary: hits 0.729 ± 0.007M/s

  $ sudo ./bench -a trig-uprobe-with-nop
  Summary: hits 1.798 ± 0.017M/s

  # URETPROBE
  $ sudo ./bench -a trig-uretprobe-without-nop
  Summary: hits 0.508 ± 0.012M/s

  $ sudo ./bench -a trig-uretprobe-with-nop
  Summary: hits 0.883 ± 0.008M/s

So there is almost 2.5x performance difference between probing nop vs non-nop instruction for entry uprobe. And 1.7x difference for uretprobe. This means that non-nop uprobe overhead is around 1.4 microseconds for uprobe and 2 microseconds for non-nop uretprobe. For nop variants, uprobe and uretprobe overhead is down to 0.556 and 1.13 microseconds, respectively.

For comparison, just doing a very low-overhead syscall (with no BPF programs attached anywhere) gives:

  $ sudo ./bench trig-base -a
  Summary: hits 4.830 ± 0.036M/s

So uprobes are about 2.67x slower than pure context switch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org

| | | |
---|---|---|
accounting | ||
arch | ||
bootconfig | ||
bpf | ||
build | ||
cgroup | ||
counter | ||
debugging | ||
edid | ||
firewire | ||
firmware | ||
gpio | ||
hv | ||
iio | ||
include | ||
io_uring | ||
kvm/kvm_stat | ||
laptop | ||
leds | ||
lib | ||
memory-model | ||
objtool | ||
pci | ||
pcmcia | ||
perf | ||
power | ||
rcu | ||
scripts | ||
spi | ||
testing | ||
thermal/tmon | ||
time | ||
tracing | ||
usb | ||
virtio | ||
vm | ||
wmi | ||
Makefile |
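
The per-call overhead figures quoted in the commit message follow from the reported throughput: at N million hits per second, one hit takes 1/N microseconds on average. The sketch below redoes that arithmetic in Python. The throughput values are copied from the commit message, while the dictionary and the helper names `usec_per_hit` and `slowdown` are hypothetical, introduced only for illustration. It treats the whole per-hit time as probe overhead, which assumes the baseline loop cost (under 0.01 microseconds per call at 131M/s) is negligible.

```python
# Throughput in million hits per second, copied from the commit message.
# The dictionary name and the helpers below are illustrative, not part of bench.
THROUGHPUT_MHITS_PER_SEC = {
    "trig-uprobe-base": 131.289,
    "trig-uprobe-without-nop": 0.729,
    "trig-uprobe-with-nop": 1.798,
    "trig-uretprobe-without-nop": 0.508,
    "trig-uretprobe-with-nop": 0.883,
    "trig-base": 4.830,
}


def usec_per_hit(mhits_per_sec: float) -> float:
    """Average time per hit in microseconds for a rate given in M hits/s."""
    return 1.0 / mhits_per_sec


def slowdown(faster: str, slower: str) -> float:
    """How many times more hits per second `faster` achieves than `slower`."""
    return THROUGHPUT_MHITS_PER_SEC[faster] / THROUGHPUT_MHITS_PER_SEC[slower]


if __name__ == "__main__":
    for name, rate in THROUGHPUT_MHITS_PER_SEC.items():
        print(f"{name:28s} {rate:9.3f} M/s  {usec_per_hit(rate):7.4f} us/hit")
    print("uprobe nop vs non-nop:   ",
          round(slowdown("trig-uprobe-with-nop", "trig-uprobe-without-nop"), 2))
    print("uretprobe nop vs non-nop:",
          round(slowdown("trig-uretprobe-with-nop", "trig-uretprobe-without-nop"), 2))
    print("syscall vs nop uprobe:   ",
          round(slowdown("trig-base", "trig-uprobe-with-nop"), 2))
```

Run as a script, this gives roughly 1.37 and 1.97 microseconds per hit for the non-nop uprobe and uretprobe cases, in line with the "around 1.4" and "2 microseconds" in the message. The syscall comparison comes out close to the quoted "about 2.67x".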