linux/tools
Martin KaFai Lau 695ba2651a bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4
After doing map_perf_test with a much bigger
BPF_F_NO_COMMON_LRU map, the perf report shows a
lot of time spent in rotating the inactive list (i.e.
__bpf_lru_list_rotate_inactive):
> map_perf_test 32 8 10000 1000000 | awk '{sum += $3}END{print sum}'
19644783 (19M/s)
> map_perf_test 32 8 10000000 10000000 |  awk '{sum += $3}END{print sum}'
6283930 (6.28M/s)

An element on the inactive list usually means the element is not in
cache.  Hence, there is a need to tune the PERCPU_NR_SCANS value.

This patch finds a better number of elements to
scan during each list rotation.  The PERCPU_NR_SCANS (which
is defined the same as PERCPU_FREE_TARGET) decreases
from 16 elements to 4 elements.  This change only
affects the BPF_F_NO_COMMON_LRU map.
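
To illustrate why a smaller scan count reduces the time spent per
rotation, here is a minimal Python sketch of a bounded scan over an
inactive list. It is a toy model and not the kernel code: the function
name rotate_inactive, the (key, referenced) tuple layout, and the use
of deques are all assumptions made for illustration. The only idea
taken from the text above is that each rotation examines at most a
fixed number of elements (16 before this patch, 4 after).

```python
from collections import deque


def rotate_inactive(inactive, active, nr_scans):
    """Examine at most nr_scans entries from the tail of `inactive`.

    Toy model (hypothetical, not the kernel implementation):
    entries are (key, referenced) tuples. Referenced entries are
    promoted to the head of `active` with the flag cleared;
    unreferenced entries stay where they were.
    Returns the number of entries examined.
    """
    scanned = 0
    kept = []
    while inactive and scanned < nr_scans:
        key, referenced = inactive.pop()
        scanned += 1
        if referenced:
            active.appendleft((key, False))
        else:
            kept.append((key, referenced))
    # Restore the unreferenced entries at the tail in their original order.
    for item in reversed(kept):
        inactive.append(item)
    return scanned


if __name__ == "__main__":
    for nr_scans in (16, 4):
        inactive = deque((k, k % 2 == 0) for k in range(1000))
        active = deque()
        work = rotate_inactive(inactive, active, nr_scans)
        print(f"nr_scans={nr_scans}: examined {work}, promoted {len(active)}")
```

In this model the cost of one rotation is bounded by nr_scans regardless
of the list length, so lowering the bound from 16 to 4 cuts the
worst-case work per rotation by a factor of four.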

The test_lru_dist shows no meaningful difference
between 16 and 4.  Our production L4 load balancer, which uses
the LRU map for conntrack-ing, also shows little change in cache
hit rate.  Since both benchmark and production data show no
cache-hit difference, PERCPU_NR_SCANS is lowered from 16 to 4.
We can consider making it configurable if we later find a use case
where another value works better, and/or consider using
a different rotation strategy.

After this change:
> map_perf_test 32 8 10000000 10000000 |  awk '{sum += $3}END{print sum}'
9240324 (9.2M/s)

i.e. 6.28M/s -> 9.2M/s
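
The relative gain implied by these figures can be checked with a short
calculation. The sketch below only uses the three totals quoted in this
message; the helper name relative_gain is an assumption introduced for
illustration. It works out to roughly a 47% throughput increase for the
large map.

```python
def relative_gain(before: int, after: int) -> float:
    """Return the fractional change from `before` to `after`."""
    return (after - before) / before


# Totals quoted in the commit message (summed events per second).
small_map = 19_644_783        # 10000-entry map
big_map_before = 6_283_930    # 10000000-entry map, PERCPU_NR_SCANS = 16
big_map_after = 9_240_324     # 10000000-entry map, PERCPU_NR_SCANS = 4

gain = relative_gain(big_map_before, big_map_after)
print(f"large-map gain: {gain:.1%}")
print(f"large map vs small map, before: {big_map_before / small_map:.1%}")
print(f"large map vs small map, after:  {big_map_after / small_map:.1%}")
```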

The test_lru_dist shows no meaningful difference:
> test_lru_dist zipf.100k.a1_01.out 4000 1
nr_misses: 31575 (Before) vs 31566 (After)

> test_lru_dist zipf.100k.a0_01.out 40000 1
nr_misses: 67036 (Before) vs 67031 (After)

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:55:52 -04:00
accounting tools: move accounting tool from Documentation 2016-09-23 13:07:15 -06:00
arch Merge branch 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-20 14:37:08 -08:00
build perf build: Add special fixdep cleaning rule 2017-02-17 16:04:38 -03:00
cgroup
firewire
gpio gpio-hammer: fix make consumer_label suitable to work on gpio-nails 2017-01-26 16:29:09 +01:00
hv tools: hv: Add clean up function for Ubuntu config 2017-03-06 17:10:40 -08:00
iio iio: Add channel for Gravity 2017-01-05 13:02:25 +00:00
include bpf: fix comment typo 2017-04-09 18:26:08 -07:00
kvm/kvm_stat
laptop tools: move laptops dslm tool from Documentation 2016-09-23 13:07:21 -06:00
leds tools/leds: Add led_hw_brightness_mon program 2017-02-14 22:20:23 +01:00
lguest scripts/spelling.txt: add "overide" pattern and fix typo instances 2017-03-09 17:01:09 -08:00
lib tools/lib/bpf: expose bpf_program__set_type() 2017-04-01 12:45:57 -07:00
net tools: bpf_jit_disasm: Add option to dump JIT image to a file. 2017-04-13 13:04:03 -04:00
nfsd
objtool objtool: Fix another GCC jump table detection issue 2017-03-07 08:42:55 +01:00
pcmcia tools: move pcmcia crc32hash tool from Documentation 2016-09-23 13:07:27 -06:00
perf perf/urgent annotate fix for s390: 2017-04-11 21:41:39 +02:00
power cpupower: Fix turbo frequency reporting for pre-Sandy Bridge cores 2017-04-13 14:51:10 +02:00
scripts tools: Suppress request for warning options not existent in clang 2017-02-14 10:34:35 -03:00
spi spi: spidev_test: Fix input file check when transferring file 2016-11-04 09:56:09 -06:00
testing bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
thermal/tmon
time
usb USB: changes for v4.11 2017-01-26 15:36:28 +01:00
virtio tools/virtio/ringtest: tweaks for s390 2017-01-19 23:46:32 +02:00
vm tools/vm: add missing Makefile rules 2017-02-22 16:41:26 -08:00
Makefile tools/leds: Add uledmon program for monitoring userspace LEDs 2016-11-22 12:07:02 +01:00