linux/tools/perf/util/bpf_skel
Namhyung Kim b5711042a1 perf lock contention: Use per-cpu array map for spinlocks
Currently the lock contention timestamp is maintained in a hash map keyed
by pid.  That means it needs to get and release a map element (which is
protected by a spinlock!) on each contention begin and end pair.  This
can hurt performance when there is a lot of contention (usually from
spinlocks).

It used to use task local storage, but that had an issue with memory
allocation in some critical paths.  Although it's addressed in recent
kernels IIUC, the tool should support old kernels too.  So it cannot
simply switch to task local storage, at least for now.

As spinlocks create lots of contention and they disable preemption
while spinning, a per-cpu array can be used to keep the timestamp,
avoiding the overhead of hash map updates and deletes.

In contention_begin, it's easy to check the lock type since the flags
are visible there.  But contention_end cannot see them.  So try the
per-cpu array first (unconditionally): if it has an active element
(lock != 0), it should be used, and the per-task tstamp map should not
be used until the per-cpu array element is cleared, which means the
nested spinlock contention (if any) has finished and it now sees the
(outer) lock.
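The scheme above can be sketched roughly as follows.  This is an
illustrative fragment, not the actual lock_contention.bpf.c code: the
map name tstamp_cpu, the struct layout, the is_spinlock() helper and
the tracepoint argument layout are all simplifying assumptions here.

```c
// Hedged sketch of the per-cpu timestamp idea; identifiers are
// hypothetical, simplified from what the real skeleton does.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct tstamp_data {
	__u64 timestamp;
	__u64 lock;	/* nonzero while a spinlock contention is active */
};

/* one slot per CPU: safe for spinlocks since preemption is disabled */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, struct tstamp_data);
} tstamp_cpu SEC(".maps");

SEC("tp_btf/contention_begin")
int contention_begin(u64 *ctx)
{
	__u32 zero = 0;
	struct tstamp_data *pelem;

	/* flags are visible here, so spinlocks can use the per-cpu slot;
	 * is_spinlock() is a hypothetical flag check */
	if (is_spinlock(ctx[1])) {
		pelem = bpf_map_lookup_elem(&tstamp_cpu, &zero);
		if (pelem && pelem->lock == 0) {
			pelem->timestamp = bpf_ktime_get_ns();
			pelem->lock = (__u64)ctx[0];
		}
		return 0;
	}
	/* otherwise fall back to the pid-keyed hash map ... */
	return 0;
}

SEC("tp_btf/contention_end")
int contention_end(u64 *ctx)
{
	__u32 zero = 0;
	struct tstamp_data *pelem;

	/* flags are not available here: try the per-cpu slot first */
	pelem = bpf_map_lookup_elem(&tstamp_cpu, &zero);
	if (pelem && pelem->lock) {
		if (pelem->lock != (__u64)ctx[0])
			return 0;	/* nested contention still pending */
		/* account bpf_ktime_get_ns() - pelem->timestamp ... */
		pelem->lock = 0;	/* clear: slot free again */
		return 0;
	}
	/* no active per-cpu entry: look in the per-task tstamp map */
	return 0;
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";
```

The key point is the lock != 0 check in contention_end: while a per-cpu
element is active, a matching end event is attributed to it, and only
once it is cleared does the pid-keyed map come back into play.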

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20231020204741.1869520-3-namhyung@kernel.org
2023-10-25 10:02:55 -07:00
vmlinux perf tools: Do not ignore the default vmlinux.h 2023-10-18 15:35:20 -07:00
.gitignore perf build: Add ability to build with a generated vmlinux.h 2023-06-23 21:35:45 -07:00
augmented_raw_syscalls.bpf.c perf trace: Use the right bpf_probe_read(_str) variant for reading user data 2023-10-19 22:41:46 -07:00
bench_uprobe.bpf.c perf bench uprobe trace_printk: Add entry attaching a BPF program that does a trace_printk 2023-07-20 11:33:24 -03:00
bperf_cgroup.bpf.c perf stat: Support old kernels for bperf cgroup counting 2022-10-14 10:29:05 -03:00
bperf_follower.bpf.c perf bpf_skel: Do not use typedef to avoid error on old clang 2021-12-06 21:57:53 -03:00
bperf_leader.bpf.c perf bpf_skel: Do not use typedef to avoid error on old clang 2021-12-06 21:57:53 -03:00
bperf_u.h perf stat: Introduce 'bperf' to share hardware PMCs with BPF 2021-03-23 17:46:44 -03:00
bpf_prog_profiler.bpf.c perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros 2021-12-06 21:57:53 -03:00
func_latency.bpf.c perf ftrace latency: Add -n/--use-nsec option 2022-03-22 17:43:46 -03:00
kwork_top.bpf.c perf kwork top: Add BPF-based statistics on softirq event support 2023-09-12 17:31:59 -03:00
kwork_trace.bpf.c perf kwork: Add workqueue trace BPF support 2022-07-26 16:31:54 -03:00
lock_contention.bpf.c perf lock contention: Use per-cpu array map for spinlocks 2023-10-25 10:02:55 -07:00
lock_data.h perf lock contention: Add --lock-cgroup option 2023-09-12 17:32:00 -03:00
off_cpu.bpf.c perf test: Fix offcpu test prev_state check 2023-02-19 07:58:23 -03:00
sample_filter.bpf.c perf bpf filter: Fix a broken perf sample data naming for BPF CO-RE 2023-05-26 15:21:08 -03:00
sample-filter.h perf bpf filter: Add logical OR operator 2023-03-15 11:08:36 -03:00