linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-29 07:34:06 +08:00

Author	SHA1	Message	Date
Jiri Olsa	77e0faf855	perf stat: Pass 'evlist' to aggr_update_shadow() Pass a 'evlist' argument to aggr_update_shadow(), to get rid of the global 'evsel_list' variable dependency. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-31-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:24 -03:00
Jiri Olsa	ae2d7da554	perf stat: Pass 'struct perf_stat_config' to first_shadow_cpu() Pass a 'struct perf_stat_config' arg to first_shadow_cpu(), so that the function does not depend on the 'perf stat' command object local 'stat_config' variable and can then be moved out. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-30-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:24 -03:00
Jiri Olsa	ee1760e2cf	perf stat: Move 'metric_only_len' to 'struct perf_stat_config' Move the static 'metric_only_len' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-29-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	d97ae04b3d	perf stat: Move 'run_count' to 'struct perf_stat_config' Move the static 'run_count' variable to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-28-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	0c538a9462	perf stat: Use 'evsel->evlist' instead of 'evsel_list' in collect_all_aliases() Use 'evsel->evlist' instead of 'evsel_list' in collect_all_aliases(), to get rid of the global 'evsel_list' variable dependency. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-27-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	bc0bcda201	perf stat: Pass 'evlist' argument to print functions Add 'evlist' argument to print functions to get rid of the global 'evsel_list' variable dependency. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-26-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	c512e0eae4	perf stat: Add 'target' argument to perf_evlist__print_counters() Add 'struct target' argument to perf_evlist__print_counters(), so the function does not depend on the 'perf stat' command object local target and can be moved out. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-25-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	df4f7b4d4b	perf stat: Move 'unit_width' to 'struct perf_stat_config' Move the static 'unit_width' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-24-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	0ce5aa0266	perf stat: Move 'metric_only' to 'struct perf_stat_config' Move the static 'metric_only' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	132c6ba3c4	perf stat: Move 'interval_clear' to 'struct perf_stat_config' Move the static 'interval_clear' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	fa7070a386	perf stat: Move csv_* to 'struct perf_stat_config' Move the static csv_* variables to 'struct perf_stat_config', so that it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	6ca9a082b1	perf stat: Pass a 'struct perf_stat_config' argument to global print functions Add 'struct perf_stat_config' argument to the global print functions, so that these functions can be used out of the 'perf stat' command code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	f3ca50e61f	perf stat: Pass 'struct perf_stat_config' argument to local print functions Add 'struct perf_stat_config' argument to print functions, so that those functions can be moved out of the 'perf stat' command to a generic class in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-19-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:23 -03:00
Jiri Olsa	b64df7f337	perf stat: Add 'struct perf_stat_config' argument to perf_evlist__print_counters() Add a 'struct perf_stat_config' argument to perf_evlist__print_counters(), so that it can be moved out of the 'perf stat' command to generic object in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-18-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	0174820a8b	perf stat: Move STAT_RECORD out of perf_evlist__print_counters() It's stat related and should stay in the 'perf stat' command. The perf_evlist__print_counters function will be moved out in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-17-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	a5a9eac1a0	perf stat: Introduce perf_evlist__print_counters() To be in charge of printing out the stat output. It will be moved out of the 'perf stat' command in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	0a4e64d391	perf stat: Move perf_stat_synthesize_config() to stat.c So that it can be used globally. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	c2c247f2dd	perf stat: Add 'perf_event__handler_t' argument to perf_stat_synthesize_config() So that it's completely independent and can be used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-14-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	1c21e9899d	perf stat: Add 'struct perf_evlist' argument to perf_stat_synthesize_config() Get rid of the 'evsel_list' global variable dependency, here in perf_stat_synthesize_config() we are adding the 'evlist' arg. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	1821f4eb48	perf stat: Add 'struct perf_tool' argument to perf_stat_synthesize_config() So that we can use the function outside the 'perf stat' command with standard synthesize functions, that take 'struct perf_tool *' argument. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	73d586c391	perf stat: Add 'struct perf_stat_config' argument to perf_stat_synthesize_config() Add a 'struct perf_stat_config' argument to perf_stat_synthesize_config(), so we could synthesize arbitrary config. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	491073a612	perf stat: Rename 'is_pipe' argument to 'attrs' in perf_stat_synthesize_config() The attrs name makes more sense. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	d09cefd2ef	perf stat: Move create_perf_stat_counter() to stat.c Move create_perf_stat_counter() to the 'stat' class, so that we can use it globally. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:22 -03:00
Jiri Olsa	650d622046	perf evsel: Introduce perf_evsel__store_ids() Add perf_evsel__store_ids() from stat's store_counter_ids() code to the evsel class, so that it can be used globally. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	318ec1841a	perf tools: Switch 'session' argument to 'evlist' in perf_event__synthesize_attrs() To be able to pass in other than session's evlist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	7d9ad16afe	perf stat: Add 'identifier' flag to 'struct perf_stat_config' Add 'identifier' flag to 'struct perf_stat_config' to carry the info whether to use PERF_SAMPLE_IDENTIFIER for events. This makes create_perf_stat_counter() independent. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	35386233fc	perf stat: Use local config arg for scale in create_perf_stat_counter() Use the local 'scale' member in the 'struct perf_stat_config' argument instead of the global 'stat_config' variable, to make the function independent. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	5698f26b46	perf stat: Move 'no_inherit' to 'struct perf_stat_config' Move the static 'no_inherit' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	728c0ee0a8	perf stat: Move 'initial_delay' to 'struct perf_stat_config' Move the static 'initial_delay' variable to 'struct perf_stat_config', so it can be passed around and used outside the 'perf stat' command. Add 'struct perf_stat_config' argument to create_perf_stat_counter() and use its 'initial_delay' member instead of the static one. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Jiri Olsa	d50ed0ce82	perf stat: Use evsel->threads in create_perf_stat_counter() Get rid of the evsel_list dependency, here we can use the evsel->threads copy of the struct thread_map. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830063252.23729-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Arnaldo Carvalho de Melo	c4191e55b8	perf trace: Show comm and tid for tracepoint events So that all events have that info, improving reading by having information better aligned, etc. Before: # echo 1 > /proc/sys/vm/drop_caches # perf trace -e block:,ext4:,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c 0.000 ( ): #include <stdio.h> int syscall_enter(openat)(void args) { puts("Hello, world\n"); return 0; } license(GPL); cat/2731 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 0.025 ( ): syscalls:sys_exit_openat:0x3 0.063 ( 0.022 ms): cat/2731 close(fd: 3) = 0 0.110 ( ): cat/2731 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) 0.123 ( ): syscalls:sys_exit_openat:0x3 0.243 ( 0.008 ms): cat/2731 close(fd: 3) = 0 0.485 ( ): cat/2731 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) 0.500 ( ): syscalls:sys_exit_open:0x3 0.531 ( 0.017 ms): cat/2731 close(fd: 3) = 0 0.587 ( ): cat/2731 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c) 0.601 ( ): syscalls:sys_exit_openat:0x3 0.631 ( ): ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0 0.639 ( ): ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10 0.654 ( ): block:block_bio_queue:253,2 R 42213208 + 8 [cat] 0.663 ( ): block:block_bio_remap:8,0 R 58206040 + 8 <- (253,2) 42213208 0.671 ( ): block:block_bio_remap:8,0 R 175570776 + 8 <- (8,6) 58206040 0.678 ( ): block:block_bio_queue:8,0 R 175570776 + 8 [cat] 0.692 ( ): block:block_getrq:8,0 R 175570776 + 8 [cat] 0.700 ( ): block:block_plug:[cat] 0.708 ( ): block:block_rq_insert:8,0 R 4096 () 175570776 + 8 [cat] 0.713 ( ): block:block_unplug:[cat] 1 0.716 ( ): block:block_rq_issue:8,0 R 4096 () 175570776 + 8 [cat] 0.949 ( 0.007 ms): cat/2731 close(fd: 3) = 0 0.969 ( 0.006 ms): cat/2731 close(fd: 1) = 0 0.982 ( 0.006 ms): cat/2731 close(fd: 2) = 0 # After: # echo 1 > /proc/sys/vm/drop_caches # perf trace -e block:,ext4:,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c 0.000 ( ): cat/1380 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)#include <stdio.h> int syscall_enter(openat)(void args) { puts("Hello, world\n"); return 0; } license(GPL); 0.024 ( ): cat/1380 syscalls:sys_exit_openat:0x3 0.063 ( 0.024 ms): cat/1380 close(fd: 3) = 0 0.114 ( ): cat/1380 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) 0.127 ( ): cat/1380 syscalls:sys_exit_openat:0x3 0.247 ( 0.009 ms): cat/1380 close(fd: 3) = 0 0.484 ( ): cat/1380 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) 0.499 ( ): cat/1380 syscalls:sys_exit_open:0x3 0.613 ( 0.010 ms): cat/1380 close(fd: 3) = 0 0.662 ( ): cat/1380 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c) 0.678 ( ): cat/1380 syscalls:sys_exit_openat:0x3 0.712 ( ): cat/1380 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0 0.721 ( ): cat/1380 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10 0.734 ( ): cat/1380 block:block_bio_queue:253,2 R 42213208 + 8 [cat] 0.745 ( ): cat/1380 block:block_bio_remap:8,0 R 58206040 + 8 <- (253,2) 42213208 0.754 ( ): cat/1380 block:block_bio_remap:8,0 R 175570776 + 8 <- (8,6) 58206040 0.761 ( ): cat/1380 block:block_bio_queue:8,0 R 175570776 + 8 [cat] 0.780 ( ): cat/1380 block:block_getrq:8,0 R 175570776 + 8 [cat] 0.791 ( ): cat/1380 block:block_plug:[cat] 0.802 ( ): cat/1380 block:block_rq_insert:8,0 R 4096 () 175570776 + 8 [cat] 0.806 ( ): cat/1380 block:block_unplug:[cat] 1 0.810 ( ): cat/1380 block:block_rq_issue:8,0 R 4096 () 175570776 + 8 [cat] 1.005 ( 0.011 ms): cat/1380 close(fd: 3) = 0 1.031 ( 0.008 ms): cat/1380 close(fd: 1) = 0 1.048 ( 0.008 ms): cat/1380 close(fd: 2) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-us1mwsupxffs4jlm3uqm5dvj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Arnaldo Carvalho de Melo	f5b076dc01	perf trace augmented_syscalls: Hook into syscalls:sys_exit_SYSCALL too Hook the pair enter/exit when using augmented_{filename,sockaddr,etc}_syscall(), this way we'll be able to see what entries are in the ELF sections generated from augmented_syscalls.c and filter them out from the main raw_syscalls:* tracepoints used by 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-cyav42qj5yylolw4attcw99z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Arnaldo Carvalho de Melo	4c8f0a726e	perf trace augmented_syscalls: Rename augmented__syscall__enter to just _syscall As we'll also hook into the syscalls:sys_exit_SYSCALL for which there are enter hooks. This way we'll be able to iterate the ELF file for the eBPF program, find the syscalls that have hooks and filter them out from the general raw_syscalls:sys_{enter,exit} tracepoint for not-yet-augmented (the ones with pointer arguments not yet being attached to the usual syscalls tracepoint payload) and non augmentable syscalls (syscalls without pointer arguments). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-cl1xyghwb1usp500354mv37h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:21 -03:00
Arnaldo Carvalho de Melo	5e2d8a5acc	perf augmented_syscalls: Update the header comments Reflecting the fact that it now augments more than syscalls:sys_enter_SYSCALL tracepoints that have filename strings as args. Also mention how the extra data is handled by the by now modified 'perf trace' beautifiers, that will use special "augmented" beautifiers when extra data is found after the expected syscall enter/exit tracepoints. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ybskanehmdilj5fs7080nz1g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	664b6a95d7	perf bpf: Add syscall_exit() helper So that we can hook to the syscalls:sys_exit_SYSCALL tracepoints in addition to the syscalls:sys_enter_SYSCALL we hook using the syscall_enter() helper. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6qh8aph1jklyvdu7w89c0izc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Tzvetomir Stoyanov (VMware)	266b851cc2	tools lib traceevent, perf tools: Split trace-seq related APIs in a separate header file In order to make libtraceevent into a proper library, all its APIs should be defined in corresponding header files. This patch splits trace-seq related APIs in a separate header file: trace-seq.h Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20180828185038.2dcb2743@gandalf.local.home Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Thomas Richter	766e0618e4	perf report: Create auxiliary trace data files for s390 Create auxiliary trace data log files when invoked with option --itrace=d as in: [root@s35lp76 perf] perf report -i perf.data.aux1 --stdio --itrace=d perf report creates several data files in the current directory named aux.smp.## where ## is a 2 digit hex number with leading zeros representing the CPU number this trace data was recorded from. The file contents is binary and contains the CPU-Measurement Sampling Data Blocks (SDBs). The directory to save the auxiliary trace buffer can be changed using the perf config file and command. Specify section 'auxtrace' keyword 'dumpdir' and assign it a valid directory name. If the directory does not exist or has the wrong file type, the current directory is used. [root@p23lp27 perf]# perf config auxtrace.dumpdir=/tmp [root@p23lp27 perf]# perf config --user -l auxtrace.dumpdir=/tmp [root@p23lp27 perf]# perf report ... [root@p23lp27 perf]# ll /tmp/aux.smp.00 -rw-r--r-- 1 root root 204800 Aug 2 13:48 /tmp/aux.smp.00 [root@p23lp27 perf]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180809045650.89197-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	b043cb524d	perf trace beauty: Reorganize 'struct sockaddr *' beautifier Use an array to multiplex by sockaddr->sa_family, this way adding new families gets a bit easier and tidy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v3s85ra659tc40g1s1xaqoun@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	6ebb686225	perf trace augmented_syscalls: Augment sendto's 'addr' arg Its a 'struct sockaddr' pointer, augment it with the same beautifier as for 'connect' and 'bind', that all receive from userspace that pointer. Doing it in the other direction remains to be done, hooking at the syscalls:sys_exit_{accept4?,recvmsg} tracepoints somehow. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-k2eu68lsphnm2fthc32gq76c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	02ef288420	perf trace augmented_syscalls: Augment bind's 'myaddr' sockaddr arg One more, to reuse the augmented_sockaddr_syscall_enter() macro introduced from the augmentation of connect's sockaddr arg, also to get a subset of the struct arg augmentations done using the manual method, before switching to something automatic, using tracefs's format file or, even better, BTF containing the syscall args structs. # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c 0.000 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: NETLINK }, addrlen: 12) 1.752 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: INET, port: 22, addr: 0.0.0.0 }, addrlen: 16) 1.924 sshd/11479 bind(fd: 4<socket:[170338]>, umyaddr: { .family: INET6, port: 22, addr: :: }, addrlen: 28) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-a2drqpahpmc7uwb3n3gj2plu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	24a6c2cd1d	perf trace augmented_syscalls: Add augmented_sockaddr_syscall_enter() From the one for 'connect', so that we can use it with sendto and others that receive a 'struct sockaddr'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-8bdqv1q0ndcjl1nqns5r5je2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	d5a7e6613b	perf trace augmented_syscalls: Augment connect's 'sockaddr' arg As the first example of augmenting something other than a 'filename', augment the 'struct sockaddr' argument for the 'connect' syscall: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c ssh -6 fedorapeople.org 0.000 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 0.042 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 1.329 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 1.362 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 1.458 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 1.478 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110) 1.683 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET, port: 53, addr: 192.168.43.1 }, addrlen: 16) 4.710 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET6, port: 22, addr: 2610:28:3090:3001:5054:ff:fea7:9474 }, addrlen: 28) root@fedorapeople.org: Permission denied (publickey). # This is still just augmenting the syscalls:sys_enter_connect part, later we'll wire this up to augment the enter+exit combo, like in the tradicional 'perf trace' and 'strace' outputs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-s7l541cbiqb22ifio6z7dpf6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	403f833d15	perf bpf: Add linux/socket.h to the headers accessible to bpf proggies So that we don't have to define sockaddr_storage in the augmented_syscalls.c bpf example when hooking into syscalls needing it, idea is to mimic the system headers. Eventually we probably need to have sys/socket.h, etc. Start by having at least linux/socket.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-yhzarcvsjue8pgpvkjhqgioc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	d35b168c3d	perf bpf: Give precedence to bpf header dir I need to check the need for $KERNEL_INC_OPTIONS when building eBPF restricted C programs, for now just give precedence to $PERF_BPF_INC_OPTIONS so that we can get a linux/socket.h usable in eBPF programs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-5z7qw529sdebrn9y1xxqw9hf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:20 -03:00
Arnaldo Carvalho de Melo	9ab5aadebe	perf trace: Add a etcsnoop.c augmented syscalls eBPF utility We need to put common stuff into a separate header in tools/perf/include/bpf/ for these augmented syscalls, but I couldn't resist adding a etcsnoop.c tool, combining augmented syscalls + filtering, that in the future will be passed from 'perf trace''s command line, to use in building the eBPF program to do that specific filtering at the source, inside the kernel: Running system wide: (hope there isn't any embarassing stuff here... ;-) ) # perf trace -e tools/perf/examples/bpf/etcsnoop.c 0.000 sed/21878 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 1741.473 cat/21883 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 1741.892 cat/21883 openat(dfd: CWD, filename: /etc/passwd) 1748.948 sed/21886 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 1777.136 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1777.738 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.158 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.528 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.595 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.901 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.939 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.966 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1778.992 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.019 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.045 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.071 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.095 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.121 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.148 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.175 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.202 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.229 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.254 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.279 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.309 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.336 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.363 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.388 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.414 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.442 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.470 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.500 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.529 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.557 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.586 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.617 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.648 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.679 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.706 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.739 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.769 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.798 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.823 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.844 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.862 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.880 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.911 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.942 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1779.972 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1780.004 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 1780.035 gvfs-udisks2-v/2302 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 13059.154 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13060.739 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13061.990 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13063.177 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13064.265 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13065.483 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13067.383 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13068.902 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13069.922 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13070.915 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13072.612 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13074.816 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13077.343 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13078.731 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13559.064 DNS Res~er #22/21054 open(filename: /etc/hosts, flags: CLOEXEC) 22419.522 sed/21896 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 24473.313 git/21900 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 24491.988 less/21901 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 24493.793 git/21901 openat(dfd: CWD, filename: /etc/sysless) 24565.772 sed/21924 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 25878.752 git/21928 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 26075.666 git/21928 open(filename: /etc/localtime, flags: CLOEXEC) 26075.565 less/21929 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 26076.060 less/21929 openat(dfd: CWD, filename: /etc/sysless) 26346.395 sed/21932 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 26483.583 sed/21938 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 26954.890 sed/21944 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 27016.165 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime) 27016.414 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime) 27712.313 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) 27712.616 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) 27829.035 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27829.368 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27829.584 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27829.800 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27830.107 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27830.521 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 27961.516 git/21948 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 27987.568 less/21949 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 27988.948 bash/21949 openat(dfd: CWD, filename: /etc/sysless) 28043.536 sed/21972 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 28736.008 sed/21978 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 34882.664 git/21991 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 34882.664 sort/21990 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 34884.441 uniq/21992 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 35593.098 git/21997 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 35638.839 git/21997 openat(dfd: CWD, filename: /etc/gitattributes) 35702.851 sed/22000 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 36076.039 sed/22006 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 37569.049 git/22014 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 37673.712 git/22014 open(filename: /etc/localtime, flags: CLOEXEC) 37781.710 vim/22040 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 37783.667 git/22040 openat(dfd: CWD, filename: /etc/vimrc) 37792.394 git/22040 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 37792.436 git/22040 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 37792.580 git/22040 open(filename: /etc/passwd, flags: CLOEXEC) 43893.625 DNS Res~er #23/21365 open(filename: /etc/hosts, flags: CLOEXEC) 48060.409 nm-dhcp-helper/22044 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48071.745 systemd/1 openat(dfd: CWD, filename: /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service, flags: CLOEXEC\|NOFOLLOW\|NOCTTY) 48082.780 nm-dispatcher/22049 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48111.418 systemd/22049 open(filename: /etc/NetworkManager/dispatcher.d, flags: CLOEXEC\|DIRECTORY\|NONBLOCK) 48111.904 systemd/22049 open(filename: /etc/localtime, flags: CLOEXEC) 48118.357 00-netreport/22052 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48119.668 systemd/22052 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48119.762 systemd/22052 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48119.887 systemd/22052 open(filename: /etc/passwd, flags: CLOEXEC) 48120.025 systemd/22052 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/00-netreport) 48124.144 hostname/22054 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48125.492 systemd/22052 openat(dfd: CWD, filename: /etc/init.d/functions) 48127.253 systemd/22052 openat(dfd: CWD, filename: /etc/profile.d/lang.sh) 48127.388 systemd/22052 openat(dfd: CWD, filename: /etc/locale.conf) 48137.749 cat/22056 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48143.519 04-iscsi/22058 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48144.438 04-iscsi/22058 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48144.478 04-iscsi/22058 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48144.577 04-iscsi/22058 open(filename: /etc/passwd, flags: CLOEXEC) 48144.819 04-iscsi/22058 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/04-iscsi) 48145.620 10-ifcfg-rh-ro/22059 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48146.169 systemd/22059 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48146.207 systemd/22059 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48146.287 systemd/22059 open(filename: /etc/passwd, flags: CLOEXEC) 48146.387 systemd/22059 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/10-ifcfg-rh-routes.sh) 48147.215 11-dhclient/22060 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48147.787 11-dhclient/22060 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48147.813 11-dhclient/22060 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48147.929 11-dhclient/22060 open(filename: /etc/passwd, flags: CLOEXEC) 48148.016 11-dhclient/22060 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/11-dhclient) 48148.906 grep/22063 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48151.165 11-dhclient/22060 openat(dfd: CWD, filename: /etc/sysconfig/network) 48151.560 11-dhclient/22060 open(filename: /etc/dhcp/dhclient.d/, flags: CLOEXEC\|DIRECTORY\|NONBLOCK) 48151.704 11-dhclient/22060 openat(dfd: CWD, filename: /etc/dhcp/dhclient.d/chrony.sh) 48153.593 20-chrony/22065 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48154.695 20-chrony/22065 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48154.756 20-chrony/22065 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48154.914 20-chrony/22065 open(filename: /etc/passwd, flags: CLOEXEC) 48155.067 20-chrony/22065 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/20-chrony) 48156.962 25-polipo/22066 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48157.824 systemd/22066 open(filename: /etc/nsswitch.conf, flags: CLOEXEC) 48157.866 systemd/22066 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 48157.981 systemd/22066 open(filename: /etc/passwd, flags: CLOEXEC) 48158.090 systemd/22066 openat(dfd: CWD, filename: /etc/NetworkManager/dispatcher.d/25-polipo) 48533.616 gsd-housekeepi/2412 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC) 87122.021 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime) 87122.146 gsd-color/1762 openat(dfd: CWD, filename: /etc/localtime) 87825.582 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) 87825.844 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) 87829.524 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 87830.531 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 87831.288 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 87832.011 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 87832.672 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) 87833.276 gnome-shell/2125 openat(dfd: CWD, filename: /etc/localtime) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0o770jvdcy04ee6vhv6v471m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	16cc63593f	perf trace: Augment 'newstat' (aka 'stat') filename ptr This one will need some more work, that 'statbuf' pointer requires a beautifier in 'perf trace'. # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c 0.000 weechat/3596 stat(filename: /etc/localtime, statbuf: 0x7ffd87d11f60) 0.186 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_stat/format) 0.279 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_newstat/for) 0.670 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form) 60.805 DNS Res~er #20/21308 stat(filename: /etc/resolv.conf, statbuf: 0x7ffa733fe4a0) 60.836 DNS Res~er #20/21308 open(filename: /etc/hosts, flags: CLOEXEC) 60.931 perf/29818 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format) 607.070 DNS Res~er #21/29812 stat(filename: /etc/resolv.conf, statbuf: 0x7ffa5e1fe3f0) 607.098 DNS Res~er #21/29812 open(filename: /etc/hosts, flags: CLOEXEC) 999.336 weechat/3596 stat(filename: /etc/localtime, statbuf: 0x7ffd87d11f60) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4lhabe7m4uzo76lnqpyfmnvk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	f6618ce6c0	perf trace: Introduce augmented_filename_syscall_enter() declarator Helping with tons of boilerplate for syscalls that only want to augment a filename. Now supporting one such syscall is just a matter of declaring its arguments struct + using: augmented_filename_syscall_enter(openat); Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ls7ojdseu8fxw7fvj77ejpao@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	9779fc0214	perf trace: Augment inotify_add_watch pathname syscall arg Again, just changing tools/perf/examples/bpf/augmented_syscalls.c, that is starting to have too much boilerplate, some macro will come to the rescue. # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c 0.000 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/cache/app-info/yaml, mask: 16789454) 0.023 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/app-info/xmls, mask: 16789454) 0.028 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/app-info/yaml, mask: 16789454) 0.032 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/share/app-info/yaml, mask: 16789454) 0.039 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/local/share/app-info/xmls, mask: 16789454) 0.045 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /usr/local/share/app-info/yaml, mask: 16789454) 0.049 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /home/acme/.local/share/app-info/yaml, mask: 16789454) 0.056 gmain/2590 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: , mask: 16789454) 0.010 gmain/2245 inotify_add_watch(fd: 7<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) 0.087 perf/20116 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_inotify_add) 0.436 perf/20116 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form) 56.042 gmain/2791 inotify_add_watch(fd: 4<anon_inode:inotify>, pathname: /var/lib/fwupd/remotes.d/lvfs-testing, mask: 16789454) 113.986 gmain/1721 inotify_add_watch(fd: 3<anon_inode:inotify>, pathname: /var/lib/gdm/~, mask: 16789454) 3777.265 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) 3777.550 gsd-color/2408 openat(dfd: CWD, filename: /etc/localtime) ^C[root@jouet perf]# Still not combining raw_syscalls:sys_enter + raw_syscalls:sys_exit, to get it strace-like, but that probably will come very naturally with some more wiring up... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ol83juin2cht9vzquynec5hz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	daa1284af3	perf trace: Augment the 'open' syscall 'filename' arg As described in the previous cset, all we had to do was to touch the augmented_syscalls.c eBPF program, fire up 'perf trace' with that new eBPF script in system wide mode and wait for 'open' syscalls, in addition to 'openat' ones to see that it works: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c 0.000 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs.js, flags: CREAT\|EXCL\|TRUNC\|WRONLY, mode: IRUSR\|IWUSR) 0.065 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs-1.js, flags: CREAT\|EXCL\|TRUNC\|WRONLY, mode: IRUSR\|IWUSR) 0.435 StreamT~s #200/16150 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/fqxhj76d.default/prefs-1.js, flags: CREAT\|TRUNC\|WRONLY, mode: IRUSR\|IWUSR) 1.875 perf/16772 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/form) 1227.260 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat) 1227.397 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat) 7227.619 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat) 7227.661 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat) 10018.079 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat) 10018.514 perf/16772 openat(dfd: CWD, filename: /proc/1237/status) 10018.568 perf/16772 openat(dfd: CWD, filename: /proc/1237/status) 10022.409 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat) 10090.044 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat) 10090.351 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 10090.407 perf/16772 openat(dfd: CWD, filename: /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format) 10091.763 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat) 10091.812 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 10092.807 NetworkManager/1237 openat(dfd: CWD, filename: /proc/2125/stat) 10092.851 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 10094.650 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat) 10094.926 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 10096.010 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat) 10096.057 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 10097.056 NetworkManager/1237 openat(dfd: CWD, filename: /proc/1463/stat) 10097.099 NetworkManager/1237 open(filename: /etc/passwd, flags: CLOEXEC) 13228.345 gnome-shell/1463 openat(dfd: CWD, filename: /proc/self/stat) 13232.734 gnome-shell/2125 openat(dfd: CWD, filename: /proc/self/stat) 15198.956 lighttpd/16748 open(filename: /proc/loadavg, mode: ISGID\|IXOTH) ^C# It even catches 'perf' itself looking at the sys_enter_open and sys_enter_openat tracefs format dictionaries when it first finds them in the trace... :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-upmogc57uatljr6el6u8537l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	75d1e30681	perf trace: Use the augmented filename, expanding syscall enter pointers This is the final touch in showing how a syscall argument beautifier can access the augmented args put in place by the tools/perf/examples/bpf/augmented_syscalls.c eBPF script, right after the regular raw syscall args, i.e. the up to 6 long integer values in the syscall interface. With this we are able to show the 'openat' syscall arg, now with up to 64 bytes, but in time this will be configurable, just like with the 'strace -s strsize' argument, from 'strace''s man page: -s strsize Specify the maximum string size to print (the default is 32). This actually is the maximum string to _collect_ and store in the ring buffer, not just print. Before: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC) 0.017 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC) = 3 0.049 ( ): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC) 0.051 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC) = 3 0.377 ( ): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b) 0.379 ( 0.005 ms): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b) = 3 # After: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): cat/11966 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) 0.006 ( 0.006 ms): cat/11966 openat(dfd: CWD, filename: 0x4bfdcda8, flags: CLOEXEC) = 3 0.034 ( ): cat/11966 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) 0.036 ( 0.008 ms): cat/11966 openat(dfd: CWD, filename: 0x4c1e4ce0, flags: CLOEXEC) = 3 0.375 ( ): cat/11966 openat(dfd: CWD, filename: /etc/passwd) 0.377 ( 0.005 ms): cat/11966 openat(dfd: CWD, filename: 0xe87906b) = 3 # This cset should show all the aspects of establishing a protocol between an eBPF syscall arg augmenter program, tools/perf/examples/bpf/augmented_syscalls.c and a 'perf trace' beautifier, the one associated with all 'char ' point syscall args with names that can heuristically be associated with filenames. Now to wire up 'open' to show a second syscall using this scheme, all we have to do now is to change tools/perf/examples/bpf/augmented_syscalls.c, as 'perf trace' will notice that the perf_sample.raw_size is more than what is expected for a particular syscall payload as defined by its tracefs format file and will then use the augmented payload in the 'filename' syscall arg beautifier. The same protocol will be used for structs such as 'struct sockaddr ', 'struct pollfd', etc, with additions for handling arrays. This will all be done under the hood when 'perf trace' realizes the system has the necessary components, and also can be done by providing a precompiled augmented_syscalls.c eBPF ELF object. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-gj9kqb61wo7m3shtpzercbcr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	c96f4edcc3	perf trace: Show comm/tid for augmented_syscalls To get us a bit more like the sys_enter + sys_exit combo: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) 0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3 0.051 ( ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) 0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3 0.539 ( ): openat(dfd: CWD, filename: 0xca71506b) 0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3 # After: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC) 0.007 ( 0.005 ms): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC) = 3 0.032 ( ): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC) 0.033 ( 0.006 ms): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC) = 3 0.301 ( ): cat/4919 openat(dfd: CWD, filename: 0x91fa306b) 0.304 ( 0.004 ms): cat/4919 openat(dfd: CWD, filename: 0x91fa306b) = 3 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-6w8ytyo5y655a1hsyfpfily6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	6dcbd212ff	perf trace: Extract the comm/tid printing for syscall enter Will be used with augmented syscalls, where we haven't transitioned completely to combining sys_enter_FOO with sys_exit_FOO, so we'll go as far as having it similar to the end result, strace like, as possible. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-canomaoiybkswwnhj69u9ae4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	1cdf618f23	perf trace: Print the syscall name for augmented_syscalls Since we copy all the payload for raw_syscalls:sys_enter plus add expanded pointers, we can use the syscall id to get its name, etc: # grep 'field:.* id' /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format field:long id; offset:8; size:8; signed:1; # Before: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC 0.006 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC) = 3 0.041 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC 0.042 ( 0.007 ms): cat/2395 openat(dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC) = 3 0.376 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xac0a806b 0.379 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xac0a806b) = 3 # After: # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) 0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3 0.051 ( ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) 0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3 0.539 ( ): openat(dfd: CWD, filename: 0xca71506b) 0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-epz6y9i0eavmerc5ha98t7gn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	6ccc18a9a1	perf trace: Make the augmented_syscalls filter out the tracepoint event When we attach a eBPF object to a tracepoint, if we return 1, then that tracepoint will be stored in the perf's ring buffer. In the augmented_syscalls.c case we want to just attach and _override_ the tracepoint payload with an augmented, extended one. In this example, tools/perf/examples/bpf/augmented_syscalls.c, we are attaching to the 'openat' syscall, and adding, after the syscalls:sys_enter_openat usual payload as defined by /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format, a snapshot of its sole pointer arg: # grep 'field:.\' /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format field:const char * filename; offset:24; size:8; signed:0; # For now this is not being considered, the next csets will make use of it, but as this is overriding the syscall tracepoint enter, we don't want that event appearing on the ring buffer, just our synthesized one. Before: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC 0.006 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: , flags: CLOEXEC 0.007 ( 0.004 ms): cat/24044 openat(dfd: CWD, filename: 0x216dda8, flags: CLOEXEC ) = 3 0.028 ( ): __augmented_syscalls__:dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC 0.030 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: , flags: CLOEXEC 0.031 ( 0.006 ms): cat/24044 openat(dfd: CWD, filename: 0x2375ce0, flags: CLOEXEC ) = 3 0.291 ( ): __augmented_syscalls__:dfd: CWD, filename: /etc/passwd 0.293 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0.294 ( 0.004 ms): cat/24044 openat(dfd: CWD, filename: 0x637db06b ) = 3 # After: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x9c6a1da8, flags: CLOEXEC 0.005 ( 0.015 ms): cat/27341 openat(dfd: CWD, filename: 0x9c6a1da8, flags: CLOEXEC ) = 3 0.040 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x9c8a9ce0, flags: CLOEXEC 0.041 ( 0.006 ms): cat/27341 openat(dfd: CWD, filename: 0x9c8a9ce0, flags: CLOEXEC ) = 3 0.294 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x482a706b 0.296 ( 0.067 ms): cat/27341 openat(dfd: CWD, filename: 0x482a706b ) = 3 # Now lets replace that __augmented_syscalls__ name with the syscall name, using: # grep 'field:.*syscall_nr' /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format field:int __syscall_nr; offset:8; size:4; signed:1; # That the synthesized payload has exactly where the syscall enter tracepoint puts it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-og4r9k87mzp9hv7el046idmd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:19 -03:00
Arnaldo Carvalho de Melo	7a983a0fe2	perf trace: Pass augmented args to the arg formatters when available If the tracepoint payload is bigger than what a syscall expected from what is in its format file in tracefs, then that will be used as augmented args, i.e. the expansion of syscall arg pointers, with things like a filename, structs, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-bsbqx7xi2ot4q9bf570f7tqs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:52:18 -03:00
Kim Phillips	4e67b2a5df	perf annotate: Fix parsing aarch64 branch instructions after objdump update Starting with binutils 2.28, aarch64 objdump adds comments to the disassembly output to show the alternative names of a condition code [1]. It is assumed that commas in objdump comments could occur in other arches now or in the future, so this fix is arch-independent. The fix could have been done with arm64 specific jump__parse and jump__scnprintf functions, but the jump__scnprintf instruction would have to have its comment character be a literal, since the scnprintf functions cannot receive a struct arch easily. This inconvenience also applies to the generic jump__scnprintf, which is why we add a raw_comment pointer to struct ins_operands, so the __parse function assigns it to be re-used by its corresponding __scnprintf function. Example differences in 'perf annotate --stdio2' output on an aarch64 perf.data file: BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b AFTER : ↓ b.cs 18c BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b AFTER : ↓ b.cc 31c The branch target labels 18c and 31c also now appear in the output: BEFORE: add x26, x29, #0x80 AFTER : 18c: add x26, x29, #0x80 BEFORE: add x21, x21, #0x8 AFTER : 31c: add x21, x21, #0x8 The Fixes: tag below is added so stable branches will get the update; it doesn't necessarily mean that commit was broken at the time, rather it didn't withstand the aarch64 objdump update. Tested no difference in output for sample x86_64, power arch perf.data files. [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730 Signed-off-by: Kim Phillips <kim.phillips@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anton Blanchard <anton@samba.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: `b13bbeee5e` ("perf annotate: Fix branch instruction with multiple operands") Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:51:54 -03:00
Sandipan Das	fa694160cc	perf probe powerpc: Ignore SyS symbols irrespective of endianness This makes sure that the SyS symbols are ignored for any powerpc system, not just the big endian ones. Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: `fb6d594231` ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc") Link: http://lkml.kernel.org/r/20180828090848.1914-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 15:15:11 -03:00
Chris Phlipot	c9f23d2bc2	perf event-parse: Use fixed size string for comms Some implementations of libc do not support the 'm' width modifier as part of the scanf string format specifier. This can cause the parsing to fail. Since the parser never checks if the scanf parsing was successesful, this can result in a crash. Change the comm string to be allocated as a fixed size instead of dynamically using 'm' scanf width modifier. This can be safely done since comm size is limited to 16 bytes by TASK_COMM_LEN within the kernel. This change prevents perf from crashing when linked against bionic as well as reduces the total number of heap allocations and frees invoked while accomplishing the same task. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180830021950.15563-1-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:51:45 -03:00
Chris Phlipot	a72f642613	perf util: Fix bad memory access in trace info. In the write to the output_fd in the error condition of record_saved_cmdline(), we are writing 8 bytes from a memory location on the stack that contains a primitive that is only 4 bytes in size. Change the primitive to 8 bytes in size to match the size of the write in order to avoid reading unknown memory from the stack. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180829061954.18871-1-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:50:50 -03:00
Arnaldo Carvalho de Melo	dad2762aac	perf tools: Streamline bpf examples and headers installation We were emitting 4 lines, two of them misleading: make: Entering directory '/home/acme/git/perf/tools/perf' <SNIP> INSTALL lib INSTALL include/bpf INSTALL lib INSTALL examples/bpf <SNIP> make: Leaving directory '/home/acme/git/perf/tools/perf' Make it more compact by showing just two lines: make: Entering directory '/home/acme/git/perf/tools/perf' INSTALL bpf-headers INSTALL bpf-examples make: Leaving directory '/home/acme/git/perf/tools/perf' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0nvkyciqdkrgy829lony5925@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:49:25 -03:00
Hisao Tanabe	fd8d270279	perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx() If evsel is NULL, we should return NULL to avoid a NULL pointer dereference a bit later in the code. Signed-off-by: Hisao Tanabe <xtanabe@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: `03e0a7df3e` ("perf tools: Introduce bpf-output event") LPU-Reference: 20180824154556.23428-1-xtanabe@gmail.com Link: https://lkml.kernel.org/n/tip-e5plzjhx6595a5yjaf22jss3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:49:25 -03:00
Kim Phillips	5ab1de932e	perf arm64: Fix include path for asm-generic/unistd.h The new syscall table support for arm64 mistakenly used the system's asm-generic/unistd.h file when processing the tools/arch/arm64/include/uapi/asm/unistd.h file's include directive: #include <asm-generic/unistd.h> See "Committer notes" section of commit `2b58824356` "perf arm64: Generate system call table from asm/unistd.h" for more details. This patch removes the committer's temporary workaround, and instructs the host compiler to search the build tree's include path for the right copy of the unistd.h file, instead of the one on the system's /usr/include path. It thus fixes the committer's test that cross-builds an arm64 perf on an x86 platform running Ubuntu 14.04.5 LTS with an old toolchain: $ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc `pwd`/tools tools/arch/arm64/include/uapi/asm/unistd.h \| grep bpf [280] = "bpf", Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Fixes: `2b58824356` ("perf arm64: Generate system call table from asm/unistd.h") Link: http://lkml.kernel.org/r/20180806172800.bbcec3cfcc51e2facc978bf2@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:49:24 -03:00
Jiri Olsa	9b3579fc6c	perf tests: Add breakpoint modify tests Adding to tests that aims on kernel breakpoint modification bugs. First test creates HW breakpoint, tries to change it and checks it was properly changed. It aims on kernel issue that prevents HW breakpoint to be changed via ptrace interface. The first test forks, the child sets itself as ptrace tracee and waits in signal for parent to trace it, then it calls bp_1 and quits. The parent does following steps: - creates a new breakpoint (id 0) for bp_2 function - changes that breakpoint to bp_1 function - waits for the breakpoint to hit and checks it has proper rip of bp_1 function This test aims on an issue in kernel preventing to change disabled breakpoints Second test mimics the first one except for few steps in the parent: - creates a new breakpoint (id 0) for bp_1 function - changes that breakpoint to bogus (-1) address - waits for the breakpoint to hit and checks it has proper rip of bp_1 function This test aims on an issue in kernel disabling enabled breakpoint after unsuccesful change. Committer testing: # uname -a Linux jouet 4.18.0-rc8-00002-g1236568ee3cb #12 SMP Tue Aug 7 14:08:26 -03 2018 x86_64 x86_64 x86_64 GNU/Linux # perf test -v "bp modify" 62: x86 bp modify : --- start --- test child forked, pid 25671 in bp_1 tracee exited prematurely 2 FAILED arch/x86/tests/bp-modify.c:209 modify test 1 failed test child finished with -1 ---- end ---- x86 bp modify: FAILED! # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180827091228.2878-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:49:22 -03:00
Martin Liška	1dc27f6330	perf annotate: Properly interpret indirect call The patch changes the parsing of: callq 0x8(%rbx) from: 0.26 │ → callq 8 to: 0.26 │ → callq 0x8(%rbx) in this case an address is followed by a register, thus one can't parse only the address. Committer testing: 1) run 'perf record sleep 10' 2) before applying the patch, run: perf annotate --stdio2 > /tmp/before 3) after applying the patch, run: perf annotate --stdio2 > /tmp/after 4) diff /tmp/before /tmp/after: --- /tmp/before 2018-08-28 11:16:03.238384143 -0300 +++ /tmp/after 2018-08-28 11:15:39.335341042 -0300 @@ -13274,7 +13274,7 @@ ↓ jle 128 hash_value = hash_table->hash_func (key); mov 0x8(%rsp),%rdi - 0.91 → callq 30 + 0.91 → callq 0x30(%r12) mov $0x2,%r8d cmp $0x2,%eax node_hash = hash_table->hashes[node_index]; @@ -13848,7 +13848,7 @@ mov %r14,%rdi sub %rbx,%r13 mov %r13,%rdx - → callq 38 + → callq 0x38(%r15) cmp %rax,%r13 1.91 ↓ je 240 1b4: mov $0xffffffff,%r13d @@ -14026,7 +14026,7 @@ mov %rcx,-0x500(%rbp) mov %r15,%rsi mov %r14,%rdi - → callq 38 + → callq *0x38(%rax) mov -0x500(%rbp),%rcx cmp %rax,%rcx ↓ jne 9b0 <SNIP tons of other such cases> Signed-off-by: Martin Liška <mliska@suse.cz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kim Phillips <kim.phillips@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/bd1f3932-be2b-85f9-7582-111ee0a43b07@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-30 14:49:22 -03:00
Jiri Olsa	721f0dfc3c	perf python: Fix pyrf_evlist__read_on_cpu() interface Jaroslav reported errors from valgrind over perf python script: # echo 0 > /sys/devices/system/cpu/cpu4/online # valgrind ./test.py ==7524== Memcheck, a memory error detector ... ==7524== Command: ./test.py ==7524== pid 7526 exited ==7524== Invalid read of size 8 ==7524== at 0xCC2C2B3: perf_mmap__read_forward (evlist.c:780) ==7524== by 0xCC2A681: pyrf_evlist__read_on_cpu (python.c:959) ... ==7524== Address 0x65c4868 is 16 bytes after a block of size 459,36.. ==7524== at 0x4C2B955: calloc (vg_replace_malloc.c:711) ==7524== by 0xCC2F484: zalloc (util.h:35) ==7524== by 0xCC2F484: perf_evlist__alloc_mmap (evlist.c:978) ... The reason for this is in the python interface, that allows a script to pass arbitrary cpu number, which is then used to access struct perf_evlist::mmap array. That's obviously wrong and works only when if all cpus are available and fails if some cpu is missing, like in the example above. This patch makes pyrf_evlist__read_on_cpu() search the evlist's maps array for the proper map to access. It's linear search at the moment. Based on the way how is the read_on_cpu used, I don't think we need to be fast in here. But we could add some hash in the middle to make it fast/er. We don't allow python interface to set write_backward event attribute, so it's safe to check only evlist's mmaps. Reported-by: Jaroslav Škarvada <jskarvad@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817114556.28000-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	31fb4c0d7b	perf mmap: Store real cpu number in 'struct perf_mmap' Store the real cpu number in 'struct perf_mmap', which will be used by python interface that allows user to read a particular memory map for given cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jaroslav Škarvada <jskarvad@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817114556.28000-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	b946cd3734	perf tools: Remove ext from struct kmod_path Having comp carrying the compression ID, we no longer need return the extension. Removing it and updating the automated test. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-14-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	88c74dc76a	perf tools: Add gzip_is_compressed function Add implementation of the is_compressed callback for gzip. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	4b57fd44b6	perf tools: Add lzma_is_compressed function Add implementation of the is_compressed callback for lzma. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	8b42b7e5e8	perf tools: Add is_compressed callback to compressions array Add is_compressed callback to the compressions array, that returns 0 if the file is compressed or != 0 if not. The new callback is used to recognize the situation when we have a 'compressed' object, like: /lib/modules/.../drivers/net/ethernet/intel/igb/igb.ko.xz but we need to read its debug data from debuginfo files, which might not be compressed, like: /root/.debug/.build-id/d6/...c4b301f/debug So even for a 'compressed' object we read debug data from a plain uncompressed object. To keep this transparent, we detect this in decompress_kmodule() and return the file descriptor to the uncompressed file. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	c9a8a6131f	perf tools: Move the temp file processing into decompress_kmodule We will add a compression check in the following patch and it makes it easier if the file processing is done in a single place. It also makes the current code simpler. The decompress_kmodule function now returns the fd of the uncompressed file and the file name in the pathname arg, if it's provided. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	dde755a90e	perf tools: Use compression id in decompress_kmodule() Once we parsed out the compression ID, we dont need to iterate all available compressions and we can call it directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	2af5247530	perf tools: Store compression id into struct dso Add comp to 'struct dso' to hold the compression index. It will be used in the following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	4b838b0db4	perf tools: Add compression id into 'struct kmod_path' Store a decompression ID in 'struct kmod_path', so it can be later stored in 'struct dso'. Switch 'struct kmod_path's 'comp' from 'bool' to 'int' to return the compressions array index. Add 0 index item into compressions array, so that the comp usage stays as it was: 0 - no compression, != 0 compression index. Update the kmod_path tests. Committer notes: Use a designated initializer + terminating comma, e.g. { .fmt = NULL, }, to fix the build in several distros: centos:6: util/dso.c:201: error: missing initializer centos:6: util/dso.c:201: error: (near initialization for 'compressions[0].decompress') debian:9: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] fedora:25: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] fedora:26: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] fedora:27: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] oraclelinux:6: util/dso.c:201: error: missing initializer oraclelinux:6: util/dso.c:201: error: (near initialization for 'compressions[0].decompress') ubuntu:12.04.5: util/dso.c:201:2: error: missing initializer [-Werror=missing-field-initializers] ubuntu:12.04.5: util/dso.c:201:2: error: (near initialization for 'compressions[0].decompress') [-Werror=missing-field-initializers] ubuntu:16.04: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] ubuntu:16.10: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] ubuntu:16.10: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] ubuntu:17.10: util/dso.c:201:24: error: missing field 'decompress' initializer [-Werror,-Wmissing-field-initializers] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	e1e139463d	perf tools: Make is_supported_compression() static There's no outside user of it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	85e1d419e7	perf tools: Make decompress_to_file() function static There's no outside user of it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	d68a29c282	perf tools: Get rid of dso__needs_decompress() call in __open_dso() There's no need to call dso__needs_decompress() twice in the function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	2354ae9bdc	perf tools: Get rid of dso__needs_decompress() call in symbol__disassemble() There's no need to call dso__needs_decompress() twice in the function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Jiri Olsa	bcd4287ead	perf tools: Get rid of dso__needs_decompress() call in read_object_code() There's no need to call dso__needs_decompress() twice in the function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180817094813.15086-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:59 -03:00
Arnaldo Carvalho de Melo	cb76371441	perf llvm: Allow passing options to llc in addition to clang The newly added 'llvm.opts' variable allows passing options directly to llc, like needed to get sane DWARF in BPF ELF debug sections: With: [root@seventh perf]# cat ~/.perfconfig [llvm] dump-obj = true clang-opt = -g [root@seventh perf]# We get: [root@seventh perf]# perf trace -e tools/perf/examples/bpf/hello.c cat /etc/passwd > /dev/null LLVM: dumping tools/perf/examples/bpf/hello.o 0.000 __bpf_stdout__:Hello, world 0.015 __bpf_stdout__:Hello, world 0.187 __bpf_stdout__:Hello, world [root@seventh perf]# pahole tools/perf/examples/bpf/hello.o struct clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c) { clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); /* 0 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 4 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 8 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 12 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 16 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 20 4 / clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566e.org clang version 8.0.0 (http://llvm.org/git/clang.git 8587270a739ee30c926a76d5657e65e85b560f6e) (http://llvm.org/git/llvm.git 0566eefef9c3777bd780ec4cbb9efa764633b76c); / 24 4 / / size: 28, cachelines: 1, members: 7 / / last cacheline: 28 bytes / }; [root@seventh perf]# Adding these options to be passed to llvm's llc: [root@seventh perf]# cat ~/.perfconfig [llvm] dump-obj = true clang-opt = -g opts = -mattr=dwarfris [root@seventh perf]# We get sane output: [root@seventh perf]# perf trace -e tools/perf/examples/bpf/hello.c cat /etc/passwd > /dev/null LLVM: dumping tools/perf/examples/bpf/hello.o 0.000 __bpf_stdout__:Hello, world 0.015 __bpf_stdout__:Hello, world 0.185 __bpf_stdout__:Hello, world [root@seventh perf]# pahole tools/perf/examples/bpf/hello.o struct bpf_map { unsigned int type; / 0 4 / unsigned int key_size; / 4 4 / unsigned int value_size; / 8 4 / unsigned int max_entries; / 12 4 / unsigned int map_flags; / 16 4 / unsigned int inner_map_idx; / 20 4 / unsigned int numa_node; / 24 4 / / size: 28, cachelines: 1, members: 7 / / last cacheline: 28 bytes */ }; [root@seventh perf]# Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com>, Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-0lrwmrip4dru1651rm8xa7tq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:58 -03:00
Jack Henschel	49836f7811	perf parser: Improve error message for PMU address filters This is the second version of a patch that improves the error message of the perf events parser when the PMU hardware does not support address filters. Previously, the perf returned the following error: $ perf record -e intel_pt// --filter 'filter sys_write' --filter option should follow a -e tracepoint or HW tracer option This implies there is some syntax error present in the command line, which is not true. Rather, notify the user that the CPU does not have support for this feature. For example, Intel chips based on the Broadwell micro-archticture have the Intel PT PMU, but do not support address filtering. Now, perf prints the following error message: $ perf record -e intel_pt// --filter 'filter sys_write' This CPU does not support address filtering Signed-off-by: Jack Henschel <jackdev@mailbox.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180704121345.19025-1-jackdev@mailbox.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:58 -03:00
Rasmus Villemoes	da15fc2fa9	perf tools: Disable parallelism for 'make clean' The Yocto build system does a 'make clean' when rebuilding due to changed dependencies, and that consistently fails for me (causing the whole BSP build to fail) with errors such as \| find: '[...]/perf/1.0-r9/perf-1.0/plugin_mac80211.so': No such file or directory \| find: '[...]/perf/1.0-r9/perf-1.0/plugin_mac80211.so': No such file or directory \| find: find: '[...]/perf/1.0-r9/perf-1.0/libtraceevent.a''[...]/perf/1.0-r9/perf-1.0/libtraceevent.a': No such file or directory: No such file or directory \| [...] \| find: cannot delete '/mnt/xfs/devel/pil/yocto/tmp-glibc/work/wandboard-oe-linux-gnueabi/perf/1.0-r9/perf-1.0/util/.pstack.o.cmd': No such file or directory Apparently (despite the comment), 'make clean' ends up launching multiple sub-makes that all want to remove the same things - perhaps this only happens in combination with a O=... parameter. In any case, we don't lose much by explicitly disabling the parallelism for the clean target, and it makes automated builds much more reliable. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180705131527.19749-1-linux@rasmusvillemoes.dk Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-20 08:54:58 -03:00
Adrian Hunter	99cbbe56eb	perf auxtrace: Fix queue resize When the number of queues grows beyond 32, the array of queues is resized but not all members were being copied. Fix by also copying 'tid', 'cpu' and 'set'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `e502789302` ("perf auxtrace: Add helpers for queuing AUX area tracing data") Link: http://lkml.kernel.org/r/20180814084608.6563-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-14 19:00:53 -03:00
Arnaldo Carvalho de Melo	5508672d7f	perf python: Remove -mcet and -fcf-protection when building with clang These options are not present in older clang versions, so when we build for a distro that has a gcc new enough to have these options and that the distro python build config settings use them but clang doesn't support, b00m. This is the case with fedora 28 and rawhide, so check if clang has the options and remove the missing ones from CFLAGS. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-7asds7yn6gzg6ns1lw17ukul@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-14 18:50:20 -03:00
Kim Phillips	3443533665	perf arm spe: Fix uninitialized record error variable The auxtrace init variable 'err' was not being initialized, leading perf to abort early in an SPE record command when there was no explicit error, rather only based whatever memory contents were on the stack. Initialize it explicitly on getting an SPE successfully, the same way cs-etm does. Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dongjiu Geng <gengdongjiu@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `ffd3d18c20` ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support") Link: http://lkml.kernel.org/r/20180810174512.52900813e57cbccf18ce99a2@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-14 15:10:44 -03:00
Jiri Olsa	c9b51a0170	perf tools: Move syscall_64.tbl check into check-headers.sh Probably leftover from the time we introducd the check-headers.sh script. Committer testing: Remove the 'rseq' syscall from tools/perf/arch/x86/entry/syscalls/syscall_64.tbl to fake a diff: make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j4' parallel build Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl CC /tmp/build/perf/util/syscalltbl.o INSTALL trace_plugins <SNIP> $ diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl --- tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 2018-08-13 15:49:50.896585176 -0300 +++ arch/x86/entry/syscalls/syscall_64.tbl 2018-07-20 12:04:04.536858304 -0300 @@ -342,6 +342,7 @@ 331 common pkey_free __x64_sys_pkey_free 332 common statx __x64_sys_statx 333 common io_pgetevents __x64_sys_io_pgetevents +334 common rseq __x64_sys_rseq # # x32-specific system call numbers start at 512 to avoid cache impact $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Kapshuk <alexander.kapshuk@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180813111504.3568-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-14 15:10:40 -03:00
Jiri Olsa	7ea6e983b2	perf tools: Make check-headers.sh check based on kernel dir Changing the logic to compare files with paths relative to kernel source base dir. This way we can keep the output message for 2 unrelated files, which is coming in following patch. Committer testing: Remove a line from tools/arch/x86/lib/memcpy_64.S to have it detected: make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j4' parallel build Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S INSTALL GTK UI INSTALL binaries Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Kapshuk <alexander.kapshuk@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180813111504.3568-1-jolsa@kernel.org Link: http://lkml.kernel.org/r/20180814072726.GA13931@krava [ Do not use pushd/popd, its a bashism, reported by Michael Ellerman, fixed by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-14 15:08:33 -03:00
Alexander Kapshuk	51d8aac236	perf tools: Fix check-headers.sh AND list path of execution The '\|\|' path of execution in the 'test' block of the check_2() function may also be taken if file2 does not exist, in which case the warning message about the ABI headers being different would still be printed where it should not be. See below. % file1=file1; file2=file2 % cmd="echo diff $file1 $file2" % test -f $file2 && \ eval $cmd \|\| echo "Warning: Kernel ABI header at 'tools/$file1' differs from latest version at '$file2'" >&2 Warning: Kernel ABI header at 'tools/file1' differs from latest version at 'file2' The proposed patch converts the code following the '&&' operator into a compound list to be executed in the current process environment only if file2 does exist. Should the files being compared differ, a diff command to compare the files concerned is printed on standard output. E.g. $ diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Committer testing: Remove a line from that tools/arch/x86/lib/memcpy_64.S file to test this: BUILD: Doing 'make -j4' parallel build Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o Signed-off-by: Alexander Kapshuk <alexander.kapshuk@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180811083915.17471-1-alexander.kapshuk@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:46:19 -03:00
Benno Evers	3f4417d693	perf tools: Check for null when copying nsinfo. The argument to nsinfo__copy() was assumed to be valid, but some code paths exist that will lead to NULL being passed. In particular, running 'perf script -D' on a perf.data file containing an PERF_RECORD_MMAP event associating the '[vdso]' dso with pid 0 earlier in the event stream will lead to a segfault. Since all calling code is already checking for a non-null return value, just return NULL for this case as well. Signed-off-by: Benno Evers <bevers@mesosphere.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180810133614.9925-1-bevers@mesosphere.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:39:09 -03:00
Tzvetomir Stoyanov (VMware)	6fed932e92	tools lib traceevent, perf tools: Rename 'enum pevent_flag' to 'enum tep_flag' In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes pevent_get_page_size API and enum pevent_flag to enum tep_flag Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180701.623942406@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:18 -03:00
Tzvetomir Stoyanov (VMware)	fc9b69710e	tools lib traceevent, perf tools: Rename traceevent_* APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "traceevent_". This changes APIs: traceevent_host_bigendian, traceevent_load_plugins and traceevent_unload_plugins Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180701.484691639@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:16 -03:00
Tzvetomir Stoyanov (VMware)	ece2a4f483	tools lib traceevent, perf tools: Rename pevent_set_* APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_set_file_bigendian, pevent_set_flag, pevent_set_function_resolver, pevent_set_host_bigendian, pevent_set_long_size, pevent_set_page_size and pevent_get_long_size Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180701.256265951@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:10 -03:00
Tzvetomir Stoyanov (VMware)	13a418904e	tools lib traceevent, perf tools: Rename pevent_register_* APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_register_comm, pevent_register_print_string Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.948980691@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:08 -03:00
Tzvetomir Stoyanov (VMware)	59c1baee25	tools lib traceevent, perf tools: Rename pevent_read_number_* APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_read_number, pevent_read_number_field Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.804271434@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:05 -03:00
Tzvetomir Stoyanov (VMware)	6a48dc298e	tools lib traceevent, perf tools: Rename pevent print APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_print_field, pevent_print_fields, pevent_print_funcs, pevent_print_printk Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.654453763@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:22:01 -03:00
Tzvetomir Stoyanov (VMware)	c60167c187	tools lib traceevent, perf tools: Rename pevent parse APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_parse_event, pevent_parse_format, pevent_parse_header_page Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.469749700@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:21:57 -03:00
Tzvetomir Stoyanov (VMware)	af85cd1952	tools lib traceevent, perf tools: Rename pevent find APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_find_any_field, pevent_find_common_field, pevent_find_event, pevent_find_field Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.316995920@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:21:51 -03:00
Tzvetomir Stoyanov (VMware)	4d5c58b15c	tools lib traceevent, perf tools: Rename pevent alloc / free APIs In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_alloc, pevent_free, pevent_event_info and pevent_func_resolver_t Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180700.152609945@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:21:43 -03:00
Tzvetomir Stoyanov (VMware)	cbc49b25b9	tools lib traceevent, perf tools: Rename 'struct pevent_record' to 'struct tep_record' In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes the 'struct pevent_record' to 'struct tep_record'. Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180659.866021298@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-13 15:21:13 -03:00
Tzvetomir Stoyanov (VMware)	096177a8b5	tools lib traceevent, perf tools: Rename struct pevent to struct tep_handle In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes the struct pevent to struct tep_handle. Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lkml.kernel.org/r/20180808180659.706175783@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-10 15:29:35 -03:00
Sandipan Das	354b064b8e	perf probe powerpc: Fix trace event post-processing In some cases, a symbol may have multiple aliases. Attempting to add an entry probe for such symbols results in a probe being added at an incorrect location while it fails altogether for return probes. This is only applicable for binaries with debug information. During the arch-dependent post-processing, the offset from the start of the symbol at which the probe is to be attached is determined and added to the start address of the symbol to get the probe's location. In case there are multiple aliases, this offset gets added multiple times for each alias of the symbol and we end up with an incorrect probe location. This can be verified on a powerpc64le system as shown below. $ nm /lib/modules/$(uname -r)/build/vmlinux \| grep "sys_open$" ... c000000000414290 T __se_sys_open c000000000414290 T sys_open $ objdump -d /lib/modules/$(uname -r)/build/vmlinux \| grep -A 10 "<__se_sys_open>:" c000000000414290 <__se_sys_open>: c000000000414290: 19 01 4c 3c addis r2,r12,281 c000000000414294: 70 c4 42 38 addi r2,r2,-15248 c000000000414298: a6 02 08 7c mflr r0 c00000000041429c: e8 ff a1 fb std r29,-24(r1) c0000000004142a0: f0 ff c1 fb std r30,-16(r1) c0000000004142a4: f8 ff e1 fb std r31,-8(r1) c0000000004142a8: 10 00 01 f8 std r0,16(r1) c0000000004142ac: c1 ff 21 f8 stdu r1,-64(r1) c0000000004142b0: 78 23 9f 7c mr r31,r4 c0000000004142b4: 78 1b 7e 7c mr r30,r3 For both the entry probe and the return probe, the probe location should be _text+4276888 (0xc000000000414298). Since another alias exists for 'sys_open', the post-processing code will end up adding the offset (8 for powerpc64le) twice and perf will attempt to add the probe at _text+4276896 (0xc0000000004142a0) instead. Before: # perf probe -v -a sys_open probe-definition(0): sys_open symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Writing event: p:probe/sys_open _text+4276896 Added new event: probe:sys_open (on sys_open) ... # perf probe -v -a sys_open%return $retval probe-definition(0): sys_open%return symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/README write=0 Opening /sys/kernel/debug/tracing/kprobe_events write=1 Parsing probe_events: p:probe/sys_open _text+4276896 Group:probe Event:sys_open probe:p Writing event: r:probe/sys_open__return _text+4276896 Failed to write event: Invalid argument Error: Failed to add events. Reason: Invalid argument (Code: -22) After: # perf probe -v -a sys_open probe-definition(0): sys_open symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Writing event: p:probe/sys_open _text+4276888 Added new event: probe:sys_open (on sys_open) ... # perf probe -v -a sys_open%return $retval probe-definition(0): sys_open%return symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/README write=0 Opening /sys/kernel/debug/tracing/kprobe_events write=1 Parsing probe_events: p:probe/sys_open _text+4276888 Group:probe Event:sys_open probe:p Writing event: r:probe/sys_open__return _text+4276888 Added new event: probe:sys_open__return (on sys_open%return) ... Reported-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: `99e608b595` ("perf probe ppc64le: Fix probe location when using DWARF") Link: http://lkml.kernel.org/r/20180809161929.35058-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-09 14:40:11 -03:00
Konstantin Khlebnikov	6a9405b56c	perf map: Optimize maps__fixup_overlappings() This function splits and removes overlapping areas. Maps in tree are ordered by start address thus we could find first overlap and stop if next map does not overlap. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/153365189407.435244.7234821822450484712.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:56:00 -03:00
Konstantin Khlebnikov	e5adfc3e7e	perf map: Synthesize maps only for thread group leader Threads share map_groups, all map events are merged into it. Thus we could send mmaps only for thread group leader. Otherwise it took ages to attach and record something from processes with many vmas and threads. Thread group leader could be already dead, but it seems perf cannot handle this case anyway. Testing dummy: #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <pthread.h> #include <unistd.h> void thread(void arg) { pause(); } int main(int argc, char **argv) { int threads = 10000; int vmas = 50000; pthread_t th; for (int i = 0; i < threads; i++) pthread_create(&th, NULL, thread, NULL); for (int i = 0; i < vmas; i++) mmap(NULL, 4096, (i & 1) ? PROT_READ : PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS \| MAP_NORESERVE, -1, 0); sleep(60); return 0; } Comment by Jiri Olsa: We actualy synthesize the group leader (if we found one) for the thread even if it's not present in the thread_map, so the process maps are always in data. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/153363294102.396323.6277944760215058174.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:59 -03:00
Arnaldo Carvalho de Melo	88cf7084f9	perf trace: Wire up the augmented syscalls with the syscalls:sys_enter_FOO beautifier We just check that the evsel is the one we associated with the bpf-output event associated with the "__augmented_syscalls__" eBPF map, to show that the formatting is done properly: # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC 0.006 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC 0.007 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC ) = 3 0.029 ( ): __augmented_syscalls__:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC 0.030 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC 0.031 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC ) = 3 0.249 ( ): __augmented_syscalls__:dfd: CWD, filename: 0xc3700d6 0.250 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xc3700d6 0.252 ( 0.003 ms): cat/11486 openat(dfd: CWD, filename: 0xc3700d6 ) = 3 # Now we just need to get the full blown enter/exit handlers to check if the evsel being processed is the augmented_syscalls one to go pick the pointer payloads from the end of the payload. We also need to state somehow what is the layout for multi pointer arg syscalls. Also handy would be to have a BTF file with the struct definitions used in syscalls, compact, generated at kernel built time and available for use in eBPF programs. Till we get there we can go on doing some manual coupling of the most relevant syscalls with some hand built beautifiers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r6ba5izrml82nwfmwcp7jpkm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:59 -03:00
Arnaldo Carvalho de Melo	d3d1c4bdf5	perf trace: Setup the augmented syscalls bpf-output event fields The payload that is put in place by the eBPF script attached to syscalls:sys_enter_openat (and other syscalls with pointers, in the future) can be consumed by the existing sys_enter beautifiers if evsel->priv is setup with a struct syscall_tp with struct tp_fields for the 'syscall_id' and 'args' fields expected by the beautifiers, this patch does just that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-xfjyog8oveg2fjys9r1yy1es@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:58 -03:00
Arnaldo Carvalho de Melo	78e890ea86	perf bpf: Make bpf__setup_output_event() return the bpf-output event We're calling it to setup that event, and we'll need it later to decide if the bpf-output event we're handling is the one setup for a specific purpose, return it using ERR_PTR, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zhachv7il2n1lopt9aonwhu7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:58 -03:00
Arnaldo Carvalho de Melo	e0b6d2ef32	perf trace: Handle "bpf-output" events associated with "__augmented_syscalls__" BPF map Add an example BPF script that writes syscalls:sys_enter_openat raw tracepoint payloads augmented with the first 64 bytes of the "filename" syscall pointer arg. Then catch it and print it just like with things written to the "__bpf_stdout__" map associated with a PERF_COUNT_SW_BPF_OUTPUT software event, by just letting the default tracepoint handler in 'perf trace', trace__event_handler(), to use bpf_output__fprintf(trace, sample), just like it does with all other PERF_COUNT_SW_BPF_OUTPUT events, i.e. just do a dump on the payload, so that we can check if what is being printed has at least the first 64 bytes of the "filename" arg: The augmented_syscalls.c eBPF script: # cat tools/perf/examples/bpf/augmented_syscalls.c // SPDX-License-Identifier: GPL-2.0 #include <stdio.h> struct bpf_map SEC("maps") __augmented_syscalls__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; struct syscall_enter_openat_args { unsigned long long common_tp_fields; long syscall_nr; long dfd; char filename_ptr; long flags; long mode; }; struct augmented_enter_openat_args { struct syscall_enter_openat_args args; char filename[64]; }; int syscall_enter(openat)(struct syscall_enter_openat_args args) { struct augmented_enter_openat_args augmented_args; probe_read(&augmented_args.args, sizeof(augmented_args.args), args); probe_read_str(&augmented_args.filename, sizeof(augmented_args.filename), args->filename_ptr); perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, sizeof(augmented_args)); return 1; } license(GPL); # So it will just prepare a raw_syscalls:sys_enter payload for the "openat" syscall. This will eventually be done for all syscalls with pointer args, globally or just when the user asks, using some spec, which args of which syscalls it wants "expanded" this way, we'll probably start with just all the syscalls that have char * pointers with familiar names, the ones we already handle with the probe:vfs_getname kprobe if it is in place hooking the kernel getname_flags() function used to copy from user the paths. Running it we get: # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null 0.000 ( ): __augmented_syscalls__:X?.C......................`\..................../etc/ld.so.cache..#......,....ao.k...............k......1."......... 0.006 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC 0.008 ( 0.005 ms): cat/31292 openat(dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC ) = 3 0.036 ( ): __augmented_syscalls__:X?.C.......................\..................../lib64/libc.so.6......... .\....#........?.......=.C..../."......... 0.037 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC 0.039 ( 0.007 ms): cat/31292 openat(dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC ) = 3 0.323 ( ): __augmented_syscalls__:X?.C.....................P....................../etc/passwd......>.C....@................>.C.....,....ao.>.C........ 0.325 ( ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xe8be50d6 0.327 ( 0.004 ms): cat/31292 openat(dfd: CWD, filename: 0xe8be50d6 ) = 3 # We need to go on optimizing this to avoid seding trash or zeroes in the pointer content payload, using the return from bpf_probe_read_str(), but to keep things simple at this stage and make incremental progress, lets leave it at that for now. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-g360n1zbj6bkbk6q0qo11c28@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:57 -03:00
Arnaldo Carvalho de Melo	8fa25f303a	perf bpf: Add wrappers to BPF_FUNC_probe_read(_str) functions Will be used shortly in the augmented syscalls work together with a PERF_COUNT_SW_BPF_OUTPUT software event to insert syscalls + pointer contents in the perf ring buffer, to be consumed by 'perf trace' beautifiers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ajlkpz4cd688ulx1u30htkj3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:57 -03:00
Arnaldo Carvalho de Melo	aa31be3a48	perf bpf: Add bpf__setup_output_event() strerror() counterpart That is just bpf__strerror_setup_stdout() renamed to the more general "setup_output_event" method, keep the existing stdout() as a wrapper. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-nwnveo428qn0b48axj50vkc7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:56 -03:00
Arnaldo Carvalho de Melo	92bbe8d834	perf bpf: Generalize bpf__setup_stdout() We will use it to set up other bpf-output events, for instance to generate augmented syscall entry tracepoints with pointer contents. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4r7kw0nsyi4vyz6xm1tzx6a3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:56 -03:00
Arnaldo Carvalho de Melo	5941d856a9	perf bpf: Make bpf__for_each_stdout_map() generic By passing a 'name' arg, that will eventually be used to setup more "bpf-output" events, e.g. to create a event where to create raw_syscalls like events that in addition to the syscall arguments will also copy the pointer contents being passed from/to userspace. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-talrnxps9p3qozk3aeh91fgv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:55 -03:00
Arnaldo Carvalho de Melo	53a5d7b800	perf bpf: Add bpf/stdio.h wrapper to bpf_perf_event_output function That, together with the map __bpf_output__ that is already handled by 'perf trace' to print that event's contents as strings provides a debugging facility, to show it in use, print a simple string everytime the syscalls:sys_enter_openat() syscall tracepoint is hit: # cat tools/perf/examples/bpf/hello.c #include <stdio.h> int syscall_enter(openat)(void *args) { puts("Hello, world\n"); return 0; } license(GPL); # # perf trace -e openat,tools/perf/examples/bpf/hello.c cat /etc/passwd > /dev/null 0.016 ( ): __bpf_stdout__:Hello, world 0.018 ( 0.010 ms): cat/9079 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 0.057 ( ): __bpf_stdout__:Hello, world 0.059 ( 0.011 ms): cat/9079 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3 0.417 ( ): __bpf_stdout__:Hello, world 0.419 ( 0.009 ms): cat/9079 openat(dfd: CWD, filename: /etc/passwd) = 3 # This is part of an ongoing experimentation on making eBPF scripts as consumed by perf to be as concise as possible and using familiar concepts such as stdio.h functions, that end up just wrapping the existing BPF functions, trying to hide as much boilerplate as possible while using just conventions and C preprocessor tricks. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4tiaqlx5crf0fwpe7a6j84x7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:55 -03:00
Arnaldo Carvalho de Melo	7402e543a7	perf bpf: Add struct bpf_map struct A helper structure used by eBPF C program to describe map attributes to elf_bpf loader, to be used initially by the special __bpf_stdout__ map used to print strings into the perf ring buffer in BPF scripts, e.g.: Using the upcoming stdio.h and puts() macros to use the __bpf_stdout__ map to add strings to the ring buffer: # cat tools/perf/examples/bpf/hello.c #include <stdio.h> int syscall_enter(openat)(void args) { puts("Hello, world\n"); return 0; } license(GPL); # # cat ~/.perfconfig [llvm] dump-obj = true # perf trace -e openat,tools/perf/examples/bpf/hello.c/call-graph=dwarf/ cat /etc/passwd > /dev/null LLVM: dumping tools/perf/examples/bpf/hello.o 0.016 ( ): __bpf_stdout__:Hello, world 0.018 ( 0.010 ms): cat/9079 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 0.057 ( ): __bpf_stdout__:Hello, world 0.059 ( 0.011 ms): cat/9079 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3 0.417 ( ): __bpf_stdout__:Hello, world 0.419 ( 0.009 ms): cat/9079 openat(dfd: CWD, filename: /etc/passwd ) = 3 # # file tools/perf/examples/bpf/hello.o tools/perf/examples/bpf/hello.o: ELF 64-bit LSB relocatable, unknown arch 0xf7* version 1 (SYSV), not stripped # readelf -SW tools/perf/examples/bpf/hello.o There are 10 section headers, starting at offset 0x208: Section Headers: [Nr] Name Type Address Off Size ES Flg Lk Inf Al [ 0] NULL 0000000000000000 000000 000000 00 0 0 0 [ 1] .strtab STRTAB 0000000000000000 000188 00007f 00 0 0 1 [ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4 [ 3] syscalls:sys_enter_openat PROGBITS 0000000000000000 000040 000088 00 AX 0 0 8 [ 4] .relsyscalls:sys_enter_openat REL 0000000000000000 000178 000010 10 9 3 8 [ 5] maps PROGBITS 0000000000000000 0000c8 00001c 00 WA 0 0 4 [ 6] .rodata.str1.1 PROGBITS 0000000000000000 0000e4 00000e 01 AMS 0 0 1 [ 7] license PROGBITS 0000000000000000 0000f2 000004 00 WA 0 0 1 [ 8] version PROGBITS 0000000000000000 0000f8 000004 00 WA 0 0 4 [ 9] .symtab SYMTAB 0000000000000000 000100 000078 18 1 1 8 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings), I (info), L (link order), O (extra OS processing required), G (group), T (TLS), C (compressed), x (unknown), o (OS specific), E (exclude), p (processor specific) # readelf -s tools/perf/examples/bpf/hello.o Symbol table '.symtab' contains 5 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 5 __bpf_stdout__ 2: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 7 _license 3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 8 _version 4: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 syscall_enter_openat # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-81fg60om2ifnatsybzwmiga3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:54 -03:00
Jiri Olsa	e6902d1b73	perf report: Add --percent-type option Set annotation percent type from following choices: global-period, local-period, global-hits, local-hits With following report option setup the percent type will be passed to annotation browser: $ perf report --percent-type period-local The local/global keywords set if the percentage is computed in the scope of the function (local) or the whole data (global). The period/hits keywords set the base the percentage is computed on - the samples period or the number of samples (hits). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:54 -03:00
Jiri Olsa	88c2119077	perf annotate: Add --percent-type option Add --percent-type option to set annotation percent type from following choices: global-period, local-period, global-hits, local-hits Examples: $ perf annotate --percent-type period-local --stdio \| head -1 Percent \| Source code ... es, percent: local period) $ perf annotate --percent-type hits-local --stdio \| head -1 Percent \| Source code ... es, percent: local hits) $ perf annotate --percent-type hits-global --stdio \| head -1 Percent \| Source code ... es, percent: global hits) $ perf annotate --percent-type period-global --stdio \| head -1 Percent \| Source code ... es, percent: global period) The local/global keywords set if the percentage is computed in the scope of the function (local) or the whole data (global). The period/hits keywords set the base the percentage is computed on - the samples period or the number of samples (hits). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:53 -03:00
Jiri Olsa	4c04868fbe	perf annotate: Display percent type in stdio output In following patches we will allow to switch percent type even for stdio annotation outputs. Adding the percent type value into the annotation outputs title. $ perf annotate --stdio Percent \| Sou ... instructions:u } (2805 samples, percent: local period) --------------------------- ... ------------------------------------------------------ ... $ perf annotate --stdio2 Samples: 2K of events 'anon ... count (approx.): 156525487, [percent: local period] safe_write.c() /usr/bin/yes Percent ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-19-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:53 -03:00
Jiri Olsa	addba8b66f	perf annotate: Make local period the default percent type Currently we display the percentages in annotation output based on number of samples hits. Switching it to period based percentage by default, because it corresponds more to the time spent on the line. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-18-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:52 -03:00
Jiri Olsa	3e0d795319	perf annotate: Add support to toggle percent type Add new key bindings to toggle percent type/base in annotation UI browser: 'p' to switch between local and global percent type 'b' to switch between hits and perdio percent base Add the following help messages to the UI browser '?' window: ... p Toggle percent type [local/global] b Toggle percent base [period/hits] ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-17-jolsa@kernel.org [ Moved percent_type to be the last arg to sym_title(), its an arg to what is being formmated (buf, size) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:52 -03:00
Jiri Olsa	d4265b1a1b	perf annotate: Pass browser percent_type in annotate_browser__calc_percent() Pass browser percent_type in annotate_browser__calc_percent(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-16-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:51 -03:00
Jiri Olsa	4c650ddc2e	perf annotate: Pass 'struct annotation_options' to map_symbol__annotation_dump() Pass 'struct annotation_options' to map_symbol__annotation_dump(), to carry on and pass the percent_type value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-15-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:51 -03:00
Jiri Olsa	c849c12cf3	perf annotate: Pass struct annotation_options to symbol__calc_lines() Pass struct annotation_options to symbol__calc_lines(), to carry on and pass the percent_type value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-14-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:50 -03:00
Jiri Olsa	796ca33d5c	perf annotate: Add percent_type to struct annotation_options It will be used to carry user selection of percent type for annotation output. Passing the percent_type to the annotation_line__print function as the first step and making it default to current percentage type (PERCENT_HITS_LOCAL) value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:50 -03:00
Jiri Olsa	e58684df91	perf annotate: Add PERCENT_PERIOD_GLOBAL percent value Adding and computing global period percent value for annotation line. Storing it in struct annotation_data percent array under new PERCENT_PERIOD_GLOBAL index. At the moment it's not displayed, it's coming in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:49 -03:00
Jiri Olsa	ab371169fb	perf annotate: Add PERCENT_PERIOD_LOCAL percent value Adding and computing local period percent value for annotation line. Storing it in struct annotation_data percent array under new PERCENT_PERIOD_LOCAL index. At the moment it's not displayed, it's coming in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:49 -03:00
Jiri Olsa	75a8c1ff28	perf annotate: Add PERCENT_HITS_GLOBAL percent value Adding and computing global hits percent value for annotation line. Storing it in struct annotation_data percent array under new PERCENT_HITS_GLOBAL index. At the moment it's not displayed, it's coming in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:48 -03:00
Jiri Olsa	6d9f0c2d5e	perf annotate: Switch struct annotation_data::percent to array So we can hold multiple percent values for annotation line. The first member of this array is current local hits percent value (PERCENT_HITS_LOCAL index), so no functional change is expected. Adding annotation_data__percent function to return requested percent value from struct annotation_data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:48 -03:00
Jiri Olsa	2bcf73069b	perf annotate: Loop group events directly in annotation__calc_percent() We need to bring in 'struct hists' object and for that we need 'struct perf_evsel' object in the scope. Switching the group data loop with the evsel group loop. It does the same thing, but it brings evsel object, that we can use later get the 'struct hists' object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:47 -03:00
Jiri Olsa	48a1e4f238	perf annotate: Rename hist to sym_hist in annotation__calc_percent We will need to bring in 'struct hists' variable in this scope, so it's better we do this rename first. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:47 -03:00
Jiri Olsa	0440af74dc	perf annotate: Rename local sample variables to data Based on previous rename, changing also the local variable names to fit properly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:47 -03:00
Jiri Olsa	c2f938ba5a	perf annotate: Rename struct annotation_line::samples* to data* The name 'samples*' is little confusing because we have nested 'struct sym_hist_entry' under annotation_line struct, which holds 'nr_samples' as well. Also the holding struct name is 'annotation_data' so the 'data' name fits better. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:46 -03:00
Jiri Olsa	0683d13c1a	perf annotate: Get rid of annotation__scnprintf_samples_period() We have more current function tto get the title for annotation, which is hists__scnprintf_title. They both have same output as far as the annotation's header line goes. They differ in counting of the nr_samples, hists__scnprintf_title provides more accurate number based on the setup of the symbol_conf.filter_relative variable. Plus it also displays any uid/thread/dso/socket filters/zooms if there are set any, which annotation__scnprintf_samples_period does not. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:46 -03:00
Jiri Olsa	5ecf7d30eb	perf annotate: Make annotation_line__max_percent static There's no outside user of it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:45 -03:00
Jiri Olsa	7a3e71e0d8	perf annotate: Make symbol__annotate_fprintf2() local There's no outside user of it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/r/20180804130521.11408-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:45 -03:00
Arnaldo Carvalho de Melo	dda9ac966d	perf bpf: Add 'syscall_enter' probe helper for syscall enter tracepoints Allowing one to hook into the syscalls:sys_enter_NAME tracepoints, an example is provided that hooks into the 'openat' syscall. Using it with the probe:vfs_getname probe into getname_flags to get the filename args as it is copied from userspace: # perf probe -l probe:vfs_getname (on getname_flags:73@acme/git/linux/fs/namei.c with pathname) # perf trace -e probe:*getname,tools/perf/examples/bpf/sys_enter_openat.c cat /etc/passwd > /dev/null 0.000 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/ld.so.preload" 0.022 syscalls:sys_enter_openat:dfd: CWD, filename: 0xafbe8da8, flags: CLOEXEC 0.027 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/ld.so.cache" 0.054 syscalls:sys_enter_openat:dfd: CWD, filename: 0xafdf0ce0, flags: CLOEXEC 0.057 probe:vfs_getname:(ffffffffbd2a8983) pathname="/lib64/libc.so.6" 0.316 probe:vfs_getname:(ffffffffbd2a8983) pathname="/usr/lib/locale/locale-archive" 0.375 syscalls:sys_enter_openat:dfd: CWD, filename: 0xe2b2b0b4 0.379 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/passwd" # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2po9jcqv1qgj0koxlg8kkg30@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:44 -03:00
Yury Norov	3c8b818640	perf tools: Drop unneeded bitmap_zero() calls bitmap_zero() is called after bitmap_alloc() in perf code. But bitmap_alloc() internally uses calloc() which guarantees that allocated area is zeroed. So following bitmap_zero is unneeded. Drop it. This happened because of confusing name for bitmap allocator. It should has name bitmap_zalloc instead of bitmap_alloc. This series: https://lkml.org/lkml/2018/6/18/841 introduces a new API for bitmap allocations in kernel, and functions there are named correctly. Following patch propogates the API to tools, and fixes naming issue. Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Link: http://lkml.kernel.org/r/20180623073502.16321-1-ynorov@caviumnetworks.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:44 -03:00
Sean V Kelley	704089e77a	perf vendor events arm64: Enable JSON events for eMAG This patch adds the Ampere Computing eMAG file. This platform follows the ARMv8 recommended IMPLEMENTATION DEFINED events, where applicable. Signed-off-by: Sean V Kelley <seanvk.dev@oregontracks.org> Reviewed-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org LPU-Reference: 20180803041811.17065-1-seanvk.dev@oregontracks.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:55:43 -03:00
Thomas Richter	33d9e1832e	perf report: Add GUI report support for s390 auxiliary trace Add support for s390 auxiliary trace support. Use 'perf record -e rbd000 -- ls' to create the perf.data file. Use 'perf report' to display the auxiliary trace data. Output before: [root@s35lp76 perf]# ./perf report --stdio 0x128 [0x10]: failed to process type: 70 Error: failed to process sample [root@s35lp76 perf]# Output after: [root@s35lp76 perf]# ./perf report --stdio 18.21% 18.21% ls [kernel.kallsyms] [k] ftrace_likely_update 9.52% 9.52% ls [kernel.kallsyms] [k] lock_acquire 9.38% 9.38% ls [kernel.kallsyms] [k] lock_release 3.45% 3.45% ls [kernel.kallsyms] [k] lock_acquired 2.88% 2.88% ls [kernel.kallsyms] [k] link_path_walk 2.63% 2.63% ls [kernel.kallsyms] [k] __d_lookup 2.38% 2.38% ls [kernel.kallsyms] [k] __d_lookup_rcu 2.04% 2.04% ls [kernel.kallsyms] [k] ___might_sleep 1.83% 1.83% ls [kernel.kallsyms] [k] debug_lockdep_rcu_enabled 1.44% 1.44% ls [kernel.kallsyms] [k] dput .... Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180802074622.13641-4-tmricht@linux.ibm.com [ Use PRI[xd]64 to fix the build on debian:experimental-x-mips (gcc 8.1.0) and others ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:49:17 -03:00
Thomas Richter	2b1444f2e2	perf report: Add raw report support for s390 auxiliary trace Add support for s390 auxiliary trace support. Use 'perf record -e rbd000' to create the perf.data file. The event also has the symbolic name SF_CYCLES_BASIC_DIAG, using 'perf record -e SF_CYCLES_BASIC_DIAG' is equivalent. Use 'perf report -D' to display the auxiliary trace data. Output before: 0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000 offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4 Nothing else Output after: 0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000 offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4 . . ... s390 AUX data: size 262144 bytes [00000000] Basic Def:0001 Inst:0000 TW AS:3 ASN:0xffff IA:0x0000000000c2f1bc CL:1 HPP:0x8000000000000000 GPP:000000000000000000 [0x000020] Diag Def:8005 [0x0000bf] Basic Def:0001 Inst:0000 TW AS:3 ASN:0xffff IA:0x0000000000c2f1bc CL:1 HPP:0x8000000000000000 GPP:000000000000000000 [0x0000df] Diag Def:8005 [0x00017e] Basic Def:0001 Inst:0000 TW AS:3 ASN:0xffff IA:0x0000000000c2f1bc CL:1 HPP:0x8000000000000000 GPP:000000000000000000 .... [0x000fc0] Trailer F T bsdes:32 dsdes:159 Overflow:0 Time:0xd4ab59a8450fa108 C:1 TOD:0xd4ab4ec98ceb3832 1:0x8000000000000000 2:0xd4ab4ec98ceb3832 This output is shown for every sampled data block. The output contains the - basic-sampling data entry - diagnostic-sampling data entry - trailer entry The basic sampling entry and diagnostic sampling entry sizes can be extracted using the trailer entries in the SDB. On older hardware these values (bsdes and dsdes in the trailer entry) are reserved and zero. Older hardware use hard coded values based on the s390 machine type. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20180802074622.13641-3-tmricht@linux.ibm.com Link: http://lkml.kernel.org/r/eda2632e-7919-5ffd-5f68-821e77d216fa@linux.ibm.com [ Merged a fix for a 'tipe puned' problem reported by Michael Ellerman see last Link tag. ] [ Removed __packed from two structs, they're already naturally packed and having that. ] [ attribute breaks the build in gcc 8.1.1 mips, 4.4.7 x86_64, 7.1.1 ARCompact ISA, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-08 15:26:48 -03:00
Thomas Richter	b96e6615cd	perf auxtrace: Support for perf report -D for s390 Add initial support for s390 auxiliary traces using the CPU-Measurement Sampling Facility. Support and ignore PERF_REPORT_AUXTRACE_INFO records in the perf data file. Later patches will show the contents of the auxiliary traces. Setup the auxtrace queues and data structures for s390. A raw dump of the perf.data file now does not show an error when an auxtrace event is encountered. Output before: [root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace 0x128 [0x10]: failed to process type: 70 Error: failed to process sample 0x128 [0x10]: event: 70 . . ... raw event: size 16 bytes . 0000: 00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00 ...F............ 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0 [root@s35lp76 perf]# Output after: # ./perf report -D -i perf.data.auxtrace \|fgrep PERF_RECORD_AUXTRACE 0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5 0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000 offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4 .... Additional notes about the underlying hardware and software implementation, provided by Hendrik Brueckner (see Link: below). ============================================================================= The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain performance information on the mainframe. Basically, it was introduced with System z10 years ago for the z/Architecture, that means, 64-bit. For Linux, there are two facilities of interest, counter facility and sampling facility. The counter facility provides hardware counters for instructions, cycles, crypto-activities, and many more. The sampling facility is a hardware sampler that when started will write samples at a particular interval into a sampling buffer. At some point, for example, if a sample block is full, it generates an interrupt to collect samples (while the sampler continues to run). Few years ago, I started to provide the a perf PMU to use the counter and sampling facilities. Recently, the device driver was updated to also "export" the sampling buffer into the AUX area. Thomas now completed the related perf work to interpret and process these AUX data. If people are more interested in the sampling facility, they can have a look into: - The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05 http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a and to learn how-to use it for Linux on Z, have look at chapter 54, "Using the CPU-measurement facilities" in the: - Device Drivers, Features, and Commands, SC33-8411-34 http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf ============================================================================= Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-03 10:34:18 -03:00
Arnaldo Carvalho de Melo	f3acd8869b	perf trace: Use perf_evsel__sc_tp_{uint,ptr} for "id"/"args" handling syscalls:* events Now it looks just about the same as for the trace__sys_{enter,exit}. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-y59may7zx1eccnp4m3qm4u0b@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 15:39:00 -03:00
Arnaldo Carvalho de Melo	d32855fa35	perf trace: Setup struct syscall_tp for syscalls:sys_{enter,exit}_NAME events Mapping "__syscall_nr" to "id" and setting up "args" from the offset of "__syscall_nr" + sizeof(u64), as the payload for syscalls:* is the same as for raw_syscalls:*, just the fields have different names. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ogeenrpviwcpwl3oy1l55f3m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 15:38:57 -03:00
Arnaldo Carvalho de Melo	aa823f58f7	perf trace: Allow setting up a syscall_tp struct without a format_field To avoid having to ask libtraceevent to find a field by name when handling each tracepoint event, we setup a struct syscall_tp with a tp_field struct having an extractor function + the offset for the "id", "args" and "ret" raw_syscalls:sys_{enter,exit} tracepoints. Now that we want to do the same with syscalls:sys_{entry,exit}_NAME individual syscall tracepoints, where we have "id" as "__syscall_nr" and "args" as the actual series of per syscall parameters, we need more flexibility from the routines that set up these pre-looked up syscall tracepoint arg fields. The next cset will use it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v59q5e0jrlzkpl9a1c7t81ni@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 15:07:33 -03:00
Arnaldo Carvalho de Melo	63f11c80e5	perf trace: Rename some syscall_tp methods to raw_syscall Because raw_syscalls have the field for the syscall number as 'id' while the syscalls:sys_{enter,exit}_NAME have it as __syscall_nr... Since we want to support both for being able to enable just a syscalls:sys_{enter,exit}_name instead of asking for raw_syscalls:sys_{enter,exit} plus filters, make the method names for each kind of tracepoint more explicit. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4rixbfzco6tsry0w9ghx3ktb@git.kernel.org Signef-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 15:07:28 -03:00
Arnaldo Carvalho de Melo	a98392bb1e	perf trace: Use beautifiers on syscalls:sys_enter_ handlers We were using the beautifiers only when processing the raw_syscalls:sys_enter events, but we can as well use them for the syscalls:sys_enter_NAME events, as the layout is the same. Some more tweaking is needed as we're processing them straight away, i.e. there is no buffering in the sys_enter_NAME event to wait for things like vfs_getname to provide pointer contents and then flushing at sys_exit_NAME, so we need to state in the syscall_arg that this is unbuffered, just print the pointer values, beautifying just non-pointer syscall args. This just shows an alternative way of processing tracepoints, that we will end up using when creating "tracepoint" payloads that already copy pointer contents (or chunks of it, i.e. not the whole filename, but just the end of it, not all the bf for a read/write, but just the start, etc), directly in the kernel using eBPF. E.g.: # perf trace -e syscalls:entersleep,sleep sleep 1 0.303 ( ): syscalls:sys_enter_nanosleep:rqtp: 0x7ffc93d5ecc0 0.305 (1000.229 ms): sleep/8746 nanosleep(rqtp: 0x7ffc93d5ecc0) = 0 # perf trace -e syscalls:_sleep,sleep sleep 1 0.288 ( ): syscalls:sys_enter_nanosleep:rqtp: 0x7ffecde87e40 0.289 ( ): sleep/8748 nanosleep(rqtp: 0x7ffecde87e40) ... 1000.479 ( ): syscalls:sys_exit_nanosleep:0x0 0.289 (1000.208 ms): sleep/8748 ... [continued]: nanosleep()) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-jehyd2zwhw00z3p7v7mg9632@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 15:07:19 -03:00
Arnaldo Carvalho de Melo	6a648b534d	perf trace: Associate vfs_getname()'ed pathname with fd returned from 'openat' When the vfs_getname() wannabe tracepoint is in place: # perf probe -l probe:vfs_getname (on getname_flags:73@acme/git/linux/fs/namei.c with pathname) # 'perf trace' will use it to get the pathname when it is copied from userspace to the kernel, right after syscalls:sys_enter_open, copied in the 'probe:vfs_getname', stash it somewhere and then, at syscalls:sys_exit_open time, if the 'open' return is not -1, i.e. a successfull open syscall, associate that pathname to this return, i.e. the fd. We were not doing this for the 'openat' syscall, which would cause 'perf trace' to fallback to using /proc to get the fd, change it so that we use what we got from probe:vfs_getname, reducing the 'openat' beautification process cost, ditching the syscalls performed to read procfs state and avoiding some possible races in the process. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-xnp44ao3bkb6ejeczxfnjwsh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-02 10:30:08 -03:00
Arnaldo Carvalho de Melo	b912885ab7	perf trace: Do not require --no-syscalls to suppress strace like output So far the --syscalls option was the default, requiring explicit --no-syscalls when wanting to process just some other event, invert that and assume it only when no other event was specified, allowing its explicit enablement when wanting to see all syscalls together with some other event: E.g: The existing default is maintained for a single workload: # perf trace sleep 1 <SNIP> 0.264 ( 0.003 ms): sleep/12762 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7f62cbf04000 0.271 ( 0.001 ms): sleep/12762 close(fd: 3) = 0 0.295 (1000.130 ms): sleep/12762 nanosleep(rqtp: 0x7ffd15194fd0) = 0 1000.469 ( 0.006 ms): sleep/12762 close(fd: 1) = 0 1000.480 ( 0.004 ms): sleep/12762 close(fd: 2) = 0 1000.502 ( ): sleep/12762 exit_group() # For a pid: # pidof ssh 7826 3961 3226 2628 2493 # perf trace -p 3961 ? ( ): ... [continued]: select()) = 1 0.023 ( 0.005 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce870 ) = 0 0.036 ( 0.009 ms): read(fd: 5</dev/pts/7>, buf: 0x7ffcc8fca7b0, count: 16384 ) = 3 0.060 ( 0.004 ms): getpid( ) = 3961 (ssh) 0.079 ( 0.004 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce8e0 ) = 0 0.088 ( 0.003 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce7c0 ) = 0 <SNIP> For system wide, threads, cgroups, user, etc when no event is specified, the existing behaviour is maintained, i.e. --syscalls is selected. When some event is specified, then --no-syscalls doesn't need to be specified: # perf trace -e tcp:tcp_probe ssh localhost 0.000 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=53 snd_nxt=0xb67ce8f7 snd_una=0xb67ce8f7 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43690 0.010 tcp:tcp_probe:src=[::1]:39074 dest=[::1]:22 mark=0 length=32 snd_nxt=0xa8f9ef38 snd_una=0xa8f9ef23 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43690 srtt=31 rcv_wnd=43776 4.525 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=1240 snd_nxt=0xb67ce90c snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43776 7.242 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=80 snd_nxt=0xb67ced44 snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=174720 The authenticity of host 'localhost (::1)' can't be established. ECDSA key fingerprint is SHA256:TKZS58923458203490asekfjaklskljmkjfgPMBfHzY. ECDSA key fingerprint is MD5:d8:29:54:40:71:fa:b8:44:89:52:64:8a:35:42:d0:e8. Are you sure you want to continue connecting (yes/no)? ^C # To get the previous behaviour just use --syscalls and get all syscalls formatted strace like + the specified extra events: # trace -e sched:switch --syscalls sleep 1 <SNIP> 0.160 ( 0.003 ms): sleep/12877 mprotect(start: 0x7fdfe2361000, len: 4096, prot: READ) = 0 0.164 ( 0.009 ms): sleep/12877 munmap(addr: 0x7fdfe2345000, len: 113155) = 0 0.211 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce68e000 0.212 ( 0.002 ms): sleep/12877 brk(brk: 0x55d3ce6af000) = 0x55d3ce6af000 0.215 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce6af000 0.219 ( 0.004 ms): sleep/12877 open(filename: 0xe1f07c00, flags: CLOEXEC) = 3 0.225 ( 0.001 ms): sleep/12877 fstat(fd: 3, statbuf: 0x7fdfe2138aa0) = 0 0.227 ( 0.003 ms): sleep/12877 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7fdfdb1b8000 0.234 ( 0.001 ms): sleep/12877 close(fd: 3) = 0 0.257 ( ): sleep/12877 nanosleep(rqtp: 0x7fffb36b6020) ... 0.260 ( ): sched:sched_switch:prev_comm=sleep prev_pid=12877 prev_prio=120 prev_state=D ==> next_comm=swapper/3 next_pid=0 next_prio=120 0.257 (1000.134 ms): sleep/12877 ... [continued]: nanosleep()) = 0 1000.428 ( 0.006 ms): sleep/12877 close(fd: 1) = 0 1000.440 ( 0.004 ms): sleep/12877 close(fd: 2) = 0 1000.461 ( ): sleep/12877 exit_group() # When specifiying just some syscalls, the behaviour doesn't change, i.e.: # trace -e nanosleep -e sched:switch sleep 1 0.000 ( ): sleep/14974 nanosleep(rqtp: 0x7ffc344ba9c0 ) ... 0.007 ( ): sched:sched_switch:prev_comm=sleep prev_pid=14974 prev_prio=120 prev_state=D ==> next_comm=swapper/2 next_pid=0 next_prio=120 0.000 (1000.139 ms): sleep/14974 ... [continued]: nanosleep()) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-om2fulll97ytnxv40ler8jkf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-01 16:20:28 -03:00
Arnaldo Carvalho de Melo	822c2621da	perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h The next example scripts need the definition for the BPF functions, i.e. things like BPF_FUNC_probe_read, and in time will require lots of other definitions found in uapi/linux/bpf.h, so include it from the bpf.h file included from the eBPF scripts build with clang via '-e bpf_script.c' like in this example: $ tail -8 tools/perf/examples/bpf/5sec.c #include <bpf.h> int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec) { return sec == 5; } license(GPL); $ That 'bpf.h' include in the 5sec.c eBPF example will come from a set of header files crafted for building eBPF objects, that in a end-user system will come from: /usr/lib/perf/include/bpf/bpf.h And will include <uapi/linux/bpf.h> either from the place where the kernel was built, or from a kernel-devel rpm package like: -working-directory /lib/modules/4.17.9-100.fc27.x86_64/build That is set up by tools/perf/util/llvm-utils.c, and can be overriden by setting the 'kbuild-dir' variable in the "llvm" ~/.perfconfig file, like: # cat ~/.perfconfig [llvm] kbuild-dir = /home/foo/git/build/linux This usually doesn't need any change, just documenting here my findings while working with this code. In the future we may want to instead just use what is in /usr/include/linux/bpf.h, that comes from the UAPI provided from the kernel sources, for now, to avoid getting the kernel's non-UAPI "linux/bpf.h" file, that will cause clang to fail and is not what we want anyway (no BPF function definitions, etc), do it explicitely by asking for "uapi/linux/bpf.h". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zd8zeyhr2sappevojdem9xxt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-01 12:34:06 -03:00
Christophe Leroy	21b8732eb4	perf tools: Allow overriding MAX_NR_CPUS at compile time After update of kernel, the perf tool doesn't run anymore on my 32MB RAM powerpc board, but still runs on a 128MB RAM board: ~# strace perf execve("/usr/sbin/perf", ["perf"], [/* 12 vars /]) = -1 ENOMEM (Cannot allocate memory) --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} --- +++ killed by SIGSEGV +++ Segmentation fault objdump -x shows that .bss section has a huge size of 24Mbytes: 27 .bss 016baca8 101cebb8 101cebb8 001cd988 2*3 With especially the following objects having quite big size: 10205f80 l O .bss 00140000 runtime_cycles_stats 10345f80 l O .bss 00140000 runtime_stalled_cycles_front_stats 10485f80 l O .bss 00140000 runtime_stalled_cycles_back_stats 105c5f80 l O .bss 00140000 runtime_branches_stats 10705f80 l O .bss 00140000 runtime_cacherefs_stats 10845f80 l O .bss 00140000 runtime_l1_dcache_stats 10985f80 l O .bss 00140000 runtime_l1_icache_stats 10ac5f80 l O .bss 00140000 runtime_ll_cache_stats 10c05f80 l O .bss 00140000 runtime_itlb_cache_stats 10d45f80 l O .bss 00140000 runtime_dtlb_cache_stats 10e85f80 l O .bss 00140000 runtime_cycles_in_tx_stats 10fc5f80 l O .bss 00140000 runtime_transaction_stats 11105f80 l O .bss 00140000 runtime_elision_stats 11245f80 l O .bss 00140000 runtime_topdown_total_slots 11385f80 l O .bss 00140000 runtime_topdown_slots_retired 114c5f80 l O .bss 00140000 runtime_topdown_slots_issued 11605f80 l O .bss 00140000 runtime_topdown_fetch_bubbles 11745f80 l O .bss 00140000 runtime_topdown_recovery_bubbles This is due to commit `4d255766d2` ("perf: Bump max number of cpus to 1024"), because many tables are sized with MAX_NR_CPUS This patch gives the opportunity to redefine MAX_NR_CPUS via $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170922112043.8349468C57@po15668-vm-win7.idsi0.si.c-s.fr Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-08-01 12:33:24 -03:00
Arnaldo Carvalho de Melo	739e2edc84	perf bpf: Show better message when failing to load an object Before: libbpf: license of tools/perf/examples/bpf/etcsnoop.c is GPL libbpf: section(6) version, size 4, link 0, flags 3, type=1 libbpf: kernel version of tools/perf/examples/bpf/etcsnoop.c is 41200 libbpf: section(7) .symtab, size 120, link 1, flags 0, type=2 bpf: config program 'syscalls:sys_enter_openat' libbpf: load bpf program failed: Operation not permitted libbpf: failed to load program 'syscalls:sys_enter_openat' libbpf: failed to load object 'tools/perf/examples/bpf/etcsnoop.c' bpf: load objects failed After: (just the last line changes) bpf: load objects failed: err=-4009: (Incorrect kernel version) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-wi44iid0yjfht3lcvplc75fm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 11:58:57 -03:00
Michael Petlan	95f04328e4	perf list: Unify metric group description format with PMU event description PMU event descriptions use 7 spaces + '[' or 8 spaces as indentation. Metric groups used a tab + '['. This patch unifies it to the way PMU event descriptions are indented. BEFORE: $ perf list [...] Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] [...] AFTER: $ perf list [...] Metric Groups: DSB: DSB_Coverage [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)] [...] Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> LPU-Reference: 771439042.22924766.1532986504631.JavaMail.zimbra@redhat.com Link: https://lkml.kernel.org/n/tip-mlo850517m6u1rbjndvd1bwr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 11:35:44 -03:00
Ganapatrao Kulkarni	b9b77222d4	perf vendor events arm64: Update ThunderX2 implementation defined pmu core events Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ganapatrao Kulkarni <gklkml16@gmail.com> Cc: Jan Glauber <jan.glauber@cavium.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Vadim Lomovtsev <vadim.lomovtsev@cavium.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180731100251.23575-1-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 11:28:44 -03:00
Leo Yan	14a85b1eca	perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet CS_ETM_TRACE_ON packet itself can give the info that there have a discontinuity in the trace, this patch is to add branch sample for CS_ETM_TRACE_ON packet if it is inserted in the middle of CS_ETM_RANGE packets; as result we can have hint for the trace discontinuity. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-7-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 11:22:50 -03:00
Leo Yan	d603b4e9f9	perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet If one CS_ETM_TRACE_ON packet is inserted, we miss to generate branch sample for the previous CS_ETM_RANGE packet. This patch is to generate branch sample when receiving a CS_ETM_TRACE_ON packet, so this can save complete info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON packet. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-6-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 11:22:11 -03:00
Leo Yan	6035b6804b	perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and 'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in the decoder layer as dummy value, but the dummy value is pointless for branch sample when we use 'perf script' command to check program flow. This patch is a preparation to support CS_ETM_TRACE_ON packet for branch sample, it converts the dummy address value to zero for more readable; this is accomplished by cs_etm__last_executed_instr() and cs_etm__first_executed_instr(). The later one is a new function introduced by this patch. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-5-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:58:29 -03:00
Leo Yan	3eb3e07bcf	perf cs-etm: Fix start tracing packet handling Usually the start tracing packet is a CS_ETM_TRACE_ON packet, this packet is passed to cs_etm__flush(); cs_etm__flush() will check the condition 'prev_packet->sample_type == CS_ETM_RANGE' but 'prev_packet' is allocated by zalloc() so 'prev_packet->sample_type' is zero in initialization and this condition is false. So cs_etm__flush() will directly bail out without handling the start tracing packet. This patch is to introduce a new sample type CS_ETM_EMPTY, which is used to indicate the packet is an empty packet. cs_etm__flush() will swap packets when it finds the previous packet is empty, so this can record the start tracing packet into 'etmq->prev_packet'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1531295145-596-4-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:57:56 -03:00
Thomas Richter	83868bf71d	perf build: Fix installation directory for eBPF The perf tool build and install is controlled via a Makefile. The 'install' rule creates directories and copies files. Among them are header files installed in /usr/lib/include/perf/bpf/. However all listed examples are installing its header files in /usr/lib/<tool-name>/...[/include]/header.h and not in /usr/lib/include/<tool-name>/.../header.h. Background information: Building the Fedora 28 glibc RPM on s390x and s390 fails on s390 (gcc -m31) as gcc is not able to find header-files like stdbool.h. In the glibc.spec file, you can see that glibc is configured with "--with-headers". In this case, first -nostdinc is added to the CFLAGS and then further include paths are added via -isystem. One of those paths should contain header files like stdbool.h. In order to get this path, gcc is invoked with: - on Fedora 28 (with 4.18 kernel): $ gcc -print-file-name=include /usr/lib/gcc/s390x-redhat-linux/8/include $ gcc -m31 -print-file-name=include /usr/lib/gcc/s390x-redhat-linux/8/../../../../lib/include => If perf is installed, this is: /usr/lib/include On my machine this directory is only containing the directory "perf". If perf is not installed gcc returns: /usr/lib/gcc/s390x-redhat-linux/8/include - on Ubuntu 18.04 (with 4.15 kernel): $ gcc -print-file-name=include /usr/lib/gcc/s390x-linux-gnu/7/include $ gcc -m31 -print-file-name=include /usr/lib/gcc/s390x-linux-gnu/7/include => gcc returns the correct path even if perf is installed. In each case, the introduction of the subdirectory /usr/lib/include leads to the regression that one can not build the glibc RPM for s390 anymore as gcc can not find headers like stdbool.h. To remedy this install bpf.h to /usr/lib/perf/include/bpf/bpf.h Output before using the command 'perf test -Fv 40': echo '...[bpf-program-source]...' \| /usr/bin/clang ... \ -I/root/lib/include/perf/bpf ... ^^^^^^^^^^^^ ... [root@p23lp27 perf]# perf test -F 40 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok [root@p23lp27 perf]# Output after using command 'perf test -Fv 40': echo '...[bpf-program-source]...' \| /usr/bin/clang ... \ -I/root/lib/perf/include/bpf ... ^^^^^^^^^^^^ ... [root@p23lp27 perf]# perf test -F 40 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok [root@p23lp27 perf]# Committer testing: While the above 'perf test -F 40' (or 'perf test bpf') will allow us to see that the correct path is now added via -I, to actually test this we better try to use a bpf script that includes files in the changed directory. We have the files that now reside in /root/lib/perf/examples/bpf/ to do just that: # tail -8 /root/lib/perf/examples/bpf/5sec.c #include <bpf.h> int probe(hrtimer_nanosleep, rqtp->tv_sec)(void ctx, int err, long sec) { return sec == 5; } license(GPL); # perf trace -e sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 4 0.333 (4000.086 ms): sleep/9248 nanosleep(rqtp: 0x7ffc155f3300) = 0 # perf trace -e sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5 0.287 ( ): sleep/9659 nanosleep(rqtp: 0x7ffeafe38200) ... 0.290 ( ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5 0.287 (5000.059 ms): sleep/9659 ... [continued]: nanosleep()) = 0 # perf trace -e sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 6 0.247 (5999.951 ms): sleep/10068 nanosleep(rqtp: 0x7fff2086d900) = 0 # perf trace -e *sleep -e /root/lib/perf/examples/bpf/5sec.c sleep 5.987 0.293 ( ): sleep/10489 nanosleep(rqtp: 0x7ffdd4fc10e0) ... 0.296 ( ): perf_bpf_probe:hrtimer_nanosleep:(ffffffff9911efe0) tv_sec=5 0.293 (5986.912 ms): sleep/10489 ... [continued]: nanosleep()) = 0 # Suggested-by: Stefan Liebler <stli@linux.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Fixes: `1b16fffa38` ("perf llvm-utils: Add bpf include path to clang command line") Link: http://lkml.kernel.org/r/20180731073254.91090-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:54:50 -03:00
Jiri Olsa	7397833257	perf c2c report: Fix crash for empty browser 'perf c2c' scans read/write accesses and tries to find false sharing cases, so when the events it wants were not asked for or ended up not taking place, we get no histograms. So do not try to display entry details if there's not any. Currently this ends up in crash: $ perf c2c report # then press 'd' perf: Segmentation fault $ Committer testing: Before: Record a perf.data file without events of interest to 'perf c2c report', then call it and press 'd': # perf record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.001 MB perf.data (6 samples) ] # perf c2c report perf: Segmentation fault -------- backtrace -------- perf[0x5b1d2a] /lib64/libc.so.6(+0x346df)[0x7fcb566e36df] perf[0x46fcae] perf[0x4a9f1e] perf[0x4aa220] perf(main+0x301)[0x42c561] /lib64/libc.so.6(__libc_start_main+0xe9)[0x7fcb566cff29] perf(_start+0x29)[0x42c999] # After the patch the segfault doesn't take place, a follow up patch to tell the user why nothing changes when 'd' is pressed would be good. Reported-by: rodia@autistici.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `f1c5fd4d0b` ("perf c2c report: Add TUI cacheline browser") Link: http://lkml.kernel.org/r/20180724062008.26126-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:53:20 -03:00
Sandipan Das	aa90f9f955	perf tests: Fix indexing when invoking subtests Recently, the subtest numbering was changed to start from 1. While it is fine for displaying results, this should not be the case when the subtests are actually invoked. Typically, the subtests are stored in zero-indexed arrays and invoked based on the index passed to the main test function. Since the index now starts from 1, the second subtest in the array (index 1) gets invoked instead of the first (index 0). This applies to all of the following subtests but for the last one, the subtest always fails because it does not meet the boundary condition of the subtest index being lesser than the number of subtests. This can be observed on powerpc64 and x86_64 systems running Fedora 28 as shown below. Before: # perf test "builtin clang support" 55: builtin clang support : 55.1: builtin clang compile C source to IR : Ok 55.2: builtin clang compile C source to ELF object : FAILED! # perf test "LLVM search and compile" 38: LLVM search and compile : 38.1: Basic BPF llvm compile : Ok 38.2: kbuild searching : Ok 38.3: Compile source for BPF prologue generation : Ok 38.4: Compile source for BPF relocation : FAILED! # perf test "BPF filter" 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : FAILED! After: # perf test "builtin clang support" 55: builtin clang support : 55.1: builtin clang compile C source to IR : Ok 55.2: builtin clang compile C source to ELF object : Ok # perf test "LLVM search and compile" 38: LLVM search and compile : 38.1: Basic BPF llvm compile : Ok 38.2: kbuild searching : Ok 38.3: Compile source for BPF prologue generation : Ok 38.4: Compile source for BPF relocation : Ok # perf test "BPF filter" 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Fixes: `9ef0112442` ("perf test: Fix subtest number when showing results") Link: http://lkml.kernel.org/r/20180726171733.33208-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:51 -03:00
Arnaldo Carvalho de Melo	162d3edbe5	perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args For instance: $ trace -e socket* ssh sandy 0.000 ( 0.031 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 0.052 ( 0.015 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 1.568 ( 0.020 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 1.603 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 1.699 ( 0.014 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 1.724 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM\|CLOEXEC\|NONBLOCK ) = 3 1.804 ( 0.020 ms): ssh/19919 socket(family: INET, type: STREAM, protocol: TCP ) = 3 17.549 ( 0.098 ms): ssh/19919 socket(family: LOCAL, type: STREAM ) = 4 acme@sandy's password: Just like with other syscall args, the common bits are supressed so that the output is more compact, i.e. we use "TCP" instead of "IPPROTO_TCP", but we can make this show the original constant names if we like it by using some command line knob or ~/.perfconfig "[trace]" section variable. Also needed is to make perf's event parser accept things like: $ perf trace -e socket*/protocol=TCP/ By using both the tracefs event 'format' files and these tables built from the kernel sources. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-l39jz1vnyda0b6jsufuc8bz7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:49 -03:00
Arnaldo Carvalho de Melo	03aeb6c818	perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg It'll be wired to 'perf trace' in the next cset. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2i9vkvm1ik8yu4hgjmxhsyjv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:47 -03:00
Arnaldo Carvalho de Melo	bc972ada4f	perf trace beauty: Do not print NULL strarray entries We may have string tables where not all slots have values, in those cases its better to print the numeric value, for instance: In the table below we would show "protocol: (null)" for socket_ipproto[3] Where it would be better to show "protocol: 3". $ tools/perf/trace/beauty/socket_ipproto.sh static const char *socket_ipproto[] = { [0] = "IP", [103] = "PIM", [108] = "COMP", [12] = "PUP", [132] = "SCTP", [136] = "UDPLITE", [137] = "MPLS", [17] = "UDP", [1] = "ICMP", [22] = "IDP", [255] = "RAW", [29] = "TP", [2] = "IGMP", [33] = "DCCP", [41] = "IPV6", [46] = "RSVP", [47] = "GRE", [4] = "IPIP", [50] = "ESP", [51] = "AH", [6] = "TCP", [8] = "EGP", [92] = "MTP", [94] = "BEETPH", [98] = "ENCAP", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-7djfak94eb3b9ltr79cpn3ti@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:46 -03:00
Arnaldo Carvalho de Melo	9849eec3a4	perf beauty: Add a generator for IPPROTO_ socket's protocol constants It'll use tools/include copy of linux/in.h to generate a table to be used by tools, initially by the 'socket' and 'socketpair' beautifiers in 'perf trace', but that could also be used to translate from a string constant to the integer value to be used in a eBPF or tracefs tracepoint filter. When used without any args it produces: $ tools/perf/trace/beauty/socket_ipproto.sh static const char *socket_ipproto[] = { [0] = "IP", [103] = "PIM", [108] = "COMP", [12] = "PUP", [132] = "SCTP", [136] = "UDPLITE", [137] = "MPLS", [17] = "UDP", [1] = "ICMP", [22] = "IDP", [255] = "RAW", [29] = "TP", [2] = "IGMP", [33] = "DCCP", [41] = "IPV6", [46] = "RSVP", [47] = "GRE", [4] = "IPIP", [50] = "ESP", [51] = "AH", [6] = "TCP", [8] = "EGP", [92] = "MTP", [94] = "BEETPH", [98] = "ENCAP", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v9rafqh3qn6b9kp9vfvj9f8s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:41 -03:00
Arnaldo Carvalho de Melo	a4b2061242	tools include uapi: Grab a copy of linux/in.h We'll use it to create tables for the 'protocol' argument to the socket syscall when the 'family' arg is one of AF_INET or AF_INET6. Add it to check_headers.sh so that when a new protocol gets added we get a notification during the build process. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2amnveu1ns4emjn70xuavpje@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:37 -03:00
Sandipan Das	a6f39cecf7	perf tests: Fix complex event name parsing The 'umask' event parameter is unsupported on some architectures like powerpc64. This can be observed on a powerpc64le system running Fedora 27 as shown below. # perf test "Parse event definition strings" -v 6: Parse event definition strings : --- start --- test child forked, pid 45915 ... running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp'Invalid event/parameter 'umask' Invalid event/parameter 'umask' failed to parse event 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp', err 1, str 'unknown term' event syntax error: '..,event=0x2,umask=0x3/ukp' \___ unknown term valid terms: event,mark,pmc,cache_sel,pmcxsel,unit,thresh_stop,thresh_start,combine,thresh_sel,thresh_cmp,sample_mode,config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,no-overwrite,overwrite,driver-config mem_access -> cpu/event=0x10401e0/ running test 0 'config=10,config1,config2=3,umask=1' test child finished with 1 ---- end ---- Parse event definition strings: FAILED! Committer testing: After applying the patch these test passes and in verbose mode we get: # perf test -v "event definition" 6: Parse event definition strings: --- start --- test child forked, pid 11061 running test 0 'syscalls:sys_enter_openat'Using CPUID GenuineIntel-6-9E <SNIP> running test 53 'cycles/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'/Duk' running test 0 'cpu/config=10,config1,config2=3,period=1000/u' running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u' running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/' running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp' <SNIP> test child finished with 0 ---- end ---- Parse event definition strings: Ok # Suggested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Fixes: `06dc5bf21f` ("perf tests: Check that complex event name is parsed correctly") Link: http://lkml.kernel.org/r/20180726105502.31670-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 10:52:23 -03:00
Kan Liang	95035c5e16	perf evlist: Fix error out while applying initial delay and LBR 'perf record' will error out if both --delay and LBR are applied. For example: # perf record -D 1000 -a -e cycles -j any -- sleep 2 Error: dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' # A dummy event is added implicitly for initial delay, which has the same configurations as real sampling events. The dummy event is a software event. If LBR is configured, perf must error out. The dummy event will only be used to track PERF_RECORD_MMAP while perf waits for the initial delay to enable the real events. The BRANCH_STACK bit can be safely cleared for the dummy event. After applying the patch: # perf record -D 1000 -a -e cycles -j any -- sleep 2 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.054 MB perf.data (828 samples) ] # Reported-by: Sunil K Pandey <sunil.k.pandey@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1531145722-16404-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 09:56:46 -03:00
Arnaldo Carvalho de Melo	61b229ce2c	perf trace beauty: Default header_dir to cwd to work without parms Useful when checking the effects of header synchs for the files it uses as a input to generate string tables, in retrospect this is how it should've been done from day 1, not requiring the header_dir to be set on the Makefile, will change everything later, so that the only parm, common to all generators will be $(srctree) and $(beauty_outdir). So, to see what it generates, just call it without any parameters: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh static const char vhost_virtio_ioctl_cmds[] = { [0x00] = "SET_FEATURES", [0x01] = "SET_OWNER", [0x02] = "RESET_OWNER", [0x03] = "SET_MEM_TABLE", [0x04] = "SET_LOG_BASE", [0x07] = "SET_LOG_FD", [0x10] = "SET_VRING_NUM", [0x11] = "SET_VRING_ADDR", [0x12] = "SET_VRING_BASE", [0x13] = "SET_VRING_ENDIAN", [0x14] = "GET_VRING_ENDIAN", [0x20] = "SET_VRING_KICK", [0x21] = "SET_VRING_CALL", [0x22] = "SET_VRING_ERR", [0x23] = "SET_VRING_BUSYLOOP_TIMEOUT", [0x24] = "GET_VRING_BUSYLOOP_TIMEOUT", [0x30] = "NET_SET_BACKEND", [0x40] = "SCSI_SET_ENDPOINT", [0x41] = "SCSI_CLEAR_ENDPOINT", [0x42] = "SCSI_GET_ABI_VERSION", [0x43] = "SCSI_SET_EVENTS_MISSED", [0x44] = "SCSI_GET_EVENTS_MISSED", [0x60] = "VSOCK_SET_GUEST_CID", [0x61] = "VSOCK_SET_RUNNING", }; static const char vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", [0x12] = "GET_VRING_BASE", }; $ Or: $ tools/perf/trace/beauty/sndrv_pcm_ioctl.sh static const char *sndrv_pcm_ioctl_cmds[] = { [0x00] = "PVERSION", [0x01] = "INFO", [0x02] = "TSTAMP", [0x03] = "TTSTAMP", [0x04] = "USER_PVERSION", [0x10] = "HW_REFINE", [0x11] = "HW_PARAMS", [0x12] = "HW_FREE", [0x13] = "SW_PARAMS", [0x20] = "STATUS", [0x21] = "DELAY", [0x22] = "HWSYNC", [0x23] = "SYNC_PTR", [0x24] = "STATUS_EXT", [0x32] = "CHANNEL_INFO", [0x40] = "PREPARE", [0x41] = "RESET", [0x42] = "START", [0x43] = "DROP", [0x44] = "DRAIN", [0x45] = "PAUSE", [0x46] = "REWIND", [0x47] = "RESUME", [0x48] = "XRUN", [0x49] = "FORWARD", [0x50] = "WRITEI_FRAMES", [0x51] = "READI_FRAMES", [0x52] = "WRITEN_FRAMES", [0x53] = "READN_FRAMES", [0x60] = "LINK", [0x61] = "UNLINK", }; $ Etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-90am4vm8hh1osms894dp2otr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 09:56:46 -03:00
Arnaldo Carvalho de Melo	c2586cfbb9	Merge remote-tracking branch 'tip/perf/urgent' into perf/core To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-31 09:55:45 -03:00
Arnaldo Carvalho de Melo	44fe619b14	perf tools: Fix the build on the alpine:edge distro The UAPI file byteorder/little_endian.h uses the __always_inline define without including the header where it is defined, linux/stddef.h, this ends up working in all the other distros because that file gets included seemingly by luck from one of the files included from little_endian.h. But not on Alpine:edge, that fails for all files where perf_event.h is included but linux/stddef.h isn't include before that. Adding the missing linux/stddef.h file where it breaks on Alpine:edge to fix that, in all other distros, that is just a very small header anyway. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-9r1pifftxvuxms8l7ir73p5l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-30 13:15:03 -03:00
Arnaldo Carvalho de Melo	1f27a050fc	tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy' To cope with the changes in: `12c89130a5` ("x86/asm/memcpy_mcsafe: Add write-protection-fault handling") `60622d6822` ("x86/asm/memcpy_mcsafe: Return bytes remaining") `bd131544aa` ("x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling") `da7bc9c57e` ("x86/asm/memcpy_mcsafe: Remove loop unrolling") This needed introducing a file with a copy of the mcsafe_handle_tail() function, that is used in the new memcpy_64.S file, as well as a dummy mcsafe_test.h header. Testing it: $ nm ~/bin/perf \| grep mcsafe 0000000000484130 T mcsafe_handle_tail 0000000000484300 T __memcpy_mcsafe $ $ perf bench mem memcpy # Running 'mem/memcpy' benchmark: # function 'default' (Default memcpy() provided by glibc) # Copying 1MB bytes ... 44.389205 GB/sec # function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 22.710756 GB/sec # function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 42.459239 GB/sec # function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 42.459239 GB/sec $ This silences this perf tools build warning: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-igdpciheradk3gb3qqal52d0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-30 12:36:51 -03:00
Thomas Richter	9ef0112442	perf test: Fix subtest number when showing results Perf test 40 for example has several subtests numbered 1-4 when displaying the start of the subtest. When the subtest results are displayed the subtests are numbered 0-3. Use this command to generate trace output: [root@s35lp76 perf]# ./perf test -Fv 40 2>/tmp/bpf1 Fix this by adjusting the subtest number when show the subtest result. Output before: [root@s35lp76 perf]# egrep '(^40\.[0-4]\| subtest [0-4]:)' /tmp/bpf1 40.1: Basic BPF filtering : BPF filter subtest 0: Ok 40.2: BPF pinning : BPF filter subtest 1: Ok 40.3: BPF prologue generation : BPF filter subtest 2: Ok 40.4: BPF relocation checker : BPF filter subtest 3: Ok [root@s35lp76 perf]# Output after: root@s35lp76 ~]# egrep '(^40\.[0-4]\| subtest [0-4]:)' /tmp/bpf1 40.1: Basic BPF filtering : BPF filter subtest 1: Ok 40.2: BPF pinning : BPF filter subtest 2: Ok 40.3: BPF prologue generation : BPF filter subtest 3: Ok 40.4: BPF relocation checker : BPF filter subtest 4: Ok [root@s35lp76 ~]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180724134858.100644-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:55:51 -03:00
Jiri Olsa	0aa802a794	perf stat: Get rid of extra clock display function There's no reason to have separate function to display clock events. It's only purpose was to convert the nanosecond value into microseconds. We do that now in generic code, if the unit and scale values are properly set, which this patch do for clock events. The output differs in the unit field being displayed in its columns rather than having it added as a suffix of the event name. Plus the value is rounded into 2 decimal numbers as for any other event. Before: # perf stat -e cpu-clock,task-clock -C 0 sleep 3 Performance counter stats for 'CPU(s) 0': 3001.123137 cpu-clock (msec) # 1.000 CPUs utilized 3001.133250 task-clock (msec) # 1.000 CPUs utilized 3.001159813 seconds time elapsed Now: # perf stat -e cpu-clock,task-clock -C 0 sleep 3 Performance counter stats for 'CPU(s) 0': 3,001.05 msec cpu-clock # 1.000 CPUs utilized 3,001.05 msec task-clock # 1.000 CPUs utilized 3.001077794 seconds time elapsed There's a small difference in csv output, as we now output the unit field, which was empty before. It's in the proper spot, so there's no compatibility issue. Before: # perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3 3001.065177,,cpu-clock,3001064187,100.00,1.000,CPUs utilized 3001.077085,,task-clock,3001077085,100.00,1.000,CPUs utilized # perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3 3000.80,msec,cpu-clock,3000799026,100.00,1.000,CPUs utilized 3000.80,msec,task-clock,3000799550,100.00,1.000,CPUs utilized Add perf_evsel__is_clock to replace nsec_counter. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180720110036.32251-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:54:58 -03:00
Jiri Olsa	2d6cae13f1	perf tools: Use perf_evsel__match instead of open coded equivalent Use perf_evsel__match() helper in perf_evsel__is_bpf_output(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180720110036.32251-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:54:13 -03:00
Jiri Olsa	46b3722cc7	perf tools: Fix struct comm_str removal crash We occasionaly hit following assert failure in 'perf top', when processing the /proc info in multiple threads. perf: ...include/linux/refcount.h:109: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. The gdb backtrace looks like this: [Switching to Thread 0x7ffff11ba700 (LWP 13749)] 0x00007ffff50839fb in raise () from /lib64/libc.so.6 (gdb) #0 0x00007ffff50839fb in raise () from /lib64/libc.so.6 #1 0x00007ffff5085800 in abort () from /lib64/libc.so.6 #2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000535373 in refcount_inc (r=0x7fffdc009be0) at ...include/linux/refcount.h:109 #5 0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0) at util/comm.c:24 #6 0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2", root=0xbed5c0 <comm_str_root>) at util/comm.c:72 #7 0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2", root=0xbed5c0 <comm_str_root>) at util/comm.c:95 #8 0x000000000053582e in comm__new (str=0x7fffd000b260 ":2", timestamp=0, exec=false) at util/comm.c:111 #9 0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57 #10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38, threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457 #11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38, ... The failing assertion is this one: REFCOUNT_WARN(!refcount_inc_not_zero(r), ... The problem is that we keep global comm_str_root list, which is accessed by multiple threads during the 'perf top' startup and following 2 paths can race: thread 1: ... thread__new comm__new comm_str__findnew down_write(&comm_str_lock); __comm_str__findnew comm_str__get thread 2: ... comm__override or comm__free comm_str__put refcount_dec_and_test down_write(&comm_str_lock); rb_erase(&cs->rb_node, &comm_str_root); Because thread 2 first decrements the refcnt and only after then it removes the struct comm_str from the list, the thread 1 can find this object on the list with refcnt equls to 0 and hit the assert. This patch fixes the thread 1 __comm_str__findnew path, by ignoring objects that already dropped the refcnt to 0. For the rest of the objects we take the refcnt before comparing its name and release it afterwards with comm_str__put, which can also release the object completely. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:54:03 -03:00
Jiri Olsa	b57334b945	perf machine: Use last_match threads cache only in single thread mode There's an issue with using threads::last_match in multithread mode which is enabled during the perf top synthesize. It might crash with following assertion: perf: ...include/linux/refcount.h:109: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed. The gdb backtrace looks like this: 0x00007ffff50839fb in raise () from /lib64/libc.so.6 (gdb) #0 0x00007ffff50839fb in raise () from /lib64/libc.so.6 #1 0x00007ffff5085800 in abort () from /lib64/libc.so.6 #2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70) at ...include/linux/refcount.h:109 #5 0x0000000000536771 in thread__get (thread=0x7fffe8009a40) at util/thread.c:115 #6 0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38, threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432 #7 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:489 #8 0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38, pid=2, tid=2) at util/machine.c:499 #9 0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38, ... The failing assertion is this one: REFCOUNT_WARN(!refcount_inc_not_zero(r), ... the problem is that we don't serialize access to threads::last_match. We serialize the access to the threads tree, but we don't care how's threads::last_match being accessed. Both locked/unlocked paths use that data and can set it. In multithreaded mode we can end up with invalid object in thread__get call, like in following paths race: thread 1 ... machine__findnew_thread down_write(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 2 ... machine__find_thread down_read(&threads->lock); __machine__findnew_thread ____machine__findnew_thread th = threads->last_match; if (th->tid == tid) { thread__get thread 3 ... machine__process_fork_event machine__remove_thread __machine__remove_thread threads->last_match = NULL thread__put thread__put Thread 1 and 2 might got stale last_match, before thread 3 clears it. Thread 1 and 2 then race with thread 3's thread__put and they might trigger the refcnt == 0 assertion above. The patch is disabling the last_match cache for multiple thread mode. It was originally meant for single thread scenarios, where it's common to have multiple sequential searches of the same thread. In multithread mode this does not make sense, because top's threads processes different /proc entries and so the 'struct threads' object is queried for various threads. Moreover we'd need to add more locks to make it work. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:52 -03:00
Jiri Olsa	67fda0f32c	perf machine: Add threads__set_last_match function Separating threads::last_match cache set into separate threads__set_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:42 -03:00
Jiri Olsa	f8b2ebb532	perf machine: Add threads__get_last_match function Separating threads::last_match cache read/check into separate threads__get_last_match function. This will be useful in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lukasz Odzioba <lukasz.odzioba@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:31 -03:00
Jiri Olsa	e8fedff1cc	perf tools: Synthesize GROUP_DESC feature in pipe mode Stephan reported, that pipe mode does not carry the group information and thus the piped report won't display the grouped output for following command: # perf record -e '{cycles,instructions,branches}' -a sleep 4 \| perf report It has no idea about the group setup, so it will display events separately: # Overhead Command Shared Object ... # ........ ............... ....................... # 6.71% swapper [kernel.kallsyms] 2.28% offlineimap libpython2.7.so.1.0 0.78% perf [kernel.kallsyms] ... Fix GROUP_DESC feature record to be synthesized in pipe mode, so the report output is grouped if there are groups defined in record: # Overhead Command Shared ... # ........................ ............... ....... # 7.57% 0.16% 0.30% swapper [kernel 1.87% 3.15% 2.46% offlineimap libpyth 1.33% 0.00% 0.00% perf [kernel ... Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180712135202.14774-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:20 -03:00
Sandipan Das	2a9d5050dc	perf script: Show correct offsets for DWARF-based unwinding When perf/data is recorded with the dwarf call-graph option, the callchain shown by 'perf script' still shows the binary offsets of the userspace symbols instead of their virtual addresses. Since the symbol offset calculation is based on using virtual address as the ip, we see incorrect offsets as well. The use of virtual addresses affects the ability to find out the line number in the corresponding source file to which an address maps to as described in commit `6754075915` ("perf unwind: Use addr_location::addr instead of ip for entries"). This has also been addressed by temporarily converting the virtual address to the correponding binary offset so that it can be mapped to the source line number correctly. This is a follow-up for commit `1961018469` ("perf script: Show virtual addresses instead of offsets"). This can be verified on a powerpc64le system running Fedora 27 as shown below: # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton # perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1 Before: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c \| ---gaih_inet getaddrinfo.c:537 (inlined) __GI_getaddrinfo getaddrinfo.c:2304 (inlined) main ping.c:519 generic_start_main libc-start.c:308 (inlined) __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004ca0af28] 10fa53 gaih_inet+0xffff000099160f43 libc-2.26.so[ffff80004c9bfa53] (inlined) 1105b3 __GI_getaddrinfo+0xffff000099160163 libc-2.26.so[ffff80004c9c05b3] (inlined) 2d6f main+0xfffffffd9f1003df (/usr/bin/ping) ping[fffffffecf882d6f] 2369f generic_start_main+0xffff00009916013f libc-2.26.so[ffff80004c8d369f] (inlined) 23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so) libc-2.26.so[ffff80004c8d3897] After: # perf report --stdio --no-children -s sym,srcline -g address # Samples: 1 of event 'probe_libc:inet_pton' # Event count (approx.): 1 # # Overhead Symbol Source:Line # ........ .................... ........... # 100.00% [.] __GI___inet_pton inet_pton.c \| ---gaih_inet.constprop.7 getaddrinfo.c:537 getaddrinfo getaddrinfo.c:2304 main ping.c:519 generic_start_main.isra.0 libc-start.c:308 __libc_start_main libc-start.c:102 ... # perf script -F comm,ip,sym,symoff,srcline,dso ping 7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) inet_pton.c:68 7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so) getaddrinfo.c:537 7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so) getaddrinfo.c:2304 130782d6f main+0x3df (/usr/bin/ping) ping.c:519 7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so) libc-start.c:308 7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so) libc-start.c:102 Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: `6754075915` ("perf unwind: Use addr_location::addr instead of ip for entries") Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:11 -03:00
Kim Phillips	a7f660d657	perf trace arm64: Use generated syscall table This should speed up accessing new system calls introduced with the kernel rather than waiting for libaudit updates to include them. It also enables users to specify wildcards, for example, perf trace -e 'open*', just like was already possible on x86, s390, and powerpc, which means arm64 can now pass the "Check open filename arg using perf trace + vfs_getname" test. Signed-off-by: Kim Phillips <kim.phillips@arm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20180706163454.f714b9ab49ecc8566a0b3565@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:53:01 -03:00
Kim Phillips	2b58824356	perf arm64: Generate system call table from asm/unistd.h This should speed up accessing new system calls introduced with the kernel rather than waiting for libaudit updates to include them. Using the existing other arch scripts resulted in this error: tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: 25: printf: __NR3264_ftruncate: expected numeric value because, unlike other arches, asm-generic's unistd.h does things like: #define __NR_ftruncate __NR3264_ftruncate Turning the scripts printf's %d into a %s resulted in this in the generated syscalls.c file: static const char syscalltbl_arm64[] = { [__NR3264_ftruncate] = "ftruncate", So we use the host C compiler to fold the macros, and print them out from within a temporary C program, in order to get the correct output: static const char syscalltbl_arm64[] = { [46] = "ftruncate", Committer notes: Testing this with a container with an old toolchain breaks because it ends up using the system's /usr/include/asm-generic/unistd.h, included from tools/arch/arm64/include/uapi/asm/unistd.h when what is desired is for it to include tools/include/uapi/asm-generic/unistd.h. Since all that tools/arch/arm64/include/uapi/asm/unistd.h is to set a define and then include asm-generic/unistd.h, do that directly and use tools/include/uapi/asm-generic/unistd.h as the file to get the syscall definitions to expand. Testing it: tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/include/uapi/asm-generic/unistd.h Now works and generates in the syscall string table. Before it ended up as: $ tools/perf/arch/arm64/entry/syscalls/mksyscalltbl /gcc-linaro-5.4.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc gcc tools/arch/arm64/include/uapi/asm/unistd.h static const char *syscalltbl_arm64[] = { <stdin>: In function 'main': <stdin>:257:38: error: '__NR_getrandom' undeclared (first use in this function) <stdin>:257:38: note: each undeclared identifier is reported only once for each function it appears in <stdin>:258:41: error: '__NR_memfd_create' undeclared (first use in this function) <stdin>:259:32: error: '__NR_bpf' undeclared (first use in this function) <stdin>:260:37: error: '__NR_execveat' undeclared (first use in this function) tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: 47: tools/perf/arch/arm64/entry/syscalls/mksyscalltbl: /tmp/create-table-60liya: Permission denied }; $ Signed-off-by: Kim Phillips <kim.phillips@arm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20180706163443.22626f5e9e10e5bab5e5c662@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:52:48 -03:00
Kim Phillips	34b009cfde	tools include: Grab copies of arm64 dependent unistd.h files Will be used for generating the syscall id/string translation table. The arm64 unistd.h file simply #includes the asm-generic/unistd.h, so, since we will want to know whether either change, we grab both: arch/arm64/include/uapi/asm/unistd.h and include/uapi/asm-generic/unistd.h Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20180706163434.1b64ffbcc0284fb79982f53b@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:52:39 -03:00
Sandipan Das	60089e42d3	perf tests: Fix record+probe_libc_inet_pton.sh when event exists If the event 'probe_libc:inet_pton' already exists, this test fails and deletes the existing event before exiting. This will then pass for any subsequent executions. Instead of skipping to deleting the existing event because of failing to add a new event, a duplicate event is now created and the script continues with the usual checks. Only the new duplicate event that is created at the beginning of the test is deleted as a part of the cleanups in the end. All existing events remain as it is. This can be observed on a powerpc64 system running Fedora 27 as shown below. # perf probe -x /usr/lib64/power8/libc-2.26.so -a inet_pton Added new event: probe_libc:inet_pton (on inet_pton in /usr/lib64/power8/libc-2.26.so) Before: # perf test -v "probe libc's inet_pton & backtrace it with ping" 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 21302 test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! # perf probe --list After: # perf test -v "probe libc's inet_pton & backtrace it with ping" 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 21490 ping 21513 [035] 39357.565561: probe_libc:inet_pton_1: (7fffa4c623b0) 7fffa4c623b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so) 7fffa4c190dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so) 7fffa4c19c4c getaddrinfo+0x15c (/usr/lib64/power8/libc-2.26.so) 111d93c20 main+0x3e0 (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # perf probe --list probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so) Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/e11fecff96e6cf4c65cdbd9012463513d7b8356c.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:52:19 -03:00
Sandipan Das	83e3b6d73e	perf tests: Fix record+probe_libc_inet_pton.sh to ensure cleanups If there is a mismatch in the perf script output, this test fails and exits before the event and temporary files created during its execution are cleaned up. This can be observed on a powerpc64 system running Fedora 27 as shown below. # perf test -v "probe libc's inet_pton & backtrace it with ping" 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 18655 ping 18674 [013] 24511.496995: probe_libc:inet_pton: (7fffa6b423b0) 7fffa6b423b0 __GI___inet_pton+0x0 (/usr/lib64/power8/libc-2.26.so) 7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so) FAIL: expected backtrace entry "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]$/usr/lib64/power8/libc-2.26.so$$" got "7fffa6af90dc gaih_inet.constprop.7+0xf4c (/usr/lib64/power8/libc-2.26.so)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! # ls /tmp/expected.* /tmp/perf.data.* /tmp/perf.script.* /tmp/expected.u31 /tmp/perf.data.Pki /tmp/perf.script.Bhs # perf probe --list probe_libc:inet_pton (on __inet_pton@resolv/inet_pton.c in /usr/lib64/power8/libc-2.26.so) Cleanup of the event and the temporary files are now ensured by allowing the cleanup code to be executed even if the lines from the backtrace do not match their expected patterns instead of simply exiting from the point of failure. Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/ce9fb091dd3028fba8749a1a267cfbcb264bbfb1.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:52:09 -03:00
Sandipan Das	3eae52f842	perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 For powerpc64, this test currently fails due to a mismatch in the expected output. This can be observed on a powerpc64le system running Fedora 27 as shown below. # perf test -v "probe libc's inet_pton & backtrace it with ping" Before: 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 23948 ping 23965 [003] 71136.075084: probe_libc:inet_pton: (7fff996aaf28) 7fff996aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fff9965fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) FAIL: expected backtrace entry 2 "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]$/usr/lib64/libc-2.26.so$$" got "7fff9965fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! After: 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 24638 ping 24655 [001] 71208.525396: probe_libc:inet_pton: (7fffa245af28) 7fffa245af28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fffa240fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fffa24105b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 138d52d70 main+0x3e0 (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Maynard Johnson <maynard@us.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Fixes: e07d585e2454 ("perf tests: Switch trace+probe_libc_inet_pton to use record") Link: http://lkml.kernel.org/r/49621ec5f37109f0655e5a8c32287ad68d85a1e5.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:51:37 -03:00
Sandipan Das	9068533e4f	perf powerpc: Fix callchain ip filtering when return address is in a register For powerpc64, perf will filter out the second entry in the callchain, i.e. the LR value, if the return address of the function corresponding to the probed location has already been saved on its caller's stack. The state of the return address is determined using debug information. At any point within a function, if the return address is already saved somewhere, a DWARF expression can tell us about its location. If the return address in still in LR only, no DWARF expression would exist. Typically, the instructions in a function's prologue first copy the LR value to R0 and then pushes R0 on to the stack. If LR has already been copied to R0 but R0 is yet to be pushed to the stack, we can still get a DWARF expression that says that the return address is in R0. This is indicating that getting a DWARF expression for the return address does not guarantee the fact that it has already been saved on the stack. This can be observed on a powerpc64le system running Fedora 27 as shown below. # objdump -d /usr/lib64/libc-2.26.so \| less ... 000000000015af20 <inet_pton>: 15af20: 0b 00 4c 3c addis r2,r12,11 15af24: e0 c1 42 38 addi r2,r2,-15904 15af28: a6 02 08 7c mflr r0 15af2c: f0 ff c1 fb std r30,-16(r1) 15af30: f8 ff e1 fb std r31,-8(r1) 15af34: 78 1b 7f 7c mr r31,r3 15af38: 78 23 83 7c mr r3,r4 15af3c: 78 2b be 7c mr r30,r5 15af40: 10 00 01 f8 std r0,16(r1) 15af44: c1 ff 21 f8 stdu r1,-64(r1) 15af48: 28 00 81 f8 std r4,40(r1) ... # readelf --debug-dump=frames-interp /usr/lib64/libc-2.26.so \| less ... 00027024 0000000000000024 00027028 FDE cie=00000000 pc=000000000015af20..000000000015af88 LOC CFA r30 r31 ra 000000000015af20 r1+0 u u u 000000000015af34 r1+0 c-16 c-8 r0 000000000015af48 r1+64 c-16 c-8 c+16 000000000015af5c r1+0 c-16 c-8 c+16 000000000015af78 r1+0 u u ... # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x18 # perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1 # perf script Before: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e26fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Maynard Johnson <maynard@us.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/66e848a7bdf2d43b39210a705ff6d828a0865661.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:50:44 -03:00
Sandipan Das	c715fcfda5	perf powerpc: Fix callchain ip filtering For powerpc64, redundant entries in the callchain are filtered out by determining the state of the return address and the stack frame using DWARF debug information. For making these filtering decisions we must analyze the debug information for the location corresponding to the program counter value, i.e. the first entry in the callchain, and not the LR value; otherwise, perf may filter out either the second or the third entry in the callchain incorrectly. This can be observed on a powerpc64le system running Fedora 27 as shown below. Case 1 - Attaching a probe at inet_pton+0x8 (binary offset 0x15af28). Return address is still in LR and a new stack frame is not yet allocated. The LR value, i.e. the second entry, should not be filtered out. # objdump -d /usr/lib64/libc-2.26.so \| less ... 000000000010eb10 <gaih_inet.constprop.7>: ... 10fa48: 78 bb e4 7e mr r4,r23 10fa4c: 0a 00 60 38 li r3,10 10fa50: d9 b4 04 48 bl 15af28 <inet_pton+0x8> 10fa54: 00 00 00 60 nop 10fa58: ac f4 ff 4b b 10ef04 <gaih_inet.constprop.7+0x3f4> ... 0000000000110450 <getaddrinfo>: ... 1105a8: 54 00 ff 38 addi r7,r31,84 1105ac: 58 00 df 38 addi r6,r31,88 1105b0: 69 e5 ff 4b bl 10eb18 <gaih_inet.constprop.7+0x8> 1105b4: 78 1b 71 7c mr r17,r3 1105b8: 50 01 7f e8 ld r3,336(r31) ... 000000000015af20 <inet_pton>: 15af20: 0b 00 4c 3c addis r2,r12,11 15af24: e0 c1 42 38 addi r2,r2,-15904 15af28: a6 02 08 7c mflr r0 15af2c: f0 ff c1 fb std r30,-16(r1) 15af30: f8 ff e1 fb std r31,-8(r1) ... # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x8 # perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1 # perf script Before: ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28) 7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 13fb52d70 _init+0xbfc (/usr/bin/ping) 7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28) 7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fffa7d6fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 13fb52d70 _init+0xbfc (/usr/bin/ping) 7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Case 2 - Attaching a probe at _int_malloc+0x180 (binary offset 0x9cf10). Return address in still in LR and a new stack frame has already been allocated but not used. The caller's caller, i.e. the third entry, is invalid and should be filtered out and not the second one. # objdump -d /usr/lib64/libc-2.26.so \| less ... 000000000009cd90 <_int_malloc>: 9cd90: 17 00 4c 3c addis r2,r12,23 9cd94: 70 a3 42 38 addi r2,r2,-23696 9cd98: 26 00 80 7d mfcr r12 9cd9c: f8 ff e1 fb std r31,-8(r1) 9cda0: 17 00 e4 3b addi r31,r4,23 9cda4: d8 ff 61 fb std r27,-40(r1) 9cda8: 78 23 9b 7c mr r27,r4 9cdac: 1f 00 bf 2b cmpldi cr7,r31,31 9cdb0: f0 ff c1 fb std r30,-16(r1) 9cdb4: b0 ff c1 fa std r22,-80(r1) 9cdb8: 78 1b 7e 7c mr r30,r3 9cdbc: 08 00 81 91 stw r12,8(r1) 9cdc0: 11 ff 21 f8 stdu r1,-240(r1) 9cdc4: 4c 01 9d 41 bgt cr7,9cf10 <_int_malloc+0x180> 9cdc8: 20 00 a4 2b cmpldi cr7,r4,32 ... 9cf08: 00 00 00 60 nop 9cf0c: 00 00 42 60 ori r2,r2,0 9cf10: e4 06 ff 7b rldicr r31,r31,0,59 9cf14: 40 f8 a4 7f cmpld cr7,r4,r31 9cf18: 68 05 9d 41 bgt cr7,9d480 <_int_malloc+0x6f0> ... 000000000009e3c0 <tcache_init.part.4>: ... 9e420: 40 02 80 38 li r4,576 9e424: 78 fb e3 7f mr r3,r31 9e428: 71 e9 ff 4b bl 9cd98 <_int_malloc+0x8> 9e42c: 00 00 a3 2f cmpdi cr7,r3,0 9e430: 78 1b 7e 7c mr r30,r3 ... 000000000009f7a0 <__libc_malloc>: ... 9f8f8: 00 00 89 2f cmpwi cr7,r9,0 9f8fc: 1c ff 9e 40 bne cr7,9f818 <__libc_malloc+0x78> 9f900: c9 ea ff 4b bl 9e3c8 <tcache_init.part.4+0x8> 9f904: 00 00 00 60 nop 9f908: e8 90 22 e9 ld r9,-28440(r2) ... # perf probe -x /usr/lib64/libc-2.26.so -a _int_malloc+0x180 # perf record -e probe_libc:_int_malloc -g ./test-malloc # perf script Before: test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10) 7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so) 7fffa6dd0000 [unknown] (/usr/lib64/libc-2.26.so) 7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so) 7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so) 100006b4 main+0x38 (/home/testuser/test-malloc) 7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10) 7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so) 7fffa6e6e42c tcache_init.part.4+0x6c (/usr/lib64/libc-2.26.so) 7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so) 7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so) 100006b4 main+0x38 (/home/sandipan/test-malloc) 7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Maynard Johnson <maynard@us.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Fixes: `a60335ba32` ("perf tools powerpc: Adjust callchain based on DWARF debug info") Link: http://lkml.kernel.org/r/24bb726d91ed173aebc972ec3f41a2ef2249434e.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:50:10 -03:00
Sangwon Hong	6feb3fec51	perf list: Add missing documentation for --desc and --debug options Add missing documentation for --desc and --debug options to the 'perf list' man page. Signed-off-by: Sangwon Hong <qpakzk@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20180717110738.10779-1-qpakzk@gmail.com [ Clarify that --desc is by default active ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:49:57 -03:00
Thomas Richter	8a95c89945	perf kvm: Fix subcommands on s390 With commit `eca0fa28cd` ("perf record: Provide detailed information on s390 CPU") s390 platform provides detailed type/model/capacity information in the CPU identifier string instead of just "IBM/S390". This breaks 'perf kvm' support which uses hard coded string IBM/S390 to compare with the CPU identifier string. Fix this by changing the comparison. Reported-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Tested-by: Stefan Raspl <raspl@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: stable@vger.kernel.org Fixes: `eca0fa28cd` ("perf record: Provide detailed information on s390 CPU") Link: http://lkml.kernel.org/r/20180712070936.67547-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:49:49 -03:00
Thomas Richter	742d92ff21	perf stat: Add transaction flag (-T) support for s390 The 'perf stat' command line flag -T to display transaction counters is currently supported for x86 only. Add support for s390. It is based on the metrics flag -M transaction using the architecture dependent JSON files. This requires a metric named "transaction" in the JSON files for the platform. Introduce a new function metricgroup__has_metric() to check for the existence of a metric_name transaction. As suggested by Andi Kleen, this is the new approach to support transactions counters. Other architectures will follow. Output before: [root@p23lp27 perf]# ./perf stat -T -- sleep 1 Cannot set up transaction events [root@p23lp27 perf]# Output after: [root@s35lp76 perf]# ./perf stat -T -- ~/mytesttx 1 >/tmp/111 Performance counter stats for '/root/mytesttx 1': 1 tx_c_tend # 13.0 transaction 1 tx_nc_tend 11 tx_nc_tabort 0 tx_c_tabort_special 0 tx_c_tabort_no_special 0.001070109 seconds time elapsed [root@s35lp76 perf]# Suggested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180626071701.58190-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:49:37 -03:00
Thomas Richter	83eb383e94	perf json: Add s390 transaction counter definition 'perf stat' displays transactional counters using flag -T on x86. On s390 use a JSON file defined metric named transaction to achieve the same result. Output before: none Output after: [root@s35lp76 perf]# ./perf stat -M transaction -- \ ~/mytesttx 1 >/tmp/111 Performance counter stats for '/root/mytesttx 1': 1 tx_c_tend # 13.0 transaction 1 tx_nc_tend 11 tx_nc_tabort 0 tx_c_tabort_special 0 tx_c_tabort_no_special 0.001061232 seconds time elapsed [root@s35lp76 perf]# Suggested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180621080452.61012-3-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:49:30 -03:00
Thomas Richter	9bacbced0e	perf list: Add s390 support for detailed PMU event description Correct the support of detailed/verbose PMU event description by using the "Unit": keyword in the json files to address event names refering to the /sys/devices/cpum_[cs]f devices. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180621080452.61012-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:49:09 -03:00
Thomas Richter	b8b5ab52bc	Revert "perf list: Add s390 support for detailed/verbose PMU event description" This reverts commit `038586c343`. Fix the support of detailed/verbose PMU event description by using the "Unit": keyword in the json files to address event names refering to the /sys/devices/cpum_[cs]f devices. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180621080452.61012-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:48:58 -03:00
Leo Yan	6cd4ac6a02	perf cs-etm: Bail out immediately for instruction sample failure If the instruction sample failure has happened, it isn't necessary to execute to the end of the function cs_etm__flush(). This commit is to bail out immediately and return the error code. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1529298599-3876-3-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:48:32 -03:00
Leo Yan	6abf0f4510	perf cs-etm: Introduce invalid address macro This patch introduces invalid address macro and uses it to replace dummy value '0xdeadbeefdeadbeefUL'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1529298599-3876-2-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:48:22 -03:00
Arnaldo Carvalho de Melo	e9de7e2f7e	perf hists: Clarify callchain disabling when available We want to allow having mixed events with/without callchains, not using a global flag to show callchains, but allowing supressing callchains when they are present. So invert the logic of the last parameter to hists__fprint() to that effect. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ohqyisr6qge79qa95ojslptx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:37:33 -03:00
Alexey Budankov	06dc5bf21f	perf tests: Check that complex event name is parsed correctly Extend regression testing to cover case of complex event names enabled by the cset `f92da71280` ("perf record: Enable arbitrary event names thru name= modifier"). Testing it: # perf test 1: vmlinux symtab matches kallsyms : Skip 2: Detect openat syscall event : Ok 3: Detect openat syscall event on all cpus : Ok 4: Read samples using the mmap interface : Ok 5: Test data source output : Ok 6: Parse event definition strings : Ok <===! 7: Simple expression parser : Ok ... Committer testing: # perf test "event definition" 6: Parse event definition strings : Ok # perf test -v 6 2> /tmp/before # perf test -v 6 2> /tmp/after # diff -u /tmp/before /tmp/after --- /tmp/before 2018-06-19 10:50:21.485572638 -0300 +++ /tmp/after 2018-06-19 10:50:40.886572896 -0300 @@ -1,6 +1,6 @@ 6: Parse event definition strings : --- start --- -test child forked, pid 24259 +test child forked, pid 24904 running test 0 'syscalls:sys_enter_openat'Using CPUID GenuineIntel-6-3D registering plugin: /root/.traceevent/plugins/plugin_kvm.so registering plugin: /root/.traceevent/plugins/plugin_hrtimer.so @@ -136,9 +136,11 @@ running test 50 '4:0x6530160/name=numpmu/' running test 51 'L1-dcache-misses/name=cachepmu/' running test 52 'intel_pt//u' +running test 53 'cycles/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'/Duk' running test 0 'cpu/config=10,config1,config2=3,period=1000/u' running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u' running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/' +running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2,umask=0x3/ukp' el-capacity -> cpu/event=0x54,umask=0x2/ el-conflict -> cpu/event=0x54,umask=0x1/ el-start -> cpu/event=0xc8,umask=0x1/ # Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ad30b774-219b-7b80-c610-4e9e298cf8a7@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:37:11 -03:00
Arnaldo Carvalho de Melo	1d59d16e9b	Merge remote-tracking branch 'tip/perf/urgent' into perf/core To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2018-07-24 14:34:32 -03:00
Tobias Tefke	788faab70d	perf, tools: Use correct articles in comments Some of the comments in the perf events code use articles incorrectly, using 'a' for words beginning with a vowel sound, where 'an' should be used. Signed-off-by: Tobias Tefke <tobias.tefke@tutanota.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: jolsa@redhat.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/20180709105715.22938-1-tobias.tefke@tutanota.com [ Fix a few more perf related 'a event' typo fixes from all around the kernel and tooling tree. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-16 00:21:03 +02:00
Linus Torvalds	aa0a3247c0	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tool fixes from Ingo Molnar: "Misc tooling fixes: python3 related fixes, gcc8 fix, bashism fixes and some other smaller fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Use python-config --includes rather than --cflags perf script python: Fix dict reference counting perf stat: Fix --interval_clear option perf tools: Fix compilation errors on gcc8 perf test shell: Prevent temporary editor files from being considered test scripts perf llvm-utils: Remove bashism from kernel include fetch script perf test shell: Make perf's inet_pton test more portable perf test shell: Replace '\|&' with '2>&1 \|' to work with more shells perf scripts python: Add Python 3 support to EventClass.py perf scripts python: Add Python 3 support to sched-migration.py perf scripts python: Add Python 3 support to Util.py perf scripts python: Add Python 3 support to SchedGui.py perf scripts python: Add Python 3 support to Core.py perf tools: Generate a Python script compatible with Python 2 and 3	2018-07-13 13:33:09 -07:00
Laura Abbott	6fdbd824fd	tools: build: Fixup host c flags Commit `0c3b7e4261` ("tools build: Add support for host programs format") introduced host_c_flags which referenced CHOSTFLAGS. The actual name of the variable is HOSTCFLAGS. Fix this up. Fixes: `0c3b7e4261` ("tools build: Add support for host programs format") Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-07-13 00:48:17 +09:00

... 2 3 4 5 6 ...

9297 Commits