mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-12-02 08:34:20 +08:00
5b427df27b
13767 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Ian Rogers
|
22de36ff2c |
perf vendor events: Remove bad ivytown uncore events
The event converter scripts at https://github.com/intel/event-converter-for-linux-perf pass Filter values from data on 01.org that are bogus in a perf command line and can cause perf to recurse infinitely in parse events. Remove such events or filters using the updated patch: |
||
Ian Rogers
|
2c98bacfd7 |
perf vendor events: Remove bad broadwellde uncore events
The event converter scripts at https://github.com/intel/event-converter-for-linux-perf pass Filter values from data on 01.org that are bogus in a perf command line and can cause perf to recurse infinitely in parse events. Remove such events or filters using the updated patch: |
||
Ian Rogers
|
b4f0466082 |
perf jevents: Add JEVENTS_ARCH make option
Allow the architecture built into pmu-events.c to be set on the make command line with JEVENTS_ARCH. Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
46acb311c6 |
perf jevents: Simplify generation of C-string
Previous implementation wanted variable order and '(null)' string output to match the C implementation. The '(null)' string output was a quirk/bug and so there is no need to carry it forward. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
e1e19d0545 |
perf jevents: Clean up pytype warnings
Improve type hints to clean up pytype warnings. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Roberto Sassu
|
dd6775f986 |
perf build: Remove FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting
As the building mechanism is now able to retry detection with different combinations of linking flags, setting FEATURE_CHECK_LDFLAGS-disassembler-four-args and FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore, so remove it. Committer notes: Use the same technique to find the set of bfd-related libraries to link as in: 3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220719170555.2576993-3-roberto.sassu@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Claire Jensen
|
0c343af2a2 |
perf test: JSON format checking
Add field checking tests for perf stat JSON output. Sanity checks the expected number of fields are present, that the expected keys are present and they have the correct values. Committer notes: Had to fix this: - $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \ + $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \ Committer testing: [root@quaco ~]# perf test json 90: perf stat JSON output linter : Ok [root@quaco ~]# set -o vi [root@quaco ~]# perf test -v json 90: perf stat JSON output linter : --- start --- test child forked, pid 560794 Checking json output: no args [Success] Checking json output: system wide [Success] Checking json output: system wide Checking json output: system wide no aggregation [Success] Checking json output: interval [Success] Checking json output: event [Success] Checking json output: per core [Success] Checking json output: per thread [Success] Checking json output: per die [Success] Checking json output: per node [Success] Checking json output: per socket [Success] test child finished with 0 ---- end ---- perf stat JSON output linter: Ok [root@quaco ~]# Signed-off-by: Claire Jensen <cjense@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: Claire Jensen <clairej735@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
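The field checks described in the "perf test: JSON format checking" entry above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the linter that ships with perf: the key set is copied from the sample `perf stat -j` records quoted elsewhere in this log, and the helper name `check_record` is hypothetical.

```python
import json

# Keys seen in the sample `perf stat -j` records quoted in this log.
# Assumption: treating exactly this set as "expected" is for illustration only.
EXPECTED_KEYS = {
    "counter-value", "unit", "event", "event-runtime",
    "pcnt-running", "metric-value", "metric-unit",
}


def check_record(line):
    """Return a list of problems found in one JSON output line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    unknown = set(record) - EXPECTED_KEYS
    if unknown:
        problems.append(f"unexpected keys: {sorted(unknown)}")

    # counter-value is emitted as a quoted string; it should still parse as a number.
    if "counter-value" in record:
        try:
            float(record["counter-value"])
        except (TypeError, ValueError):
            problems.append("counter-value is not numeric")

    # pcnt-running is a percentage, so it should lie between 0 and 100.
    pcnt = record.get("pcnt-running")
    if isinstance(pcnt, (int, float)) and not 0.0 <= pcnt <= 100.0:
        problems.append("pcnt-running out of range")
    return problems


good = ('{"counter-value" : "75.000000", "unit" : "", "event" : "page-faults:u", '
        '"event-runtime" : 731750, "pcnt-running" : 100.00, '
        '"metric-value" : 102.494021, "metric-unit" : "K/sec"}')
bad = '{"counter-value" : "n/a", "evnt" : "page-faults:u"}'

print(check_record(good))
print(check_record(bad))
```

Checking against a set of allowed keys, rather than requiring every key, tolerates records that carry only metric fields.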
Claire Jensen
|
df936cadfb |
perf stat: Add JSON output option
CSV output is tricky to format and column layout changes are susceptible to breaking parsers. New JSON-formatted output has variable names to identify fields that are consistent and informative, making the output parseable. CSV output example: 1.20,msec,task-clock:u,1204272,100.00,0.697,CPUs utilized 0,,context-switches:u,1204272,100.00,0.000,/sec 0,,cpu-migrations:u,1204272,100.00,0.000,/sec 70,,page-faults:u,1204272,100.00,58.126,K/sec JSON output example: {"counter-value" : "3805.723968", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 3805731510100.00, "pcnt-running" : 100.00, "metric-value" : 4.007571, "metric-unit" : "CPUs utilized"} {"counter-value" : "6166.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 3805723045100.00, "pcnt-running" : 100.00, "metric-value" : 1.620191, "metric-unit" : "K/sec"} {"counter-value" : "466.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 3805727613100.00, "pcnt-running" : 100.00, "metric-value" : 122.447136, "metric-unit" : "/sec"} {"counter-value" : "208.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 3805726799100.00, "pcnt-running" : 100.00, "metric-value" : 54.654516, "metric-unit" : "/sec"} Also added documentation for JSON option. There is some tidy up of CSV code including a potential memory over run in the os.nfields set up. To facilitate this an AGGR_MAX value is added. Committer notes: Fixed up using PRIu64 to format u64 values, not %lu. 
Committer testing: ⬢[acme@toolbox perf]$ perf stat -j sleep 1 {"counter-value" : "0.731750", "unit" : "msec", "event" : "task-clock:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000731, "metric-unit" : "CPUs utilized"} {"counter-value" : "0.000000", "unit" : "", "event" : "context-switches:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"} {"counter-value" : "0.000000", "unit" : "", "event" : "cpu-migrations:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"} {"counter-value" : "75.000000", "unit" : "", "event" : "page-faults:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 102.494021, "metric-unit" : "K/sec"} {"counter-value" : "578765.000000", "unit" : "", "event" : "cycles:u", "event-runtime" : 379366, "pcnt-running" : 49.00, "metric-value" : 0.790933, "metric-unit" : "GHz"} {"counter-value" : "1298.000000", "unit" : "", "event" : "stalled-cycles-frontend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.224271, "metric-unit" : "frontend cycles idle"} {"counter-value" : "21984.000000", "unit" : "", "event" : "stalled-cycles-backend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 3.798433, "metric-unit" : "backend cycles idle"} {"counter-value" : "468197.000000", "unit" : "", "event" : "instructions:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.808959, "metric-unit" : "insn per cycle"} {"metric-value" : 0.046955, "metric-unit" : "stalled cycles per insn"} {"counter-value" : "103335.000000", "unit" : "", "event" : "branches:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 141.216262, "metric-unit" : "M/sec"} {"counter-value" : "2381.000000", "unit" : "", "event" : "branch-misses:u", "event-runtime" : 388654, "pcnt-running" : 50.00, "metric-value" : 2.304156, "metric-unit" : "of all branches"} ⬢[acme@toolbox perf]$ Signed-off-by: Claire Jensen <cjense@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: Claire Jensen <clairej735@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220805200105.2020995-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
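The motivation given in the "perf stat: Add JSON output option" entry above (positional CSV columns are fragile, named JSON fields are not) can be illustrated with a short Python sketch. The two input lines are copied from the examples in that commit message; everything else, including the variable names, is an assumption made for illustration.

```python
import csv
import io
import json

csv_line = "1.20,msec,task-clock:u,1204272,100.00,0.697,CPUs utilized"
json_line = ('{"counter-value" : "3805.723968", "unit" : "msec", "event" : "cpu-clock", '
             '"event-runtime" : 3805731510100.00, "pcnt-running" : 100.00, '
             '"metric-value" : 4.007571, "metric-unit" : "CPUs utilized"}')

# CSV: the meaning of a field depends on its column position, so a layout
# change silently shifts every index used below.
row = next(csv.reader(io.StringIO(csv_line)))
csv_event, csv_value = row[2], float(row[0])

# JSON: fields are looked up by name, independent of their order.
record = json.loads(json_line)
json_event = record["event"]
json_value = float(record["counter-value"])  # emitted as a quoted string

# Some records (for example "stalled cycles per insn") carry only metric
# fields, so optional keys are best read with .get().
metric_only = json.loads(
    '{"metric-value" : 0.046955, "metric-unit" : "stalled cycles per insn"}'
)

print(csv_event, csv_value)
print(json_event, json_value)
print(metric_only.get("event"), metric_only["metric-unit"])
```

Each output line is a standalone JSON object, so a consumer can parse the stream one line at a time instead of loading it as a single document.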
Adrián Herrera Arcila
|
bb8bc52e75 |
perf stat: Refactor __run_perf_stat() common code
This extracts common code from the branches of the forks if-then-else. enable_counters(), which was at the beginning of both branches of the conditional, is now unconditional; evlist__start_workload() is extracted to a different if, which enables making the common clocking code unconditional. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20220729161244.10522-1-adrian.herrera@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
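The shape of the refactoring described in the "Refactor __run_perf_stat() common code" entry above can be sketched as follows. This is a hypothetical Python model, not the C code from perf: only `enable_counters` and `start_workload` correspond to names in the commit message, while `start_clock`, `wait_for_child` and `dispatch_events` are placeholder step names assumed for illustration. The point is that hoisting the shared steps out of the conditional leaves the call order unchanged.

```python
def run_before(forks):
    """Model of the original layout: both branches repeat the common steps."""
    calls = []
    if forks:
        calls.append("enable_counters")
        calls.append("start_workload")
        calls.append("start_clock")
        calls.append("wait_for_child")
    else:
        calls.append("enable_counters")
        calls.append("start_clock")
        calls.append("dispatch_events")
    return calls


def run_after(forks):
    """Model of the refactored layout: common steps are unconditional."""
    calls = ["enable_counters"]          # hoisted out of both branches
    if forks:
        calls.append("start_workload")   # extracted into its own conditional
    calls.append("start_clock")          # common clocking code, now unconditional
    calls.append("wait_for_child" if forks else "dispatch_events")
    return calls


for forks in (True, False):
    print(forks, run_before(forks) == run_after(forks))
```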
Namhyung Kim
|
6d499a6b3d |
perf lock: Print the number of lost entries for BPF
Like the normal 'perf lock contention' output, print the number of lost entries for BPF if any exist or if the -v option is passed. Currently it uses the BROKEN_CONTENDED stat for the lost count (due to full stack maps). $ sudo perf lock con -a -b --map-nr-entries 128 sleep 5 ... === output for debug=== bad: 43, total: 14903 bad rate: 0.29 % histogram of events caused bad sequence acquire: 0 acquired: 0 contended: 43 release: 0 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220802191004.347740-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
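The "bad rate" figure in the debug output quoted in the entry above is consistent with a plain percentage of bad events over total events. A minimal Python check, assuming that reading (the function name `bad_rate_percent` is hypothetical):

```python
def bad_rate_percent(bad, total):
    """Percentage of events in a bad sequence; 0.0 when nothing was counted."""
    if total == 0:
        return 0.0
    return 100.0 * bad / total


# Values taken from the debug output quoted in the commit message.
rate = bad_rate_percent(43, 14903)
print(f"bad rate: {rate:.2f} %")
```

In that sample all 43 bad events are attributed to "contended", which matches the commit's statement that the BROKEN_CONTENDED stat is used as the lost count.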
Namhyung Kim
|
ceb13bfc01 |
perf lock: Add --map-nr-entries option
The --map-nr-entries option controls the maximum number of entries in the perf lock contention BPF maps. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220802191004.347740-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
447ec4e5fa |
perf lock: Introduce struct lock_contention
The lock_contention struct is to carry related fields together and to minimize the change when we add new config options. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220802191004.347740-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
4ee3c4da8b |
perf scripting python: Do not build fail on deprecation warnings
First noticed with fedora:rawhide: 48 11.10 fedora:rawhide : FAIL gcc version 12.1.1 20220628 (Red Hat 12.1.1-3) (GCC) util/scripting-engines/trace-event-python.c: In function 'python_start_script': util/scripting-engines/trace-event-python.c:1899:9: error: 'PySys_SetArgv' is deprecated [-Werror=deprecated-declarations] 1899 | PySys_SetArgv(argc + 1, command_line); No time now to address this warning, so don't make it an error, in time we should either add yet more ifdefs to continue supporting older systems or just convert to whatever new infra python put in place for argv processing, sigh. Acked-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
91cea6be90 |
genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
When genelf was introduced it tested for HAVE_LIBCRYPTO not
HAVE_LIBCRYPTO_SUPPORT, which is the define the feature test for openssl
defines, fix it.
This also disables the deprecation warning; someone has to fix this
to build with openssl 3.0 before the warning becomes a hard error.
Fixes:
|
||
Ian Rogers
|
9b7c7728f4 |
perf parse-events: Break out tracepoint and printing
Move print_*_events functions out of parse-events.c into a new print-events.c. Move tracepoint code into tracepoint.c or trace-event-info.c (sole user). This reduces the dependencies of parse-events.c and makes it more amenable to being a library in the future. Remove some unnecessary definitions from parse-events.h. Fix a checkpatch.pl warning on using unsigned rather than unsigned int. Fix some line length warnings too. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com [ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
32f457abb8 |
perf parse-events: Don't #define YY_EXTRA_TYPE
Adding a #define to side-effect a local include isn't clean, for example, it inhibits header precompilation. YY_EXTRA_TYPE is defined to be void* by default, so just remove. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220729204217.250166-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Andres Freund
|
83aa012048 |
tools perf: Fix compilation error with new binutils
binutils changed the signature of init_disassemble_info(), which now causes compilation failures for tools/perf/util/annotate.c, e.g. on debian unstable. Relevant binutils commit: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07 Wire up the feature test and switch to init_disassemble_info_compat(), which were introduced in prior commits, fixing the compilation failure. I verified that perf can still disassemble bpf programs by using bpftrace under load, recording a perf trace, and then annotating the bpf "function" with and without the changes. With old binutils there's no change in output before/after this patch. When comparing the output from old binutils (2.35) to new binutils with the patch (upstream snapshot) there are a few output differences, but they are unrelated to this patch. An example hunk is: 1.15 : 55:mov %rbp,%rdx 0.00 : 58:add $0xfffffffffffffff8,%rdx 0.00 : 5c:xor %ecx,%ecx - 1.03 : 5e:callq 0xffffffffe12aca3c + 1.03 : 5e:call 0xffffffffe12aca3c 0.00 : 63:xor %eax,%eax - 2.18 : 65:leaveq - 2.82 : 66:retq + 2.18 : 65:leave + 2.82 : 66:ret Signed-off-by: Andres Freund <andres@anarazel.de> Acked-by: Quentin Monnet <quentin@isovalent.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Ben Hutchings <benh@debian.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de Link: https://lore.kernel.org/r/20220801013834.156015-5-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
00b3262598 |
perf test: Add ARM SPE system wide test
In the past it had a problem not setting the pid/tid on the sample correctly when system-wide mode is used. Although it's fixed now it'd be nice if we have a test case for it. Reviewed-by: German Gomez <german.gomez@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220701230932.1000495-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
5f4e821c6c |
perf tools: Rework prologue generation code
Some functions we use for bpf prologue generation are going to be deprecated. This change reworks current code not to use them. We need to replace following functions/struct: bpf_program__set_prep bpf_program__nth_fd struct bpf_prog_prep_result Currently we use bpf_program__set_prep to hook perf callback before program is loaded and provide new instructions with the prologue. We replace this function/ality by taking instructions for specific program, attaching prologue to them and load such new ebpf programs with prologue using separate bpf_prog_load calls (outside libbpf load machinery). Before we can take and use program instructions, we need libbpf to actually load it. This way we get the final shape of its instructions with all relocations and verifier adjustments). There's one glitch though.. perf kprobe program already assumes generated prologue code with proper values in argument registers, so loading such program directly will fail in the verifier. That's where the fallback pre-load handler fits in and prepends the initialization code to the program. Once such program is loaded we take its instructions, cut off the initialization code and prepend the prologue. I know.. sorry ;-) To have access to the program when loading this patch adds support to register 'fallback' section handler to take care of perf kprobe programs. The fallback means that it handles any section definition besides the ones that libbpf handles. 
The handler serves two purposes: - allows perf programs to have special arguments in section name - allows perf to use pre-load callback where we can attach init code (zeroing all argument registers) to each perf program Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220616202214.70359-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
8b1e1a0347 |
perf bpf: Convert legacy map definition to BTF-defined
libbpf is switching off support for legacy map definitions [1], which will break the perf llvm tests. Move the base source map definition to BTF-defined, which requires the -g compile option to add debug/BTF info. [1] https://lore.kernel.org/bpf/20220627211527.2245459-1-andrii@kernel.org/ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220704152721.352046-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6d518ac7be |
perf symbol: Fail to read phdr workaround
The perf jvmti agent doesn't create program headers; in this case, fall back on section headers, as happened previously. Committer notes: To test this, from a public post by Ian: 1) download a Java workload dacapo-9.12-MR1-bach.jar from https://sourceforge.net/projects/dacapobench/ 2) build perf with, for example, "make -C tools/perf O=/tmp/perf NO_LIBBFD=1"; it should detect Java and create /tmp/perf/libperf-jvmti.so 3) run perf with the jvmti agent: perf record -k 1 java -agentpath:/tmp/perf/libperf-jvmti.so -jar dacapo-9.12-MR1-bach.jar -n 10 fop 4) run perf inject: perf inject -i perf.data -o perf-injected.data -j 5) run perf report: perf report -i perf-injected.data | grep org.apache.fop With this patch reverted I see lots of symbols like: 0.00% java jitted-388040-4656.so [.] org.apache.fop.fo.FObj.bind(org.apache.fop.fo.PropertyList) With the patch ( |
||
Namhyung Kim
|
6fda2405f4 |
perf lock: Implement cpu and task filters for BPF
Add -a/--all-cpus and -C/--cpu options for cpu filtering. Also -p/--pid and --tid options are added for task filtering. The short -t option is taken for --threads already. Tracking the command line workload is possible as well. $ sudo perf lock contention -a -b sleep 1 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220729200756.666106-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
407b36f69e |
perf lock: Use BPF for lock contention analysis
Add -b/--use-bpf option to use BPF to collect lock contention stats. For simplicity it now runs system-wide and requires C-c to stop. Upcoming changes will add the usual filtering. $ sudo perf lock con -b ^C contended total wait max wait avg wait type caller 42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20 23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a 6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30 3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c 1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115 1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148 2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b 1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06 2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f 1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
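The columns of the sample `perf lock con -b` table in the entry above appear to be related by "avg wait = total wait / contended". The Python sketch below checks that reading against several rows copied from the commit message. It is an observation about the printed sample rather than a statement about how perf computes the column internally, and the tolerance only accounts for the two-decimal rounding of the printed values.

```python
# (contended, total wait in us, avg wait in us, caller), copied from the sample output.
rows = [
    (42, 192.67, 4.59, "queue_work_on+0x20"),
    (23, 85.54, 3.72, "worker_thread+0x14a"),
    (6, 13.92, 2.32, "kernfs_iop_permission+0x30"),
    (3, 11.59, 3.86, "kernfs_dop_revalidate+0x3c"),
    (2, 7.08, 3.54, "delayed_work_timer_fn+0x1b"),
    (2, 2.50, 1.25, "kernfs_iop_lookup+0x2f"),
]

TOLERANCE_US = 0.006  # printed values are rounded to two decimals


def avg_matches(contended, total_us, avg_us):
    """True if the printed average agrees with total / contended after rounding."""
    return abs(total_us / contended - avg_us) <= TOLERANCE_US


results = [avg_matches(c, total, avg) for c, total, avg, _caller in rows]
for (c, total, avg, caller), ok in zip(rows, results):
    print(f"{caller:30s} {total / c:6.2f} us vs {avg:5.2f} us -> {'ok' if ok else 'mismatch'}")
```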
Namhyung Kim
|
77d54a2cd6 |
perf lock: Pass machine pointer to is_lock_function()
This is a preparation for later change to expose the function externally so that it can be used without the implicit session data. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Blake Jones <blakejones@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220729200756.666106-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
9bd7021809 |
perf test: Add user space counter reading tests
These tests are based on test_stat_user_read in tools/lib/perf/tests/test-evsel.c. The tests are modified to skip if perf_event_open fails or rdpmc isn't supported. Committer testing: ⬢[acme@toolbox perf]$ perf test "mmap interface" 4: mmap interface tests : 4.1: Read samples using the mmap interface : Skip (permissions) ⬢[acme@toolbox perf]$ [root@five ~]# perf test "mmap interface" 4: mmap interface tests : 4.1: Read samples using the mmap interface : Ok [root@five ~]# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20220719223946.176299-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
481fadfb10 |
perf test: Remove x86 rdpmc test
This test has been superseded by test_stat_user_read in: tools/lib/perf/tests/test-evsel.c The updated test doesn't divide-by-0 when running time of a counter is 0. It also supports ARM64. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20220719223946.176299-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
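The divide-by-zero mentioned in the "Remove x86 rdpmc test" entry above arises when a counter reading is scaled by its running time. A minimal Python sketch of a guarded version follows, assuming the usual multiplexing estimate of count × enabled / running. The function name `scale_count` and the choice to return 0 for a counter that never ran are assumptions made for illustration, not the behaviour of test_stat_user_read.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Scale a multiplexed counter reading to the full enabled time.

    Assumption: a counter that never ran (time_running == 0) is reported
    as 0 instead of dividing by zero.
    """
    if time_running == 0:
        return 0
    return raw_count * time_enabled // time_running


print(scale_count(100, 200, 100))  # counter ran for half of the enabled time
print(scale_count(100, 200, 0))    # counter never ran
```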
Arnaldo Carvalho de Melo
|
18808564aa |
Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up the fixes that went upstream via acme/perf/urgent and to get to v5.19. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
9a0b36266f |
perf stat: Add topdown metrics in the default perf stat on the hybrid machine
Topdown metrics are missing from the default perf stat output on hybrid machines; add Topdown metrics to the default perf stat for hybrid systems.
Currently, the perf metrics Topdown is supported for the p-core PMU in the perf stat default; perf metrics Topdown support for the e-core PMU will be implemented separately later. Refactor the code and add two x86-specific functions. Widen the event name column by 7 chars, so that all metrics after the "#" become aligned again.
The perf metrics topdown feature is supported on the cpu_core of ADL. The dedicated perf metrics counter and the fixed counter 3 are used for the topdown events. Adding the topdown metrics doesn't trigger multiplexing.
Before:
# ./perf stat -a true
Performance counter stats for 'system wide':
53.70 msec cpu-clock # 25.736 CPUs utilized
80 context-switches # 1.490 K/sec
24 cpu-migrations # 446.951 /sec
52 page-faults # 968.394 /sec
2,788,555 cpu_core/cycles/ # 51.931 M/sec
851,129 cpu_atom/cycles/ # 15.851 M/sec
2,974,030 cpu_core/instructions/ # 55.385 M/sec
416,919 cpu_atom/instructions/ # 7.764 M/sec
586,136 cpu_core/branches/ # 10.916 M/sec
79,872 cpu_atom/branches/ # 1.487 M/sec
14,220 cpu_core/branch-misses/ # 264.819 K/sec
7,691 cpu_atom/branch-misses/ # 143.229 K/sec
0.002086438 seconds time elapsed
After:
# ./perf stat -a true
Performance counter stats for 'system wide':
61.39 msec cpu-clock # 24.874 CPUs utilized
76 context-switches # 1.238 K/sec
24 cpu-migrations # 390.968 /sec
52 page-faults # 847.097 /sec
2,753,695 cpu_core/cycles/ # 44.859 M/sec
903,899 cpu_atom/cycles/ # 14.725 M/sec
2,927,529 cpu_core/instructions/ # 47.690 M/sec
428,498 cpu_atom/instructions/ # 6.980 M/sec
581,299 cpu_core/branches/ # 9.470 M/sec
83,409 cpu_atom/branches/ # 1.359 M/sec
13,641 cpu_core/branch-misses/ # 222.216 K/sec
8,008 cpu_atom/branch-misses/ # 130.453 K/sec
14,761,308 cpu_core/slots/ # 240.466 M/sec
3,288,625 cpu_core/topdown-retiring/ # 22.3% retiring
1,323,323 cpu_core/topdown-bad-spec/ # 9.0% bad speculation
5,477,470 cpu_core/topdown-fe-bound/ # 37.1% frontend bound
4,679,199 cpu_core/topdown-be-bound/ # 31.7% backend bound
646,194 cpu_core/topdown-heavy-ops/ # 4.4% heavy operations
# 17.9% light operations
1,244,999 cpu_core/topdown-br-mispredict/ # 8.4% branch mispredict
# 0.5% machine clears
3,891,800 cpu_core/topdown-fetch-lat/ # 26.4% fetch latency
# 10.7% fetch bandwidth
1,879,034 cpu_core/topdown-mem-bound/ # 12.7% memory bound
# 19.0% Core bound
0.002467839 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-6-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
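The percentages in the "After" output can be reproduced from the raw counts as a quick sanity check. The sketch below is not perf's implementation; it assumes each measured share is simply the event count divided by cpu_core/slots/, and that each unmeasured level-2 share (light operations, machine clears, fetch bandwidth, Core bound) is the parent level-1 share minus its measured sibling. The names `share` and `topdown_percentages` are hypothetical.

```python
# Raw cpu_core counts copied from the "After" output of this commit message.
SLOTS = 14_761_308
COUNTS = {
    "retiring": 3_288_625,
    "bad speculation": 1_323_323,
    "frontend bound": 5_477_470,
    "backend bound": 4_679_199,
    "heavy operations": 646_194,
    "branch mispredict": 1_244_999,
    "fetch latency": 3_891_800,
    "memory bound": 1_879_034,
}


def share(count, slots=SLOTS):
    """Return count as a percentage of the issue slots."""
    return 100.0 * count / slots


def topdown_percentages(counts=COUNTS, slots=SLOTS):
    """Compute measured shares, then derive the complements (an assumption)."""
    pct = {name: share(value, slots) for name, value in counts.items()}
    # Assumed relation: parent level-1 share minus the measured level-2 child.
    pct["light operations"] = pct["retiring"] - pct["heavy operations"]
    pct["machine clears"] = pct["bad speculation"] - pct["branch mispredict"]
    pct["fetch bandwidth"] = pct["frontend bound"] - pct["fetch latency"]
    pct["core bound"] = pct["backend bound"] - pct["memory bound"]
    return pct


if __name__ == "__main__":
    for name, value in topdown_percentages().items():
        print(f"{value:5.1f}% {name}")
```

Under these assumptions the rounded values agree with the figures printed by perf stat, for example 3,288,625 / 14,761,308 gives 22.3% retiring. The four level-1 counts sum to slightly more than the slots count, so the level-1 percentages need not add up to exactly 100%.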
||
Kan Liang
|
cdb204ad42 |
perf x86 evlist: Add default hybrid events for perf stat
Provide a new solution to replace the reverted commit
|
||
Kan Liang
|
a9c1ecdabc |
perf evlist: Always use arch_evlist__add_default_attrs()
Currently, perf stat uses evlist__add_default_attrs() to add the generic default attrs, and uses arch_evlist__add_default_attrs() to add the arch-specific default attrs, e.g., Topdown for x86. This works well on non-hybrid platforms. However, on a hybrid platform, the hard-coded generic default attrs don't work. Use arch_evlist__add_default_attrs() to replace evlist__add_default_attrs(). arch_evlist__add_default_attrs() is modified to invoke the same __evlist__add_default_attrs() for the generic default attrs. No functional change. Add default_null_attrs[] to indicate the arch-specific attrs. No functional change for the arch-specific default attrs either. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220721065706.2886112-4-zhengjun.xing@linux.intel.com Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
ff4207f793 |
perf evsel: Add arch_evsel__hw_name()
The commit
|
||
Kan Liang
|
ace3e31e65 |
perf stat: Revert "perf stat: Add default hybrid events"
This reverts commit Fixes: |
||
Thomas Richter
|
fb5962f81e |
perf test: Fix test case 95 ("Check branch stack sampling") on s390 and use same event
On linux-next tree 'perf test 95' ("Check branch stack sampling") was
added recently.
s390 does not support branch sampling at all, and the test case fails
despite checking for branch support beforehand.
The check for support of branching uses the software event named "dummy",
as seen in the line:
perf record -b -o- -e dummy -B true > /dev/null 2>&1 || exit 2
However when the branch recording is actually done, a different event is
used, as seen in the line:
perf record -o $TMPDIR/... --branch-filter any,save_type,u -- ...
The event is omitted and for "perf record" the default event is cycles,
which is not supported by s390 and this fails when executed on s390:
# perf record --branch-filter any,save_type,u -- /tmp/__perf_test.program.iDSmQ/a.out
Error:
cycles: PMU Hardware or event type doesn't support branch stack sampling.
#
Therefore fix this and use the same event cycles for testing support
and actually running the test.
Output before:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
Testing user branch stack sampling
---- end ----
Check branch stack sampling: FAILED!
#
Output after:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
---- end ----
Check branch stack sampling: Skip
#
Fixes:
|
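The mismatch described in this commit message (probing support with one event, then recording with another) can be avoided by deriving both command lines from a single event name. The following is only an illustrative sketch, not the shell change made to the perf test script; `build_commands` and `branch_sampling_supported` are hypothetical helpers, and the perf options are the ones quoted in the message plus an explicit `-e` on the recording command.

```python
import subprocess


def build_commands(event, workload, output):
    """Build the probe and record argv lists from the same event name."""
    # Probe: same shape as the support check quoted in the commit message.
    probe = ["perf", "record", "-b", "-o-", "-e", event, "-B", "true"]
    # Record: the branch-stack run, now naming the event explicitly.
    record = ["perf", "record", "-o", output,
              "--branch-filter", "any,save_type,u",
              "-e", event, "--", *workload]
    return probe, record


def branch_sampling_supported(probe, run=subprocess.run):
    """Return True when the probe command exits with status 0."""
    result = run(probe, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

A caller would skip the test when `branch_sampling_supported(probe)` is false and only then run `record`. Because both lists come from one `event` argument, the probe cannot succeed for an event that the real run never uses.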
||
Nick Forrington
|
08c1d7a159 |
perf vendor events arm64: Arm Cortex-A78C and X1C
Add PMU events for the Arm Cortex-A78C and Arm Cortex-X1C CPUs. Events for Arm Cortex-A78C match those for Arm Cortex-A78. Events for Arm Cortex-X1C match those for Arm Cortex-X1. As such, this is just a mapfile change. Main ID Register (MIDR) and event data are sourced from the corresponding Arm Technical Reference Manuals: Arm Cortex-A78C: https://developer.arm.com/documentation/102226/ Arm Cortex-X1C: https://developer.arm.com/documentation/101968/ Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Nick Forrington <nick.forrington@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220610174459.615995-1-nick.forrington@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ebcdbf7a6a |
perf vendor events: Update Intel snowridgex
Update to v1.20, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the snowridgex files into perf and update mapfile.csv. Tested on a non-snowridgex with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok This change just updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-31-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6b47be608b |
perf vendor events: Update Intel westmereex
Update to v3, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereex files into perf and update mapfile.csv. This change just aligns whitespace and updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-30-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
4823edd648 |
perf vendor events: Update Intel westmereep-sp
Update to v3, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereep-sp files into perf and update mapfile.csv. This change just aligns whitespace and updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-29-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae2fa1ccf1 |
perf vendor events: Update Intel westmereep-dp
Update to v2, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereep-dp files into perf and update mapfile.csv. This change just aligns whitespace. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-28-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
5e1dd4f24a |
perf vendor events: Update Intel tigerlake
Update to v1.07, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the tigerlake files into perf and update mapfile.csv. Tested on a non-tigerlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-27-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
59fd7d3225 |
perf vendor events: Update Intel skylakex
Update to v1.28, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the skylakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-26-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
35d6527701 |
perf vendor events: Update Intel skylake
Update to v53, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the skylake files into perf and update mapfile.csv. Tested on a non-skylake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
89072caf14 |
perf vendor events: Update Intel silvermont
Update to v14, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the silvermont files into perf and update mapfile.csv. Other than aligning whitespace this change just folds the mapfile.csv entries for silvermont onto one line. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-24-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
34122105f9 |
perf vendor events: Update Intel sapphirerapids
Update to v1.04, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the sapphirerapids files into perf and update mapfile.csv. Tested on a non-sapphirerapids with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-23-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
777e131244 |
perf vendor events: Update Intel sandybridge
Update to v17, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the sandybridge files into perf and update mapfile.csv. Tested on a non-sandybridge with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-22-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8fe33fd5d3 |
perf vendor events: Update Intel nehalemex
Update to v3, there are no TMA metrics for nehalemex. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the nehalemex files into perf and update mapfile.csv. Tested on a non-nehalemex with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Note: most of this change is just sorting the keys in the json dictionaries. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-21-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
bcc344a3bf |
perf vendor events: Update Intel nehalemep
Update to v3, there are no TMA metrics for nehalemep. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the nehalemep files into perf and update mapfile.csv. Tested on a non-nehalemep with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-20-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1ab4ef06fa |
perf vendor events: Add Intel meteorlake
Events are v1.00, there are no metrics yet. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the events and metrics. Manually copy the meteorlake files into perf and update mapfile.csv. Tested on a non-meteorlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-19-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae7bcd600e |
perf vendor events: Update Intel knightslanding
Update to v9, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the knightslanding files into perf and update mapfile.csv. Tested on a non-knightslanding with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Note: uncore-memory has become uncore-other as the topic was determined this way in the conversion scripts. For simplicity the scripts' naming is maintained. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-18-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
376d8b581b |
perf vendor events: Update Intel jaketown
Update to v21, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the jaketown files into perf and update mapfile.csv. Tested on a non-jaketown with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6220136831 |
perf vendor events: Update Intel ivytown
Update to v21, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the ivytown files into perf and update mapfile.csv. Tested on a non-ivytown with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
80c14459f6 |
perf vendor events: Update Intel ivybridge
Update to v22, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the ivybridge files into perf and update mapfile.csv. Tested on a non-ivybridge with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
d214d0c261 |
perf vendor events: Update Intel icelakex
Update to v1.15, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the icelakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
a4a4353ebf |
perf vendor events: Update Intel icelake
Update to v1.14, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the icelake files into perf and update mapfile.csv. Tested on a non-icelake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
859fe0f4f2 |
perf vendor events: Update Intel haswellx
Update to v25, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the haswellx files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Failed 93: perf all PMU test : Ok The test 91 failure is a pre-existing failure on the test system with the metric Load_Miss_Real_Latency which is fixed by prefixing it with --metric-no-group. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8e6389f931 |
perf vendor events: Update Intel haswell
Update to v31, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the haswell files into perf and update mapfile.csv. Tested on a non-haswell with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae54f70dd9 |
perf vendor events: Update goldmontplus mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Correct the version in mapfile.csv. Event json remains at v1.01, there are no goldmontplus metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
beb2db9bed |
perf vendor events: Update goldmont mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Add a missing goldmont cpuid to mapfile.csv. Event json remains at v13, there are no goldmont metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
3c9c315711 |
perf vendor events: Update Intel elkhartlake
Update to v1.03. Elkhartlake metrics aren't in TMA but basic metrics are left unchanged. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the elkhartlake files into perf and update mapfile.csv. Tested on a non-elkhartlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
f9d45862ec |
perf vendor events: Update Intel cascadelakex
Update to v1.16, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the cascadelakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
9709ede1a1 |
perf vendor events: Update bonnell mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Fold the mapfile.csv entries together with a more complex regular expression. This will reduce the pmu-events.c table size. The files following this change are still at v4. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
a95ab294a5 |
perf vendor events: Update Intel alderlake
Update to v1.13, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the alderlake files into perf and update mapfile.csv. Tested on a non-alderlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ef908a1925 |
perf vendor events: Update Intel broadwellde
Update to v7, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwellde files into perf and update mapfile.csv. Tested on a non-broadwellde with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1775634ea4 |
perf vendor events: Update Intel broadwell
Update to v26, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwell files into perf and update mapfile.csv. Tested on a non-broadwell with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
4266081e33 |
perf vendor events: Update Intel broadwellx
Update to v19, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwellx files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
9a24180567 |
perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the empty list test. However, this meant that offsets were computed against null and that a struct list_head was compared against a 'struct bpf_perf_object'. Working around this with clang's undefined behavior sanitizer required -fno-sanitize=null and -fno-sanitize=object-size. Remove the undefined behavior by using the regular Linux list APIs and handling the starting case separately from the end testing case. Looking at uses like bpf_perf_object__for_each(), as the constant NULL or non-NULL argument can be constant propagated, the code is no less efficient. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Christy Lee <christylee@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
882528d2e7 |
perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed. E.g. libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which resides in the '.gnu.warning.getwd' section. Unlike normal sections, such sections are used for linker warnings when a file calls deprecated functions; they are not part of the memory image, so the symbols in these sections should be dropped. This patch checks the SHF_ALLOC bit in the section flags and, if the bit is not set, skips the symbol to avoid spurious ones. Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chang Rui <changruinj@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
2d86612aac |
perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', the tool is observed to report
the wrong offset for global data symbols. This is a common issue on
both x86 and Arm64 platforms.
As an example, below is the disassembly of a test program's .bss
section, dumped with objdump:
...
Disassembly of section .bss:
0000000000004040 <completed.0>:
...
0000000000004080 <buf1>:
...
00000000000040c0 <buf2>:
...
0000000000004100 <thread>:
...
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe the symbol info for the
'buf1' and 'buf2' structures.
# ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf2 0x30a8-0x30e8
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf1 0x3068-0x30a8
...
The perf tool relies on libelf to parse symbols. In executable and
shared object files, 'st_value' holds a virtual address, 'sh_addr' is
the address at which the section's first byte should reside in memory,
and 'sh_offset' is the byte offset from the beginning of the file to
the first byte in the section. The perf tool uses the formula below to
convert a symbol's memory address to a file address:
  file_address = st_value - sh_addr + sh_offset
                 ^
                 ` Memory address
We can see the final adjusted address ranges for buf1 and buf2 are
[0x3068-0x30a8) and [0x30a8-0x30e8) respectively. This is incorrect:
in the code, the structures 'buf1' and 'buf2' are declared with a
compiler attribute for 64-byte alignment.
The problem comes from 'sh_offset': libelf returns it as 0x3028, which
is not 64-byte aligned. Combined with the disassembly, it's likely that
libelf doesn't respect the alignment of the .bss section and therefore
doesn't return an aligned value for 'sh_offset'.
As suggested by Fangrui Song, the ELF file's program header table
contains PT_LOAD segments, and the fields p_vaddr and p_offset in
PT_LOAD segments contain the execution info. A better choice for
converting a memory address to a file address is the formula:
  file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header(), which returns the
program header based on the passed 'st_value'; it then uses the formula
above to calculate the symbol's file address. The debugging log is
updated accordingly.
After applying the change:
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf2 0x30c0-0x3100
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf1 0x3080-0x30c0
...
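As a quick arithmetic check of the two formulas, the sketch below plugs in
only the values printed in the debug logs above. The helper name
file_address() is hypothetical and is not part of perf.

```python
# Worked check of both address conversions, using only the values
# printed in the debug logs above. file_address() is a hypothetical
# helper for illustration; it is not a perf function.

def file_address(st_value, base_vaddr, base_offset):
    """Convert a symbol's memory address to a file address."""
    return st_value - base_vaddr + base_offset

BUF1, BUF2 = 0x4080, 0x40c0

# Old formula: section header fields (sh_addr, sh_offset).
old_buf1 = file_address(BUF1, 0x4040, 0x3028)
old_buf2 = file_address(BUF2, 0x4040, 0x3028)

# New formula: PT_LOAD program header fields (p_vaddr, p_offset).
new_buf1 = file_address(BUF1, 0x3d28, 0x2d28)
new_buf2 = file_address(BUF2, 0x3d28, 0x2d28)

print(hex(old_buf1), hex(old_buf2))  # 0x3068 0x30a8
print(hex(new_buf1), hex(new_buf2))  # 0x3080 0x30c0

# Only the new results keep the 64-byte alignment of the structures.
print(old_buf1 % 64, new_buf1 % 64)  # 40 0
```

The old results are off by 0x18 from the new ones, which matches the
misaligned 'sh_offset' of 0x3028.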
Fixes:
|
||
Leo Yan
|
b226521923 |
perf scripts python: Let script to be python2 compliant
The mainline kernel can be used on relatively old distros, e.g. RHEL 7.
Such a distro has not upgraded from python2 to python3, which causes a
build error because the python script is not python2 compliant.
To fix the build failure, this patch changes from the python f-string
format to the traditional string format.
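A minimal sketch of this kind of change is shown below. The variable names
and the message text are hypothetical and are not taken from the patched
script.

```python
# Hypothetical values for illustration; not taken from the patched script.
name, count = "example", 3

# Python 3.6+ only. Under python2 this line is a SyntaxError, so the
# whole script fails to load:
#   msg = f"{name}: {count} events"

# Portable alternatives that work on both python 2.7 and python3:
msg_percent = "%s: %d events" % (name, count)
msg_format = "{}: {} events".format(name, count)

print(msg_percent)
print(msg_format)
```

Both portable forms produce the same string, so either can replace an
f-string without changing the script's output.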
Fixes:
|
||
Ian Rogers
|
a061a8ad3f |
perf test: Avoid sysfs state affecting fake events
Icelake has a slots event, and on my Skylakex I have CPU events in sysfs of topdown-slots-issued and topdown-total-slots. Legacy event parsing would try to use '-' to separate parts of an event and so perf_pmu__parse_init sets 'slots' to be a PMU_EVENT_SYMBOL_SUFFIX2. As such, parsing the slots event for a fake PMU fails as a PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token. Resolve this issue by having the test initialize the PMU parsing state before every parse. This must be done for every parse as the state is removed after each parse_events. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
bedd17381b |
perf vendor events intel: Update event list for haswellx
Update JSON core/uncore events for haswellx to perf. Based on HSX JSON list v24: https://download.01.org/perfmon/HSX Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
28738de918 |
perf vendor events intel: Update event list for broadwellx
Update JSON core/uncore events for broadwellx to perf. Based on BDX JSON list v19: https://download.01.org/perfmon/BDX Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
b43a5442d8 |
perf vendor events intel: Update event list for Snowridgex
More uncore events are added in the converter tool: https://github.com/intel/event-converter-for-linux-perf Keep both alias and the original name for the events, in case someone already used the alias in their script. Generate the perf events based on Snowridgex(SNR) event list v1.20: https://download.01.org/perfmon/SNR/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
9146af4413 |
perf vendor events intel: Rename tremontx to snowridgex
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
6a92916de5 |
perf vendor events intel: Update event list for Sapphirerapids
Update JSON event list for Sapphirerapids to perf. Based on JSON list v1.02: https://download.01.org/perfmon/SPR/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
5fa2481cdf |
perf vendor events intel: Update event list for Alderlake
Update JSON event list for Alderlake to perf. It is a hybrid event list for both Atom and Core. Based on JSON list v1.11: https://download.01.org/perfmon/ADL/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Colin Ian King
|
8147f79ea5 |
perf inject: Fix spelling mistake "theads" -> "threads"
There is a spelling mistake in a pr_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
acfb65fe1d |
perf kwork: Add workqueue trace BPF support
Implements workqueue trace bpf function.
Test cases:
# perf kwork -k workqueue lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)addrconf_verify_work | 0002 | 5.856 ms | 1 | 5.856 ms | 111994.634313 s | 111994.640169 s |
(w)vmstat_update | 0001 | 1.247 ms | 1 | 1.247 ms | 111996.462651 s | 111996.463899 s |
(w)neigh_periodic_work | 0001 | 1.183 ms | 1 | 1.183 ms | 111996.462789 s | 111996.463973 s |
(w)neigh_managed_work | 0001 | 0.989 ms | 2 | 1.635 ms | 111996.462820 s | 111996.464455 s |
(w)wb_workfn | 0000 | 0.667 ms | 1 | 0.667 ms | 111996.384273 s | 111996.384940 s |
(w)bpf_prog_free_deferred | 0001 | 0.495 ms | 1 | 0.495 ms | 111986.314201 s | 111986.314696 s |
(w)mix_interrupt_randomness | 0002 | 0.421 ms | 6 | 0.749 ms | 111995.927750 s | 111995.928499 s |
(w)vmstat_shepherd | 0000 | 0.374 ms | 2 | 0.385 ms | 111991.265242 s | 111991.265627 s |
(w)e1000_watchdog | 0002 | 0.356 ms | 5 | 0.390 ms | 111994.528380 s | 111994.528770 s |
(w)vmstat_update | 0000 | 0.231 ms | 2 | 0.365 ms | 111996.384407 s | 111996.384772 s |
(w)flush_to_ldisc | 0006 | 0.165 ms | 1 | 0.165 ms | 111995.930606 s | 111995.930771 s |
(w)flush_to_ldisc | 0000 | 0.094 ms | 2 | 0.095 ms | 111996.460453 s | 111996.460548 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)e1000_watchdog | 0002 | 0.627 ms | 2 | 0.324 ms | 112002.720665 s | 112002.720989 s |
(w)flush_to_ldisc | 0007 | 0.598 ms | 2 | 0.534 ms | 112000.875226 s | 112000.875761 s |
(w)wq_barrier_func | 0007 | 0.492 ms | 1 | 0.492 ms | 112000.876981 s | 112000.877473 s |
(w)flush_to_ldisc | 0007 | 0.281 ms | 1 | 0.281 ms | 112005.826882 s | 112005.827163 s |
(w)mix_interrupt_randomness | 0002 | 0.229 ms | 3 | 0.102 ms | 112005.825671 s | 112005.825774 s |
(w)vmstat_shepherd | 0000 | 0.202 ms | 1 | 0.202 ms | 112001.504511 s | 112001.504713 s |
(w)bpf_prog_free_deferred | 0001 | 0.181 ms | 1 | 0.181 ms | 112000.883251 s | 112000.883432 s |
(w)wb_workfn | 0007 | 0.130 ms | 1 | 0.130 ms | 112001.505195 s | 112001.505325 s |
(w)vmstat_update | 0000 | 0.053 ms | 1 | 0.053 ms | 112001.504763 s | 112001.504815 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
5a81927a40 |
perf kwork: Add softirq trace BPF support
Implements softirq trace bpf function. Test cases: Trace softirq latency without filter: # perf kwork -k softirq lat -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0005 | 0.281 ms | 3 | 0.338 ms | 111295.752222 s | 111295.752560 s | (s)RCU:9 | 0002 | 0.262 ms | 24 | 1.400 ms | 111301.335986 s | 111301.337386 s | (s)SCHED:7 | 0005 | 0.177 ms | 14 | 0.212 ms | 111295.752270 s | 111295.752481 s | (s)RCU:9 | 0007 | 0.161 ms | 47 | 2.022 ms | 111295.402159 s | 111295.404181 s | (s)NET_RX:3 | 0003 | 0.149 ms | 12 | 1.261 ms | 111301.192964 s | 111301.194225 s | (s)TIMER:1 | 0001 | 0.105 ms | 9 | 0.198 ms | 111301.180191 s | 111301.180389 s | ... <SNIP> ... (s)NET_RX:3 | 0002 | 0.098 ms | 6 | 0.124 ms | 111295.403760 s | 111295.403884 s | (s)SCHED:7 | 0001 | 0.093 ms | 19 | 0.242 ms | 111301.180256 s | 111301.180498 s | (s)SCHED:7 | 0007 | 0.078 ms | 15 | 0.188 ms | 111300.064226 s | 111300.064415 s | (s)SCHED:7 | 0004 | 0.077 ms | 11 | 0.213 ms | 111301.361759 s | 111301.361973 s | (s)SCHED:7 | 0000 | 0.063 ms | 33 | 0.805 ms | 111295.401811 s | 111295.402616 s | (s)SCHED:7 | 0003 | 0.063 ms | 14 | 0.085 ms | 111301.192255 s | 111301.192340 s | -------------------------------------------------------------------------------------------------------------------------------- Trace softirq latency with cpu filter: # perf kwork -k softirq lat -b -C 1 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0001 | 0.178 ms | 5 | 0.572 ms | 111435.534135 s | 111435.534707 s | 
-------------------------------------------------------------------------------------------------------------------------------- Trace softirq latency with name filter: # perf kwork -k softirq lat -b -n SCHED Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)SCHED:7 | 0001 | 0.295 ms | 15 | 2.183 ms | 111452.534950 s | 111452.537133 s | (s)SCHED:7 | 0002 | 0.215 ms | 10 | 0.315 ms | 111460.000238 s | 111460.000553 s | (s)SCHED:7 | 0005 | 0.190 ms | 29 | 0.338 ms | 111457.032538 s | 111457.032876 s | (s)SCHED:7 | 0003 | 0.097 ms | 10 | 0.319 ms | 111452.434351 s | 111452.434670 s | (s)SCHED:7 | 0006 | 0.089 ms | 1 | 0.089 ms | 111450.737450 s | 111450.737539 s | (s)SCHED:7 | 0007 | 0.085 ms | 17 | 0.169 ms | 111452.471333 s | 111452.471502 s | (s)SCHED:7 | 0004 | 0.071 ms | 15 | 0.221 ms | 111452.535252 s | 111452.535473 s | (s)SCHED:7 | 0000 | 0.044 ms | 32 | 0.130 ms | 111460.001982 s | 111460.002112 s | -------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com [ Add {} for multiline if blocks ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
420298aefe |
perf kwork: Add IRQ trace BPF support
Implements irq trace bpf function. Test cases: Trace irq without filter: # perf kwork -k irq rep -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 31.026 ms | 285 | 1.493 ms | 110326.049963 s | 110326.051456 s | eth0:10 | 0002 | 7.875 ms | 96 | 1.429 ms | 110313.916835 s | 110313.918264 s | ata_piix:14 | 0002 | 2.510 ms | 28 | 0.396 ms | 110331.367987 s | 110331.368383 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with cpu filter: # perf kwork -k irq rep -b -C 0 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 34.288 ms | 282 | 2.061 ms | 110358.078968 s | 110358.081029 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with name filter: # perf kwork -k irq rep -b -n eth0 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- eth0:10 | 0002 | 2.184 ms | 21 | 0.572 ms | 110386.541699 s | 110386.542271 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with summary: # perf kwork -k irq rep -b -S Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total 
Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 42.923 ms | 285 | 1.181 ms | 110418.128867 s | 110418.130049 s | eth0:10 | 0002 | 2.085 ms | 20 | 0.668 ms | 110416.002935 s | 110416.003603 s | ata_piix:14 | 0002 | 0.970 ms | 4 | 0.656 ms | 110424.034482 s | 110424.035138 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 309 Total runtime (msec) : 45.977 (0.003% load average) Total time span (msec) : 17017.655 -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k irq rep -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- nvme0q20:145 | 0019 | 0.570 ms | 28 | 0.064 ms | 26966.635102 s | 26966.635167 s | amdgpu:162 | 0002 | 0.568 ms | 29 | 0.068 ms | 26966.644346 s | 26966.644414 s | nvme0q4:129 | 0003 | 0.565 ms | 31 | 0.037 ms | 26966.614830 s | 26966.614866 s | nvme0q16:141 | 0015 | 0.205 ms | 66 | 0.012 ms | 26967.145161 s | 26967.145174 s | nvme0q29:154 | 0028 | 0.154 ms | 44 | 0.014 ms | 26967.078970 s | 26967.078984 s | nvme0q10:135 | 0009 | 0.134 ms | 43 | 0.011 ms | 26967.132093 s | 26967.132104 s | nvme0q2:127 | 0001 | 0.132 ms | 26 | 0.011 ms | 26966.883584 s | 26966.883595 s | nvme0q25:150 | 0024 | 0.127 ms | 32 | 0.014 ms | 26966.631419 s | 26966.631433 s | nvme0q14:139 | 0013 | 0.110 ms | 21 | 0.017 ms | 26966.760843 s | 26966.760861 s | nvme0q30:155 | 0029 | 0.102 ms | 30 | 0.022 ms | 26966.677171 s | 26966.677193 s | nvme0q13:138 | 0012 | 0.088 ms | 
20 | 0.015 ms | 26966.738733 s | 26966.738748 s | nvme0q6:131 | 0005 | 0.087 ms | 13 | 0.020 ms | 26966.648445 s | 26966.648465 s | nvme0q28:153 | 0027 | 0.066 ms | 12 | 0.015 ms | 26966.771431 s | 26966.771447 s | nvme0q26:151 | 0025 | 0.060 ms | 13 | 0.012 ms | 26966.704266 s | 26966.704278 s | nvme0q21:146 | 0020 | 0.054 ms | 20 | 0.011 ms | 26967.322082 s | 26967.322094 s | nvme0q1:126 | 0000 | 0.046 ms | 11 | 0.013 ms | 26966.859754 s | 26966.859767 s | nvme0q17:142 | 0016 | 0.046 ms | 10 | 0.011 ms | 26967.114513 s | 26967.114524 s | xhci_hcd:74 | 0015 | 0.041 ms | 3 | 0.016 ms | 26967.086004 s | 26967.086020 s | nvme0q8:133 | 0007 | 0.039 ms | 12 | 0.008 ms | 26966.712056 s | 26966.712063 s | nvme0q32:157 | 0031 | 0.036 ms | 10 | 0.014 ms | 26966.627054 s | 26966.627068 s | nvme0q9:134 | 0008 | 0.036 ms | 11 | 0.011 ms | 26967.258452 s | 26967.258462 s | nvme0q7:132 | 0006 | 0.024 ms | 3 | 0.014 ms | 26966.767404 s | 26966.767418 s | nvme0q11:136 | 0010 | 0.023 ms | 5 | 0.006 ms | 26966.935455 s | 26966.935461 s | nvme0q31:156 | 0030 | 0.018 ms | 5 | 0.006 ms | 26966.627517 s | 26966.627524 s | nvme0q12:137 | 0011 | 0.015 ms | 2 | 0.014 ms | 26966.799588 s | 26966.799602 s | enp5s0-rx-0:164 | 0006 | 0.009 ms | 2 | 0.005 ms | 26966.742024 s | 26966.742028 s | enp5s0-rx-1:165 | 0007 | 0.006 ms | 2 | 0.004 ms | 26966.939486 s | 26966.939490 s | enp5s0-tx-0:166 | 0008 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s | enp5s0-tx-1:167 | 0009 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s | -------------------------------------------------------------------------------------------------------------------------------- #t Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul 
Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
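The `-S` summary in the IRQ commit above prints a "load average" figure next to the total runtime. As a rough check, and assuming (this is inferred from the printed numbers, not stated in the commit message) that the figure is simply total runtime divided by total time span, the arithmetic can be reproduced with the sketch below. `load_figure` is a hypothetical helper name, not a perf function.

```python
def load_figure(total_runtime_ms: float, time_span_ms: float) -> float:
    """Return total runtime divided by the traced time span, as a plain ratio.

    Assumption: this mirrors how the summary figure is derived; it is an
    inference from the sample output, not taken from the perf source.
    """
    if time_span_ms <= 0:
        raise ValueError("time span must be positive")
    return total_runtime_ms / time_span_ms


# Values copied from the `perf kwork -k irq rep -b -S` summary above.
irq_ratio = load_figure(45.977, 17017.655)
print(f"{irq_ratio:.3f}")
```

Under that assumption the result rounds to the 0.003 shown in the summary. Note that this is the unscaled ratio; expressed as a percentage the same share would be about 0.27.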
Yang Jihong
|
daf07d2207 |
perf kwork: Implement BPF trace
'perf record' writes its samples to perf.data, which generates extra disk interrupts, and the amount of data to collect grows with time. Tracing with eBPF processes the data in the kernel, which avoids both problems.
Add a -b/--use-bpf option to 'latency' and 'report' to support tracing kwork events using eBPF:
1. Create the BPF program and attach it to tracepoints,
2. Start tracing once the command is entered,
3. When the user hits Ctrl+C, stop tracing and report,
4. Support CPU and name filtering.
This commit implements the framework code and does not add support for any specific event class.
Test cases:
# perf kwork rep -h
Usage: perf kwork report [<options>]
-b, --use-bpf Use BPF to measure kwork runtime
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]> sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-b, --use-bpf Use BPF to measure kwork latency
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]> sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -b
Unsupported bpf trace class irq
# perf kwork rep -b
Unsupported bpf trace class irq
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
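The idea behind the -b/--use-bpf commit above is to aggregate events while tracing rather than writing every sample to perf.data. A minimal user-space sketch of that aggregation, with CPU and name filtering, might look like the following. It is an illustration only: `WorkStats`, `aggregate`, the event tuple layout and the exact-match name filter are all assumptions, and the real work happens in BPF programs attached to tracepoints.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical event layout: (work name, cpu, start time in ns, end time in ns).
Event = Tuple[str, int, int, int]


@dataclass
class WorkStats:
    total_runtime: int = 0  # nanoseconds
    count: int = 0
    max_runtime: int = 0    # nanoseconds


def aggregate(
    events: Iterable[Event],
    cpus: Optional[Set[int]] = None,
    name: Optional[str] = None,
) -> Dict[Tuple[str, int], WorkStats]:
    """Fold events into per-(name, cpu) statistics instead of storing them."""
    stats: Dict[Tuple[str, int], WorkStats] = {}
    for work_name, cpu, start, end in events:
        if cpus is not None and cpu not in cpus:
            continue  # CPU filter, in the spirit of -C
        if name is not None and work_name != name:
            continue  # name filter, in the spirit of -n (exact match assumed)
        entry = stats.setdefault((work_name, cpu), WorkStats())
        runtime = end - start
        entry.total_runtime += runtime
        entry.count += 1
        entry.max_runtime = max(entry.max_runtime, runtime)
    return stats


# Made-up sample events, for illustration only.
sample = [
    ("eth0", 2, 1000, 1500),
    ("eth0", 2, 2000, 2200),
    ("virtio0-requests", 0, 100, 400),
    ("ata_piix", 2, 50, 90),
]
print(aggregate(sample, name="eth0"))
```

Only one small record per (name, cpu) pair is kept no matter how long tracing runs, which is why this style of collection does not grow with time.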
Yang Jihong
|
bcc8b3e88d |
perf kwork: Implement perf kwork timehist
Implements framework of perf kwork timehist, to provide an analysis of kernel work events. Test cases: # perf kwork tim Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111 91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073 91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409 91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114 91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058 91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110 91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059 91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726 91576.069364 91576.069420 [0002] (s)RCU:9 0.056 0.082 91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111 91576.070147 91576.070206 [0000] (s)SCHED:7 0.060 0.057 91576.073147 91576.073202 [0000] (s)SCHED:7 0.054 0.060 <SNIP> # perf kwork tim --max-stack 2 -g Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073 irq_exit_rcu <- sysvec_call_function_single 91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058 irq_exit_rcu <- sysvec_call_function_single 91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059 irq_exit_rcu <- sysvec_call_function_single 91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.069364 
91576.069420 [0002] (s)RCU:9 0.056 0.082 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt <SNIP> Committer testing: # perf kwork -k workqueue timehist | head -40 Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 26520.211825 26520.211832 [0019] (w)free_work 0.007 0.004 26520.212929 26520.212934 [0020] (w)free_work 0.005 0.004 26520.213226 26520.213228 [0014] (w)kfree_rcu_work 0.002 0.004 26520.214057 26520.214061 [0021] (w)free_work 0.004 0.004 26520.221239 26520.221241 [0007] (w)kfree_rcu_work 0.002 0.009 26520.223232 26520.223238 [0013] (w)psi_avgs_work 0.005 0.006 26520.230057 26520.230060 [0020] (w)free_work 0.003 0.003 26520.270428 26520.270434 [0015] (w)free_work 0.006 0.004 26520.270546 26520.270550 [0014] (w)free_work 0.004 0.003 26520.281626 26520.281629 [0015] (w)free_work 0.003 0.002 26520.287225 26520.287230 [0012] (w)psi_avgs_work 0.005 0.008 26520.287231 26520.287235 [0001] (w)psi_avgs_work 0.004 0.011 26520.287236 26520.287239 [0001] (w)psi_avgs_work 0.003 0.012 26520.329488 26520.329492 [0024] (w)free_work 0.004 0.004 26520.330600 26520.330605 [0007] (w)free_work 0.005 0.004 26520.334218 26520.334218 [0007] (w)kfree_rcu_monitor 0.001 0.002 26520.335220 26520.335221 [0005] (w)kfree_rcu_monitor 0.001 0.004 26520.343980 26520.343985 [0007] (w)free_work 0.005 0.002 26520.345093 26520.345097 [0006] (w)free_work 0.004 0.003 26520.351233 26520.351238 [0027] (w)psi_avgs_work 0.005 0.008 26520.353228 26520.353229 [0007] (w)kfree_rcu_work 0.001 0.002 26520.353229 26520.353231 [0005] (w)kfree_rcu_work 0.001 0.006 26520.382381 26520.382383 [0006] (w)free_work 0.003 0.002 26520.386547 26520.386548 [0006] (w)free_work 0.002 0.001 26520.391243 26520.391245 [0015] (w)console_callback 0.002 0.016 26520.415369 26520.415621 [0027] 
(w)btrfs_work_helper 0.252 26520.415351 26520.416174 [0002] (w)btrfs_work_helper 0.823 0.037 26520.415343 26520.416304 [0031] (w)btrfs_work_helper 0.961 26520.415335 26520.417078 [0001] (w)btrfs_work_helper 1.743 26520.415250 26520.417564 [0002] (w)wb_workfn 2.314 26520.424777 26520.424787 [0002] (w)btrfs_work_helper 0.010 26520.424788 26520.424798 [0002] (w)btrfs_work_helper 0.010 26520.424790 26520.424805 [0001] (w)btrfs_work_helper 0.016 0.016 26520.424801 26520.424807 [0002] (w)btrfs_work_helper 0.006 26520.424809 26520.424831 [0002] (w)btrfs_work_helper 0.022 0.030 26520.424824 26520.424835 [0027] (w)btrfs_work_helper 0.011 26520.424809 26520.424867 [0001] (w)btrfs_work_helper 0.059 0.032 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-14-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
53e49e32ae |
perf kwork: Add workqueue latency support
Implements workqueue latency function.
Test cases:
# perf kwork -k workqueue lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)vmstat_update | 0001 | 5.004 ms | 1 | 5.004 ms | 44001.745646 s | 44001.750650 s |
(w)vmstat_update | 0006 | 1.773 ms | 1 | 1.773 ms | 44000.830840 s | 44000.832613 s |
(w)vmstat_shepherd | 0000 | 0.992 ms | 8 | 2.474 ms | 44007.717845 s | 44007.720318 s |
(w)vmstat_update | 0000 | 0.974 ms | 5 | 2.624 ms | 44004.785970 s | 44004.788594 s |
(w)e1000_watchdog | 0002 | 0.687 ms | 5 | 2.632 ms | 44005.009334 s | 44005.011966 s |
(w)vmstat_update | 0002 | 0.307 ms | 1 | 0.307 ms | 44004.817395 s | 44004.817702 s |
(w)vmstat_update | 0004 | 0.296 ms | 1 | 0.296 ms | 43997.913677 s | 43997.913973 s |
(w)mix_interrupt_randomness | 0000 | 0.283 ms | 285 | 3.724 ms | 44006.790889 s | 44006.794613 s |
(w)neigh_managed_work | 0001 | 0.271 ms | 1 | 0.271 ms | 43997.665542 s | 43997.665813 s |
(w)vmstat_update | 0005 | 0.261 ms | 1 | 0.261 ms | 44007.820542 s | 44007.820803 s |
(w)neigh_managed_work | 0004 | 0.220 ms | 1 | 0.220 ms | 44002.953287 s | 44002.953507 s |
(w)neigh_periodic_work | 0004 | 0.217 ms | 1 | 0.217 ms | 43999.929718 s | 43999.929935 s |
(w)mix_interrupt_randomness | 0002 | 0.199 ms | 5 | 0.310 ms | 44005.012316 s | 44005.012625 s |
(w)vmstat_update | 0003 | 0.199 ms | 4 | 0.307 ms | 44005.714391 s | 44005.714699 s |
(w)gc_worker | 0001 | 0.071 ms | 173 | 1.128 ms | 44002.062579 s | 44002.063707 s |
--------------------------------------------------------------------------------------------------------------------------------
INFO: 0.020% skipped events (17 including 10 raise, 7 entry, 0 exit)
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-13-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
19807bba5a |
perf kwork: Add softirq latency support
Implements softirq latency function. Test cases: # perf kwork -k softirq lat Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s | (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)SCHED:7 | 0000 | 0.158 ms | 677 | 2.639 ms | 44004.785716 s | 44004.788355 s | ... <SNIP> ... (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | (s)TIMER:1 | 0005 | 0.128 ms | 1 | 0.128 ms | 44007.820346 s | 44007.820474 s | (s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -C 1,2 Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)NET_RX:3 | 0002 | 0.106 ms | 5 | 0.163 ms | 44005.012255 s | 44005.012418 s | (s)TIMER:1 | 0002 | 0.084 ms | 9 | 0.114 ms | 44005.009168 s | 44005.009282 s | (s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s | (s)SCHED:7 | 0002 | 0.040 ms 
| 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -n RCU Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 s | 43997.683810 s | (s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -s count,avg -n RCU Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 
s | 43997.683810 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat --time 43997, Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s | (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)TIMER:1 | 0004 | 0.083 ms | 21 | 0.127 ms | 44004.969171 s | 44004.969298 s | ... <SNIP> ... (s)SCHED:7 | 0005 | 0.050 ms | 4 | 0.086 ms | 43997.684852 s | 43997.684938 s | (s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s | (s)SCHED:7 | 0007 | 0.044 ms | 171 | 0.077 ms | 43997.943265 s | 43997.943342 s | (s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-12-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
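The `-s count,avg` test case in the softirq latency commit above orders rows by several keys. A small sketch of that kind of multi-key ordering is shown below, using the RCU rows from the sample output. The descending order is inferred from that output, and `sort_rows` and `SORT_FIELDS` are hypothetical names rather than anything in perf.

```python
from typing import Dict, List, Sequence

# Sort keys listed in the `perf kwork latency` usage text.
SORT_FIELDS = ("avg", "max", "count")

# (s)RCU:9 rows copied from the `-n RCU` sample output (delays in ms).
rcu_rows: List[Dict[str, float]] = [
    {"cpu": 6, "avg": 0.675, "count": 7, "max": 1.328},
    {"cpu": 0, "avg": 0.414, "count": 701, "max": 3.996},
    {"cpu": 5, "avg": 0.245, "count": 88, "max": 1.866},
    {"cpu": 4, "avg": 0.237, "count": 26, "max": 0.792},
    {"cpu": 7, "avg": 0.217, "count": 140, "max": 1.335},
    {"cpu": 1, "avg": 0.216, "count": 1619, "max": 3.659},
    {"cpu": 2, "avg": 0.141, "count": 932, "max": 1.662},
    {"cpu": 3, "avg": 0.129, "count": 2193, "max": 1.507},
]


def sort_rows(rows: Sequence[Dict[str, float]], keys: Sequence[str]) -> List[Dict[str, float]]:
    """Sort rows by the given keys in priority order, largest values first."""
    unknown = [k for k in keys if k not in SORT_FIELDS]
    if unknown:
        raise ValueError(f"Unknown sort key: {unknown[0]}")
    # A tuple key compares the first field, then breaks ties with the next one.
    return sorted(rows, key=lambda row: tuple(row[k] for k in keys), reverse=True)


by_count = [row["cpu"] for row in sort_rows(rcu_rows, ["count", "avg"])]
print(by_count)
```

With these rows the resulting CPU order matches the `-s count,avg -n RCU` table, and sorting by `avg` alone gives the order of the plain `-n RCU` table.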
Yang Jihong
|
ad3d9f7a92 |
perf kwork: Implement perf kwork latency
Implement the framework of 'perf kwork latency', which reports timing properties such as delay time and frequency.
Test cases:
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]> sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -C 199
Requested CPU 199 too large. Consider raising MAX_NR_CPUS
Invalid cpu bitmap
# perf kwork lat -i perf_no_exist.data
failed to open perf_no_exist.data: No such file or directory
# perf kwork lat -s avg1
Error: Unknown --sort key: `avg1'
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]> sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat --time FFFF,
Invalid time span
# perf kwork lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit)
Since there are no latency-enabled events, the output is empty.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
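The latency framework above reports average delay, count and maximum delay, and its INFO line counts skipped "raise" and "entry" events. Assuming (this is a reading of those column and event names, not something the commit message spells out) that a delay is the time from a work item's raise event to its entry event, the bookkeeping can be sketched as follows. `latency_report` and the event tuple layout are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical event layout: (kind, work name, cpu, timestamp in ns),
# where kind is "raise" or "entry".
Event = Tuple[str, str, int, int]
Key = Tuple[str, int]


def latency_report(events: Iterable[Event]) -> Tuple[Dict[Key, Dict[str, float]], int]:
    """Pair each entry with the preceding raise and summarise the delays."""
    pending: Dict[Key, int] = {}                # last unmatched raise per (name, cpu)
    totals: Dict[Key, Tuple[int, int, int]] = {}  # (sum of delays, count, max delay)
    skipped = 0
    for kind, name, cpu, timestamp in events:
        key = (name, cpu)
        if kind == "raise":
            pending[key] = timestamp            # a repeated raise replaces the old one
        elif kind == "entry":
            raised = pending.pop(key, None)
            if raised is None:
                skipped += 1                    # entry with no matching raise
                continue
            delay = timestamp - raised
            total, count, longest = totals.get(key, (0, 0, 0))
            totals[key] = (total + delay, count + 1, max(longest, delay))
    report = {
        key: {"avg": total / count, "count": count, "max": longest}
        for key, (total, count, longest) in totals.items()
    }
    return report, skipped


# Made-up sample events, for illustration only.
sample_events = [
    ("raise", "RCU", 0, 100),
    ("entry", "RCU", 0, 300),
    ("entry", "RCU", 0, 400),  # no pending raise, so it is skipped
    ("raise", "RCU", 0, 500),
    ("entry", "RCU", 0, 600),
]
report, skipped = latency_report(sample_events)
print(report, skipped)
```

A trace that contains only entry events, as in the last test case above, would leave the report empty and count every entry as skipped.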
Yang Jihong
|
8dbc3c8689 |
perf kwork: Add workqueue report support
Implements workqueue report function. Test cases: # perf kwork -k workqueue rep Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s | (w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s | (w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s | (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s | (w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k workqueue rep -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | 
-------------------------------------------------------------------------------------------------------------------------------- (w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s | (w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s | (w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s | (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s | (w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 500 Total runtime (msec) : 1945.085 (0.192% load average) Total time span (msec) : 10155.026 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k workqueue rep -n vmstat_update Kwork Name | Cpu | Total Runtime | 
Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k workqueue rep -C 1 | head -20 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)commit_work | 0001 | 25.896 ms | 2 | 13.200 ms | 26522.906700 s | 26522.919900 s | (w)commit_work | 0001 | 13.316 ms | 1 | 13.316 ms | 26522.573246 s | 26522.586562 s | (w)commit_work | 0001 | 13.177 ms | 1 | 13.177 ms | 26522.673406 s | 26522.686583 s | (w)commit_work | 0001 | 12.630 ms | 1 | 12.630 ms | 26522.123921 s | 26522.136551 s | (w)btrfs_work_helper | 0001 | 3.544 ms | 1 | 3.544 ms | 26529.131296 s | 26529.134840 s | (w)btrfs_work_helper | 0001 | 3.330 ms | 1 | 3.330 ms | 26529.137698 s | 26529.141028 s | (w)btrfs_work_helper | 0001 | 2.855 ms | 1 | 2.855 ms | 26529.134842 s | 26529.137697 s | (w)btrfs_work_helper | 0001 | 2.757 ms | 1 | 2.757 ms | 26529.124086 s | 26529.126843 s | 
(w)btrfs_work_helper | 0001 | 2.182 ms | 1 | 2.182 ms | 26529.141030 s | 26529.143212 s | (w)btrfs_work_helper | 0001 | 1.743 ms | 1 | 1.743 ms | 26520.415335 s | 26520.417078 s | (w)btrfs_work_helper | 0001 | 1.499 ms | 1 | 1.499 ms | 26529.127774 s | 26529.129272 s | (w)btrfs_work_helper | 0001 | 1.446 ms | 1 | 1.446 ms | 26529.129848 s | 26529.131294 s | (w)btrfs_work_helper | 0001 | 1.373 ms | 1 | 1.373 ms | 26523.808270 s | 26523.809643 s | (w)wb_workfn | 0001 | 1.165 ms | 2 | 0.763 ms | 26527.071056 s | 26527.071819 s | (w)btrfs_work_helper | 0001 | 0.926 ms | 1 | 0.926 ms | 26529.126846 s | 26529.127771 s | (w)btrfs_work_helper | 0001 | 0.571 ms | 1 | 0.571 ms | 26529.129275 s | 26529.129846 s | (w)wb_workfn | 0001 | 0.525 ms | 1 | 0.525 ms | 26522.975151 s | 26522.975676 s | # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-10-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
4c14819169 |
perf kwork: Add softirq report support
Implements softirq kwork report function. Test cases: # perf kwork -k softirq rep Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | ... <SNIP> ... (s)RCU:9 | 0004 | 0.830 ms | 26 | 0.058 ms | 43997.666418 s | 43997.666476 s | (s)TIMER:1 | 0001 | 0.471 ms | 4 | 0.158 ms | 44007.834694 s | 44007.834852 s | (s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s | (s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | -------------------------------------------------------------------------------------------------------------------------------- # # perf kwork -k softirq rep -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0000 | 63.631 ms | 680 | 2.690 ms | 44006.721976 s | 44006.724666 s | ... <SNIP> ... 
(s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | (s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s | (s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 12748 Total runtime (msec) : 661.433 (0.065% load average) Total time span (msec) : 10176.441 -------------------------------------------------------------------------------------------------------------------------------- # # perf kwork -k softirq rep -s count,max Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)SCHED:7 | 0002 | 50.039 ms | 1731 | 0.074 ms | 44005.009447 s | 44005.009521 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | ... <SNIP> ... 
(s)RCU:9 | 0002 | 35.241 ms | 932 | 0.407 ms | 44005.009541 s | 44005.009949 s | (s)RCU:9 | 0000 | 45.710 ms | 702 | 1.144 ms | 44004.787023 s | 44004.788167 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k softirq report -C 2 -s count,max Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)SCHED:7 | 0002 | 0.980 ms | 159 | 0.024 ms | 26035.571037 s | 26035.571061 s | (s)RCU:9 | 0002 | 0.124 ms | 88 | 0.021 ms | 26035.177050 s | 26035.177071 s | (s)TIMER:1 | 0002 | 0.122 ms | 56 | 0.007 ms | 26035.468045 s | 26035.468052 s | -------------------------------------------------------------------------------------------------------------------------------- # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-9-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
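In the `-s count,max` run above, rows are listed by descending count, and rows with equal counts (the two count-1 TIMER rows at the end) are ordered by descending max runtime. A minimal sketch of such a multi-key comparator follows. `struct sketch_row` and every `sketch_*` name are hypothetical and used only for illustration; this is not the code in builtin-kwork.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical report row; field names are illustrative only. */
struct sketch_row {
	uint64_t count;		/* number of times the work ran */
	uint64_t max_ns;	/* longest single runtime, in nanoseconds */
};

/* Descending order: returns <0 when a is larger, so a sorts first. */
static int sketch_cmp_desc(uint64_t a, uint64_t b)
{
	return (a < b) - (a > b);
}

/* qsort() comparator: primary key is count, secondary key is max runtime. */
static int sketch_cmp_count_max(const void *pa, const void *pb)
{
	const struct sketch_row *a = pa;
	const struct sketch_row *b = pb;
	int ret = sketch_cmp_desc(a->count, b->count);

	return ret ? ret : sketch_cmp_desc(a->max_ns, b->max_ns);
}

static void sketch_sort_rows(struct sketch_row *rows, size_t nr)
{
	qsort(rows, nr, sizeof(*rows), sketch_cmp_count_max);
}
```

Chaining comparators this way means each extra key in a list such as `count,max` is consulted only when all earlier keys compare equal.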
||
Yang Jihong
|
94348520c6 |
perf kwork: Add irq report support
Implements irq kwork report function. Test cases: # perf kwork record -- sleep 10 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 6.134 MB perf.data ] # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 2 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 3 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -i perf.data Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | 
-------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -s max,freq Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 18289 Total runtime (msec) : 1167.686 (0.115% load average) Total time span (msec) : 10159.155 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report --time 44005, Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 402.173 ms | 4695 | 0.981 ms | 44007.831992 s | 44007.832973 s | eth0:10 | 0002 | 0.089 ms | 2 | 0.058 ms | 44005.012222 s | 44005.012280 s | 
-------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- nvme0q5:130 | 0004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s | amdgpu:162 | 0002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s | nvme0q24:149 | 0023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s | nvme0q20:145 | 0019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s | nvme0q31:156 | 0030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s | nvme0q8:133 | 0007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s | nvme0q6:131 | 0005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s | nvme0q19:144 | 0018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s | nvme0q7:132 | 0006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s | nvme0q18:143 | 0017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s | nvme0q17:142 | 0016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s | enp5s0-rx-0:164 | 0006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s | enp5s0-tx-0:166 | 0008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s | -------------------------------------------------------------------------------------------------------------------------------- # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: 
https://lore.kernel.org/r/20220709015033.38326-8-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
f98919ec4f |
perf kwork: Implement 'report' subcommand
Implements framework of 'perf kwork report', which is used to report time properties such as run time and frequency: Test cases: # perf kwork Usage: perf kwork [<options>] {record|report} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, etc) -v, --verbose be more verbose (show symbol address, etc) # perf kwork report -h Usage: perf kwork report [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- Total count : 0 Total runtime (msec) : 0.000 (0.000% load average) Total time span (msec) : 0.000 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 0,100 Requested CPU 100 too large. 
Consider raising MAX_NR_CPUS Invalid cpu bitmap # perf kwork report -s runtime1 Error: Unknown --sort key: `runtime1' Usage: perf kwork report [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork report -i perf_no_exist.data failed to open perf_no_exist.data: No such file or directory # perf kwork report --time 00FFF, Invalid time span Since no events support report yet, the output is empty. A brief description of the data structure: 1. "class" indicates event type. For example, irq and softirq correspond to different types. 2. "cluster" refers to a specific event corresponding to a type. For example, RCU and TIMER in softirq correspond to different clusters, each of which contains three types of events: raise, entry, and exit. 3. "atom" includes time of each sample and sample of the previous phase. (For example, exit corresponds to entry, which is used for timehist.) Committer notes: - Add {} for multiline if blocks. - report_print_work() should either return that ret variable that accounts how many bytes were printed or stop accounting and be void. Do the former for now to avoid this: builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable] int ret = 0; ^ 1 error generated. 
When building with: ⬢[acme@toolbox perf]$ clang --version clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076) Also: - if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) { + if (dst_type < KWORK_TRACE_MAX) { Several versions of clang and at least this gcc: 3 51.40 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0) builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is always true [-Werror,-Wtautological-compare] if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) { As the first entry in an enum is zero. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
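The three-level structure described in this commit message (class, cluster, atom) and the simplified range check from the committer notes can be sketched as follows. This is only an illustration under stated assumptions: all `sketch_*` and `SKETCH_*` names are hypothetical, the fields are guesses based on the report columns (Total Runtime, Count, Max runtime, Max runtime start/end), and none of it is taken from builtin-kwork.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative phases; the commit message names raise, entry and exit. */
enum sketch_trace_type {
	SKETCH_TRACE_RAISE,
	SKETCH_TRACE_ENTRY,
	SKETCH_TRACE_EXIT,
	SKETCH_TRACE_MAX,
};

/* "atom": time of one sample plus the time of the previous phase. */
struct sketch_atom {
	uint64_t time_ns;
	uint64_t prev_time_ns;
};

/* "cluster": one specific work item within a class, e.g. RCU or TIMER. */
struct sketch_cluster {
	const char *name;
	uint64_t nr;		/* Count column */
	uint64_t total_ns;	/* Total Runtime column */
	uint64_t max_ns;	/* Max runtime column */
	uint64_t max_start_ns;	/* Max runtime start column */
	uint64_t max_end_ns;	/* Max runtime end column */
};

/* "class": an event type such as irq or softirq, owning its clusters. */
struct sketch_class {
	const char *name;
	struct sketch_cluster *clusters;
	size_t nr_clusters;
};

/* No ">= 0" test: the first enumerator is zero, so it would always be true. */
static bool sketch_trace_type_valid(enum sketch_trace_type type)
{
	return type < SKETCH_TRACE_MAX;
}

/* Fold one exit atom (paired with its entry time) into the cluster totals. */
static void sketch_cluster_account(struct sketch_cluster *c,
				   const struct sketch_atom *exit_atom)
{
	uint64_t delta = exit_atom->time_ns - exit_atom->prev_time_ns;

	c->nr++;
	c->total_ns += delta;
	if (delta > c->max_ns) {
		c->max_ns = delta;
		c->max_start_ns = exit_atom->prev_time_ns;
		c->max_end_ns = exit_atom->time_ns;
	}
}
```

Keeping the previous phase's timestamp inside the atom lets a runtime be computed from a single exit record, without searching for the matching entry.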
||
Yang Jihong
|
97179d9d08 |
perf kwork: Add workqueue kwork record support
Record workqueue events workqueue:workqueue_activate_work, workqueue:workqueue_execute_start & workqueue:workqueue_execute_end. Test cases: Record all events: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.857 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit irq:softirq_raise irq:softirq_entry irq:softirq_exit workqueue:workqueue_activate_work workqueue:workqueue_execute_start workqueue:workqueue_execute_end dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Record workqueue events: # perf kwork -k workqueue record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.081 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date workqueue:workqueue_activate_work workqueue:workqueue_execute_start workqueue:workqueue_execute_end dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Committer testing: # perf kwork record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 3.430 MB perf.data (24130 samples) ] # perf evlist -v irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: 
IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_activate_work: type: 2, size: 128, config: 0x106, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_execute_start: type: 2, size: 128, config: 0x105, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_execute_end: type: 2, size: 128, config: 0x104, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # perf script | grep workqueue | head swapper 0 [018] 26035.043289: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368 kworker/18:2-ev 70440 [018] 26035.043293: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work kworker/18:2-ev 70440 [018] 26035.043301: workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work swapper 0 [021] 26035.044704: workqueue:workqueue_activate_work: work struct 0xffff8b8ffef6e368 kworker/21:0-ev 4080535 [021] 26035.044709: workqueue:workqueue_execute_start: work struct 0xffff8b8ffef6e368: function free_work kworker/21:0-ev 4080535 [021] 
26035.044716: workqueue:workqueue_execute_end: work struct 0xffff8b8ffef6e368: function free_work swapper 0 [018] 26035.045230: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368 kworker/18:2-ev 70440 [018] 26035.045232: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work kworker/18:2-ev 70440 [018] 26035.045235: workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work swapper 0 [001] 26035.052046: workqueue:workqueue_activate_work: work struct 0xffff8b8108901590 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-5-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
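The record commits in this series each attach a fixed set of tracepoints to a kwork type, as the `perf evlist` outputs show. A minimal sketch of that mapping as a lookup table follows. The tracepoint names come from the commit messages; the table layout and every `sketch_*` name are hypothetical and do not describe how builtin-kwork.c actually stores them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a kwork type and its tracepoints. */
struct sketch_kwork_class {
	const char *name;
	const char *tracepoints[4];	/* NULL-terminated */
};

static const struct sketch_kwork_class sketch_classes[] = {
	{ "irq", { "irq:irq_handler_entry", "irq:irq_handler_exit", NULL } },
	{ "softirq", { "irq:softirq_raise", "irq:softirq_entry",
		       "irq:softirq_exit", NULL } },
	{ "workqueue", { "workqueue:workqueue_activate_work",
			 "workqueue:workqueue_execute_start",
			 "workqueue:workqueue_execute_end", NULL } },
};

/* Look up a kwork type by name, as a -k style option would; NULL if unknown. */
static const struct sketch_kwork_class *sketch_find_class(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_classes) / sizeof(sketch_classes[0]); i++) {
		if (!strcmp(sketch_classes[i].name, name))
			return &sketch_classes[i];
	}
	return NULL;
}

/* Count the tracepoints that would be recorded for one kwork type. */
static size_t sketch_nr_tracepoints(const struct sketch_kwork_class *cls)
{
	size_t nr = 0;

	while (cls->tracepoints[nr])
		nr++;
	return nr;
}
```

With a table like this, recording everything amounts to walking all entries, while `-k workqueue` selects a single entry.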
||
Yang Jihong
|
e643932190 |
perf kwork: Add softirq kwork record support
Record softirq events irq:softirq_raise, irq:softirq_entry & irq:softirq_exit. Test cases: Record all events: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.897 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit irq:softirq_raise irq:softirq_entry irq:softirq_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Record softirq events: # perf kwork -k softirq record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.141 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:softirq_raise irq:softirq_entry irq:softirq_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Committer testing: # perf kwork record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 3.078 MB perf.data (17433 samples) ] # perf evlist -v irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, 
sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # perf script | head migration/12 73 [012] 25884.940992: irq:softirq_raise: vec=9 [action=RCU] migration/12 73 [012] 25884.940994: irq:softirq_entry: vec=9 [action=RCU] migration/12 73 [012] 25884.940995: irq:softirq_exit: vec=9 [action=RCU] swapper 0 [004] 25884.940995: irq:softirq_raise: vec=9 [action=RCU] swapper 0 [004] 25884.940998: irq:softirq_entry: vec=9 [action=RCU] swapper 0 [004] 25884.940999: irq:softirq_exit: vec=9 [action=RCU] cc1 71212 [021] 25884.941990: irq:softirq_raise: vec=9 [action=RCU] swapper 0 [004] 25884.941991: irq:softirq_raise: vec=9 [action=RCU] cc1 71212 [021] 25884.941992: irq:softirq_raise: vec=7 [action=SCHED] perf-exec 71208 [013] 25884.941992: irq:softirq_raise: vec=9 [action=RCU] # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-4-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
4f8ae962f0 |
perf kwork: Add irq kwork record support
Record interrupt events irq:irq_handler_entry & irq:irq_handler_exit. Test cases: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.556 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-3-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
0f70d8e9db |
perf kwork: New tool to trace time properties of kernel work (such as softirq, and workqueue)
The 'perf kwork' tool is used to trace time properties of kernel work (such as irq, softirq, and workqueue), including runtime, latency, and timehist, using the infrastructure in the perf tools to allow tracing extra targets. This is the first commit; it reuses the 'perf record' framework code to implement a simple record function. No kwork is supported yet. Test cases: # perf usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS] The most commonly used perf commands are: <SNIP> iostat Show I/O performance metrics kallsyms Searches running kernel for symbols kmem Tool to trace/measure kernel memory properties kvm Tool to trace/measure kvm guest os kwork Tool to trace/measure kernel work properties (latencies) list List all symbolic event types lock Analyze lock events mem Profile memory accesses record Run a command and record its profile into perf.data <SNIP> See 'perf help COMMAND' for more information on a specific command. # perf kwork Usage: perf kwork [<options>] {record} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile -v, --verbose be more verbose (show symbol address, etc) # perf kwork record -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.787 MB perf.data ] Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-2-yangjihong1@huawei.com [ Add {} for multiline if blocks ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
ade5353950 |
perf data: Add missing unistd.h header needed for pid_t
Noticed when processing 'perf kwork', which includes util/data.h without having already pulled in unistd.h indirectly; until now it was only by luck that every user of util/data.h got the pid_t typedef that way. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
1ab55323c5 |
perf lock: Support -t option for 'contention' subcommand
Like perf lock report, it can report lock contention stat of each task. $ perf lock contention -t contended total wait max wait avg wait pid comm 5 945.20 us 902.08 us 189.04 us 316167 EventManager_De 33 98.17 us 6.78 us 2.97 us 766063 kworker/0:1-get 7 92.47 us 61.26 us 13.21 us 316170 EventManager_De 14 76.31 us 12.87 us 5.45 us 12949 timedcall 24 76.15 us 12.27 us 3.17 us 767992 sched-pipe 15 75.62 us 11.93 us 5.04 us 15127 switchto-defaul 24 71.84 us 5.59 us 2.99 us 629168 kworker/u513:2- 17 67.41 us 7.94 us 3.96 us 13504 coroner- 1 59.56 us 59.56 us 59.56 us 316165 EventManager_De 14 56.21 us 6.89 us 4.01 us 0 swapper Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
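In the per-task output above, the avg wait column is consistent with total wait divided by the contended count (for pid 316167, 945.20 us / 5 = 189.04 us). A minimal sketch of that per-task aggregation follows. The fixed-size table and all `sketch_*` names are hypothetical and say nothing about how perf lock stores its data.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_TASKS 64

/* Hypothetical per-task contention counters, one row of the -t output. */
struct sketch_task_stat {
	int pid;
	uint64_t contended;	/* number of contention events */
	uint64_t total_wait_ns;
	uint64_t max_wait_ns;
};

struct sketch_task_table {
	struct sketch_task_stat tasks[SKETCH_MAX_TASKS];
	size_t nr;
};

/* Find the entry for pid, creating it if needed; NULL when the table is full. */
static struct sketch_task_stat *sketch_task_get(struct sketch_task_table *t, int pid)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (t->tasks[i].pid == pid)
			return &t->tasks[i];
	}
	if (t->nr == SKETCH_MAX_TASKS)
		return NULL;
	t->tasks[t->nr] = (struct sketch_task_stat){ .pid = pid };
	return &t->tasks[t->nr++];
}

/* Account one wait to the task that suffered it; returns 0 or -1 if full. */
static int sketch_task_add_wait(struct sketch_task_table *t, int pid, uint64_t wait_ns)
{
	struct sketch_task_stat *s = sketch_task_get(t, pid);

	if (!s)
		return -1;
	s->contended++;
	s->total_wait_ns += wait_ns;
	if (wait_ns > s->max_wait_ns)
		s->max_wait_ns = wait_ns;
	return 0;
}

/* Average is derived on demand, so only sums and counts are stored. */
static uint64_t sketch_task_avg_wait_ns(const struct sketch_task_stat *s)
{
	return s->contended ? s->total_wait_ns / s->contended : 0;
}
```

The wait values used to exercise such a table would be invented; only the relationship between the columns is taken from the output shown in the commit message.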
||
Namhyung Kim
|
79079f21f5 |
perf lock: Add -k and -F options to 'contention' subcommand
Like perf lock report, add -k/--key and -F/--field options to control output formatting and sorting. Note that it has slightly different default options as some fields are not available and to optimize the screen space. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
528b9cab3b |
perf lock: Add 'contention' subcommand
The 'perf lock contention' processes the lock contention events and displays the result like perf lock report. Right now, there's not much difference between the two but the lock contention specific features will come soon.

  $ perf lock contention
   contended   total wait     max wait     avg wait         type   caller

         238      1.41 ms     29.20 us      5.94 us     spinlock   update_blocked_averages+0x4c
           1    902.08 us    902.08 us    902.08 us      rwsem:R   do_user_addr_fault+0x1dd
          81    330.30 us     17.24 us      4.08 us     spinlock   _nohz_idle_balance+0x172
           2     89.54 us     61.26 us     44.77 us     spinlock   do_anonymous_page+0x16d
          24     78.36 us     12.27 us      3.27 us        mutex   pipe_read+0x56
           2     71.58 us     59.56 us     35.79 us     spinlock   __handle_mm_fault+0x6aa
           6     25.68 us      6.89 us      4.28 us     spinlock   do_idle+0x28d
           1     18.46 us     18.46 us     18.46 us      rtmutex   exec_fw_cmd+0x21b
           3     15.25 us      6.26 us      5.08 us     spinlock   tick_do_update_jiffies64+0x2c

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
f9c695a211 |
perf lock: Add lock aggregation enum
Introduce the aggr_mode variable to prepare a later code change. The default is LOCK_AGGR_ADDR which aggregates the result for the lock instances. When -t/--threads option is given, it'd be set to LOCK_AGGR_TASK. The LOCK_AGGR_CALLER is for the contention analysis and it'd aggregate the stat by comparing the callstacks. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
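The mode selection described in this message can be sketched as follows. The enum constants and the `aggr_mode` name come from the commit message; the helper `select_aggr_mode()` and its two boolean parameters are hypothetical and are not the code in perf itself.

```c
#include <stdbool.h>

/* Aggregation modes named in the commit message. */
enum aggr_mode {
	LOCK_AGGR_ADDR,   /* default: one entry per lock instance */
	LOCK_AGGR_TASK,   /* -t/--threads: one entry per task */
	LOCK_AGGR_CALLER, /* contention analysis: one entry per callstack */
};

/*
 * Hypothetical helper: pick the mode from two already-parsed options.
 * Assumption: a per-task request takes precedence over the caller view.
 */
enum aggr_mode select_aggr_mode(bool threads, bool contention)
{
	if (threads)
		return LOCK_AGGR_TASK;
	return contention ? LOCK_AGGR_CALLER : LOCK_AGGR_ADDR;
}
```

Keeping the choice in a single enum lets later code switch on one value instead of testing several option flags.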
||
Namhyung Kim
|
fb87158bab |
perf lock: Add flags field in the lock_stat
For lock contention tracepoint analysis, it needs to keep the flags. As nr_readlock and nr_trylock fields are not used for it, let's make it a union. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6923397cb7 |
perf test: Add test for #system_tsc_freq in metrics
The value should be non-zero on Intel while zero on everything else. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1276ade6a5 |
perf tsc: Add cpuinfo fall back for arch_get_tsc_freq()
The CPUID method of arch_get_tsc_freq fails for older Intel processors, such as Skylake. Compute using /proc/cpuinfo. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
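One way to derive a frequency from /proc/cpuinfo is to parse the "model name" value, sketched below. This assumes the model name ends in an "@ N.NNGHz" suffix, as many Intel parts report; the function name `model_name_to_hz()` is hypothetical, and the commit message does not say which cpuinfo field perf actually reads.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract a frequency in Hz from a "model name" value
 * such as "Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz".
 * Returns 0 when the string has no "@ <number>GHz" suffix.
 */
uint64_t model_name_to_hz(const char *model_name)
{
	const char *at = strrchr(model_name, '@');
	char *end;
	double ghz;

	if (!at)
		return 0;

	/* strtod() skips the blank after '@' and stops at the unit. */
	ghz = strtod(at + 1, &end);
	if (end == at + 1 || strncmp(end, "GHz", 3) != 0 || ghz <= 0)
		return 0;

	/* Round to the nearest Hz. */
	return (uint64_t)(ghz * 1e9 + 0.5);
}
```

Returning 0 on any parse failure matches the convention in this series that a zero frequency means "not available".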
||
Kan Liang
|
bc2373a58a |
perf tsc: Add arch TSC frequency information
The TSC frequency information is required for the event metrics with the literal, system_tsc_freq. For the newer Intel platform, the TSC frequency information can be retrieved from the CPUID leaf 0x15. If the TSC frequency information isn't present the /proc/cpuinfo approach is used.

Refactor cpuid() for this use. Note, the previous stack pushing/popping approach was broken on x86-64 that has stack red zones that would be clobbered.

Committer testing:

Before:

  $ perf record sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  $ perf report --header-only |& grep cpuid
  # cpuid : AuthenticAMD,25,33,0
  $

After the patch:

  $ perf record sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ]
  $ perf report --header-only |& grep cpuid
  # cpuid : AuthenticAMD,25,33,0
  $

Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
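The arithmetic behind the CPUID leaf 0x15 method can be sketched as a pure function. Leaf 0x15 reports the TSC/crystal ratio denominator in EAX, the numerator in EBX, and the crystal clock in Hz in ECX. The function name `tsc_freq_from_leaf15()` is hypothetical, and the sketch takes register values as arguments rather than executing CPUID, so it runs on any architecture.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the TSC frequency in Hz from the three
 * CPUID leaf 0x15 outputs:
 *   eax = denominator of the TSC/crystal clock ratio
 *   ebx = numerator of the TSC/crystal clock ratio
 *   ecx = crystal clock frequency in Hz
 * Any zero field means the value is not enumerated, so return 0 and let
 * the caller fall back to another source.
 */
uint64_t tsc_freq_from_leaf15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
	if (!eax || !ebx || !ecx)
		return 0;

	/* Widen before multiplying to avoid 32-bit overflow. */
	return (uint64_t)ecx * ebx / eax;
}
```

For example, a 24 MHz crystal with a ratio of 216/2 gives 24000000 * 216 / 2 = 2592000000 Hz.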
||
Namhyung Kim
|
9fe9b252c7 |
perf lock: Fix a copy-n-paste bug
It should be lock_text_end instead of _start.
Fixes:
|
||
Arnaldo Carvalho de Melo
|
41d0914d86 |
perf python: Ignore unused command line arguments when building with clang
Noticed after switching to python3 by default on some older fedora releases:

  35    38.20 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    error: command 'clang' failed with exit status 1

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
f077c77699 |
perf build: Avoid defining _FORTIFY_SOURCE multiple times
One in perf's CFLAGS and the other in the distro python binding scripts. So use the usual technique of first undefining it with -U_FORTIFY_SOURCE and then defining it with -D. Noticed with: opensuse tumbleweed: gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Thomas Richter
|
87abe344cd |
perf test: Fix test case 83 ('perf stat CSV output linter') on s390
Perf test case 83: perf stat CSV output linter might fail
on s390.
The reason for this is the output of the command
./perf stat -x, -A -a --no-merge true
which depends on a .config file setting. When CONFIG_SCHED_TOPOLOGY
is set, the output of above perf command is
CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized
When CONFIG_SCHED_TOPOLOGY is *NOT* set the output of above perf
command is
0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized
Fix the test case to accept both output formats.
Output before:
# perf test 83
83: perf stat CSV output linter : FAILED!
#
Output after:
# ./perf test 83
83: perf stat CSV output linter : Ok
#
Fixes:
|
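The idea of accepting both output formats can be sketched by counting separators: the per-CPU line carries one more field than the line without the CPU column. This is only an illustration of the check, independent of how the perf test itself is written; `count_separators()` and `csv_line_ok()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count the separators in one line of 'perf stat -x,' style output. */
size_t count_separators(const char *line, char sep)
{
	size_t n = 0;

	for (; *line; line++)
		if (*line == sep)
			n++;
	return n;
}

/*
 * Hypothetical check: accept a line that has either the base number of
 * separators (no CPU column) or one more (leading CPU column present).
 */
bool csv_line_ok(const char *line, size_t base_separators)
{
	size_t n = count_separators(line, ',');

	return n == base_separators || n == base_separators + 1;
}
```

With the two sample lines from this message, the line starting with CPU0 has seven commas and the other has six, so both pass with a base of six.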
||
Jason Wang
|
2c91cd88f5 |
perf cs-etm: Fix duplicated 'the' in comment
The word 'the' is duplicated in the comment; remove one. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220716044040.43123-1-wangborong@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jason Wang
|
c69d33ebfa |
perf probe: Fix duplicated 'the' in comment
The word 'the' is duplicated in the comment; remove one. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Zechuan Chen <chenzechuan1@huawei.com> Link: http://lore.kernel.org/lkml/20220716043957.42829-1-wangborong@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
63a4354ae7 |
perf scripting perl: Ignore some warnings to keep building with perl headers
On gcc 12 we started seeing this:

  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:2999,
                   from util/scripting-engines/trace-event-perl.c:35:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_is_utf8_valid_partial_char_flags':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/handy.h:125:23: error: cast from function call of type 'STRLEN' {aka 'long unsigned int'} to non-matching type '_Bool' [-Werror=bad-function-cast]
    125 | #define cBOOL(cbool) ((bool) (cbool))
        |                       ^
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:2363:12: note: in expansion of macro 'cBOOL'
   2363 |     return cBOOL(is_utf8_char_helper_(s0, e, flags));
        |            ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7242:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_cop_file_avn':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:3489:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
   3489 |     const char *file = CopFILE(cop);
        |     ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7243:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h: In function 'Perl_newSV_type':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h:376:5: error: enumeration value 'SVt_LAST' not handled in switch [-Werror=switch-enum]
    376 |     switch (type) {
        |     ^~~~~~

So disable those warnings to keep building with perl devel headers.

Noticed, among other distros, on opensuse tumbleweed:

  gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ee87a0841a |
perf python: Avoid deprecation warning on distutils
Fix the following DeprecationWarning:

  tools/perf/util/setup.py:31: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives

Note: the setuptools module may need installing, for example:

  $ sudo apt install python-setuptools

Reviewer comments:

James said:

  Tested it with python 2.7 and 3.8 by running "make install-python_ext PYTHON=..."

Committer notes:

Tested with:

  $ make -k BUILD_BPF_SKEL=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin ; perf test python
  $ make -k BUILD_BPF_SKEL=1 O=/tmp/build/perf -C tools/perf install-bin ; perf test python

Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615014206.26651-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
557cc18ee7 |
perf gtk: Only support --gtk if compiled in
If HAVE_GTK2_SUPPORT isn't defined then --gtk can't succeed, so don't support it as a command line option in this case. v2 is a rebase. The patch appears to have been missed in: https://lore.kernel.org/lkml/Ygu40djM1MqAfkcF@kernel.org/ Signed-off-by: Ian Rogers <irogers@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: xaizek <xaizek@posteo.net> Link: https://lore.kernel.org/r/20220707203836.345918-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
2f1d6b41e2 |
perf intel-pt: Add documentation for tracing guest machine user space
Now it is possible to decode a host Intel PT trace including guest machine user space, add documentation for the steps needed to do it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-36-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
98759cca84 |
perf intel-pt: Use guest pid/tid etc in guest samples
When decoding with guest sideband information, for VMX non-root (NR) i.e. guest events, replace the host (hypervisor) pid/tid with guest values, and provide also the new machine_pid and vcpu values. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-35-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
61cd9135d0 |
perf intel-pt: Add machine_pid and vcpu to auxtrace_error
When decoding with guest sideband information, for VMX non-root (NR) i.e. guest errors, replace the host (hypervisor) pid/tid with guest values, and provide also the new machine_pid and vcpu values. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-34-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
71658de4dd |
perf intel-pt: Determine guest thread from guest sideband
Prior to decoding, determine what guest thread, if any, is running. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-33-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7d1f65b504 |
perf intel-pt: Disable sync switch with guest sideband
The sync_switch facility attempts to better synchronize context switches with the Intel PT trace, however it is not designed for guest machine context switches, so disable it when guest sideband is detected. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-32-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
0bb82cf518 |
perf intel-pt: Track guest context switches
Use guest context switch events to keep track of which guest thread is running on a particular guest machine and VCPU. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-31-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
12374a1622 |
perf intel-pt: Add some more logging to intel_pt_walk_next_insn()
To aid debugging, add some more logging to intel_pt_walk_next_insn(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-30-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7c0b20d13f |
perf intel-pt: Remove guest_machine_pid
Remove guest_machine_pid because it is not needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-29-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f9de2f0fd3 |
perf tools: Add perf_event__is_guest()
Add a helper function to determine if an event is a guest event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-28-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f42bbbf2e9 |
perf tools: Handle injected guest kernel mmap event
If a kernel mmap event was recorded inside a guest and injected into a host perf.data file, then it will match a host mmap_name not a guest mmap_name, see machine__set_mmap_name(). So try matching a host mmap_name in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-27-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
eef8e06eeb |
perf machine: Use realloc_array_as_needed() in machine__set_current_tid()
Prepare machine__set_current_tid() for use with guest machines that do not currently have a machine->env->nr_cpus_avail value by making use of realloc_array_as_needed(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-26-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
97406a7e4f |
perf inject: Add support for injecting guest sideband events
Inject events from a perf.data file recorded in a virtual machine into a perf.data file recorded on the host at the same time.

Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are injected. Additionally, the guest kcore_dir is copied as kcore_dir__ appended to the machine PID.

This is non-trivial because:
 o It is not possible to process 2 sessions simultaneously so instead events are first written to a temporary file.
 o To avoid conflict, guest sample IDs are replaced with new unused sample IDs.
 o Guest event's CPU is changed to be the host CPU because it is more useful for reporting and analysis.
 o Sample ID is mapped to machine PID which is recorded with VCPU in the id index. This is important to allow guest events to be related to the guest machine and VCPU.
 o Timestamps must be converted.
 o Events are inserted to obey finished-round ordering.

The anticipated use-case is:
 - start recording sideband events in a guest machine
 - start recording an AUX area trace on the host which can trace also the guest (e.g. Intel PT)
 - run test case on the guest
 - stop recording on the host
 - stop recording on the guest
 - copy the guest perf.data file to the host
 - inject the guest perf.data file sideband events into the host perf.data file using perf inject
 - the resulting perf.data file can now be used

Subsequent patches provide Intel PT support for this.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
10d3470022 |
perf tools: Add reallocarray_as_needed()
Add helper reallocarray_as_needed() to reallocate an array to a larger size and initialize the extra entries to an arbitrary value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
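The behaviour described here, growing an array on demand and filling the new slots with a chosen value, can be sketched for an int array. This is a minimal illustration under assumptions, not the perf helper: the name `grow_int_array()`, its signature, and the "grow to exactly idx + 1" policy are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: make sure (*arr)[idx] exists.
 * If the array is too small it is reallocated to idx + 1 entries and every
 * newly added entry is set to 'init'. Existing entries are preserved.
 * Returns 0 on success, -1 on overflow or allocation failure (in which
 * case *arr and *nr are left untouched).
 */
int grow_int_array(int **arr, size_t *nr, size_t idx, int init)
{
	size_t new_nr, i;
	int *tmp;

	if (idx < *nr)
		return 0;	/* already large enough */

	if (idx >= SIZE_MAX / sizeof(**arr))
		return -1;	/* (idx + 1) * sizeof(int) would overflow */

	new_nr = idx + 1;
	tmp = realloc(*arr, new_nr * sizeof(**arr));
	if (!tmp)
		return -1;

	for (i = *nr; i < new_nr; i++)
		tmp[i] = init;

	*arr = tmp;
	*nr = new_nr;
	return 0;
}
```

Initializing the extra entries to a caller-chosen value is useful when zero is itself a valid entry, for example when -1 marks "no thread recorded yet".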
||
Adrian Hunter
|
a5367ecb53 |
perf tools: Automatically use guest kcore_dir if present
When registering a guest machine using machine_pid from the id index, check perf.data for a matching kcore_dir subdirectory and set the kallsyms file name accordingly. If set, use it to find the machine's kernel symbols and object code (from kcore). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-23-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
65691e9ff0 |
perf tools: Make has_kcore_dir() work also for guest kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can be stored in the perf.data output directory under the subdirectory named kcore_dir. Guest machines will have their files also under subdirectories beginning kcore_dir__ followed by the machine pid. Make has_kcore_dir() return true also if there is a guest machine kcore_dir. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
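Recognizing a guest subdirectory by the naming scheme above can be sketched as follows. Only the `kcore_dir__` prefix followed by the machine pid comes from the commit message; the function `parse_guest_kcore_dir()` is hypothetical and is not how has_kcore_dir() is implemented.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define GUEST_KCORE_PREFIX "kcore_dir__"

/*
 * Hypothetical helper: return true if 'name' is "kcore_dir__<pid>" with a
 * positive decimal pid, and store that pid in *machine_pid.
 */
bool parse_guest_kcore_dir(const char *name, long *machine_pid)
{
	size_t plen = strlen(GUEST_KCORE_PREFIX);
	char *end;
	long pid;

	if (strncmp(name, GUEST_KCORE_PREFIX, plen) != 0)
		return false;

	/* Require a digit right after the prefix (no sign, no blanks). */
	if (!isdigit((unsigned char)name[plen]))
		return false;

	pid = strtol(name + plen, &end, 10);
	if (*end != '\0' || pid <= 0)
		return false;

	*machine_pid = pid;
	return true;
}
```

A directory scan could call this on each entry name and treat the plain name kcore_dir as the host copy.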
||
Adrian Hunter
|
386e0d83d3 |
perf tools: Remove also guest kcore_dir with host kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can be stored in the perf.data output directory under the subdirectory named kcore_dir. Guest machines will have their files also under subdirectories beginning kcore_dir__ followed by the machine pid. Remove these also when removing kcore_dir. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
13a133b255 |
perf script python: intel-pt-events: Add machine_pid and vcpu
Add machine_pid and vcpu to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
6de306b7a5 |
perf script python: Add machine_pid and vcpu
Add machine_pid and vcpu to python sample events and context switch events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7151c1d178 |
perf auxtrace: Add machine_pid and vcpu to auxtrace_error
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing fmt member is used to identify the new format. The new members make it possible to easily differentiate errors from guest machines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
2273e46b98 |
perf dlfilter: Add machine_pid and vcpu
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be used to determine if the values are present, however machine_pid is zero if unused in any case. vcpu should be ignored if machine_pid is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
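A filter that must work with both older and newer perf binaries could use 'size' as sketched below. The struct here is a cut-down, hypothetical stand-in (`struct sample_sketch`), not the real struct perf_dlfilter_sample; only the rules "check size for presence" and "ignore vcpu when machine_pid is zero" come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down layout; NOT the real struct perf_dlfilter_sample. */
struct sample_sketch {
	uint32_t size;		/* number of bytes the producer filled in */
	uint32_t pid;
	/* Members added by a later revision are appended at the end. */
	int32_t machine_pid;
	int32_t vcpu;
};

/* True if the producer's 'size' covers the whole of 'field'. */
#define HAS_FIELD(s, field) \
	((s)->size >= offsetof(struct sample_sketch, field) + sizeof((s)->field))

/*
 * Store the guest VCPU in *vcpu and return true only when both members are
 * present and machine_pid is non-zero (zero means "not a guest sample").
 */
bool sample_guest_vcpu(const struct sample_sketch *s, int *vcpu)
{
	if (!HAS_FIELD(s, machine_pid) || !HAS_FIELD(s, vcpu))
		return false;
	if (!s->machine_pid)
		return false;

	*vcpu = s->vcpu;
	return true;
}
```

Appending members and checking 'size' lets the structure grow without breaking filters built against the older layout.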
||
Adrian Hunter
|
e28fb159f1 |
perf script: Add machine_pid and vcpu
Add fields machine_pid and vcpu. These are displayed only if machine_pid is non-zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
6350490995 |
perf session: Use sample->machine_pid to find guest machine
If machine_pid is set, use it to find the guest machine. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
3461b65da7 |
perf tools: Add machine_pid and vcpu to perf_sample
When parsing a sample with a sample ID, copy machine_pid and vcpu from perf_sample_id to perf_sample. Note, machine_pid will be zero when unused, so only a non-zero value represents a guest machine. vcpu should be ignored if machine_pid is zero. Note also, machine_pid is used with events that have come from injecting a guest perf.data file, however guest events recorded on the host (i.e. using perf kvm) have the (QEMU) hypervisor process pid to identify them - refer machines__find_for_cpumode(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
797efbc523 |
perf tools: Add guest_cpu to hypervisor threads
It is possible to know which guest machine was running at a point in time based on the PID of the currently running host thread. That is, perf identifies guest machines by the PID of the hypervisor. To determine the guest CPU, put it on the hypervisor (QEMU) thread for that VCPU. This is done when processing the id_index which provides the necessary information. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
ff7a78c210 |
perf session: Create guest machines from id_index
Now that id_index has machine_pid, use it to create guest machines. Create the guest machines with an idle thread because guest events for "swapper" will be possible. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
b47bb18661 |
perf tools: Add machine_pid and vcpu to id_index
When injecting events from a guest perf.data file, the events will have separate sample ID numbers. These ID numbers can then be used to determine which machine an event belongs to. To facilitate that, add machine_pid and vcpu to id_index records. For backward compatibility, these are added at the end of the record, and the length of the record is used to determine if they are present or not. Note, this is needed because the events from a guest perf.data file contain the pid/tid of the process running at that time inside the VM, not the pid/tid of the (QEMU) hypervisor thread. So a way is needed to relate guest events back to the guest machine and VCPU, and using sample ID numbers for that is relatively simple and convenient. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
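The length-based compatibility check described in the id_index commit above can be sketched roughly as follows. This is an illustration only, not the perf source: the struct names `id_entry` and `id_entry_ext`, their field layout, and the helpers `old_record_size()` and `record_has_ext()` are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only (not the perf definitions). */
struct id_entry {		/* fields present in the original format */
	uint64_t id;
	uint64_t idx;
	uint64_t cpu;
	uint64_t tid;
};

struct id_entry_ext {		/* fields appended at the end of the record */
	uint64_t machine_pid;
	uint64_t vcpu;
};

/* Size of a record holding nr entries in the original format. */
static size_t old_record_size(size_t hdr_size, uint64_t nr)
{
	return hdr_size + nr * sizeof(struct id_entry);
}

/*
 * An old reader only consumes the first old_record_size() bytes, so data
 * appended after them does not disturb it. A new reader compares the
 * recorded length against the size needed for the extension to decide
 * whether machine_pid and vcpu are present.
 */
static bool record_has_ext(size_t record_size, size_t hdr_size, uint64_t nr)
{
	size_t need = old_record_size(hdr_size, nr) +
		      nr * sizeof(struct id_entry_ext);

	return record_size >= need;
}
```

Appending rather than interleaving the new fields is what lets the record length alone distinguish the two formats.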
Adrian Hunter
|
c1fd5b7d8a |
perf buildid-cache: Do not require purge files to also be in the file system
realname() returns NULL if the file is not in the file system, but we can still remove it from the build ID cache in that case, so continue and attempt the purge with the name provided. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
15fe03621d |
perf buildid-cache: Add guestmount'd files to the build ID cache
When the guestmount option is used, a guest machine's file system mount point is recorded in machine->root_dir. perf already iterates guest machines when adding files to the build ID cache, but does not take machine->root_dir into account. Use machine->root_dir to find files for guest build IDs, and add them to the build ID cache using the "proper" name i.e. relative to the guest root directory not the host root directory. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
57190e38b0 |
perf script: Add --dump-unsorted-raw-trace option
When reviewing the results of perf inject, it is useful to be able to see the events in the order they appear in the file. So add --dump-unsorted-raw-trace option to do an unsorted dump. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
1ee94463e9 |
perf tools: Add perf_event__synthesize_id_sample()
Add perf_event__synthesize_id_sample() to enable the synthesis of ID samples. This is needed by perf inject. When injecting events from a guest perf.data file, there is a possibility that the sample ID numbers conflict. In that case, perf_event__synthesize_id_sample() can be used to re-write the ID sample. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
0a64de04c9 |
perf tools: Factor out evsel__id_hdr_size()
Factor out evsel__id_hdr_size() so it can be reused. This is needed by perf inject. When injecting events from a guest perf.data file, there is a possibility that the sample ID numbers conflict. To re-write an ID sample, the old one needs to be removed first, which means determining how big it is with evsel__id_hdr_size() and then subtracting that from the event size. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
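The size calculation that the evsel__id_hdr_size() commit above relies on can be sketched as below. This is a simplified stand-in, not the perf function: the `SAMPLE_*` macros are local names assumed to mirror the `PERF_SAMPLE_*` bits of the perf_event_open ABI, and `id_hdr_size()` and `event_size_without_id()` are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Local names assumed to mirror the PERF_SAMPLE_* ABI bit values. */
#define SAMPLE_TID		(1ULL << 1)
#define SAMPLE_TIME		(1ULL << 2)
#define SAMPLE_ID		(1ULL << 6)
#define SAMPLE_CPU		(1ULL << 7)
#define SAMPLE_STREAM_ID	(1ULL << 9)
#define SAMPLE_IDENTIFIER	(1ULL << 16)

/* Each selected ID-sample field occupies 8 bytes at the end of the event. */
static size_t id_hdr_size(uint64_t sample_type)
{
	size_t size = 0;

	if (sample_type & SAMPLE_TID)		/* u32 pid + u32 tid */
		size += 2 * sizeof(uint32_t);
	if (sample_type & SAMPLE_TIME)
		size += sizeof(uint64_t);
	if (sample_type & SAMPLE_ID)
		size += sizeof(uint64_t);
	if (sample_type & SAMPLE_STREAM_ID)
		size += sizeof(uint64_t);
	if (sample_type & SAMPLE_CPU)		/* u32 cpu + u32 reserved */
		size += 2 * sizeof(uint32_t);
	if (sample_type & SAMPLE_IDENTIFIER)
		size += sizeof(uint64_t);

	return size;
}

/* Event size once the trailing ID sample has been stripped. */
static size_t event_size_without_id(size_t event_size, uint64_t sample_type)
{
	return event_size - id_hdr_size(sample_type);
}
```

With the old ID sample stripped this way, a re-written one can be appended to the shortened event.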
Adrian Hunter
|
eddc6e3f66 |
perf tools: Export perf_event__process_finished_round()
Export perf_event__process_finished_round() so it can be used elsewhere. This is needed in perf inject to obey finished-round ordering. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f8bcf1e223 |
perf ordered_events: Add ordered_events__last_flush_time()
Allow callers to get the ordered_events last flush timestamp. This is needed in perf inject to obey finished-round ordering when injecting additional events (e.g. from a guest perf.data file) with timestamps. Any additional events that have timestamps before the last flush time must be injected before the corresponding FINISHED_ROUND event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
163dac34d7 |
perf tools: Export dsos__for_each_with_build_id()
Export dsos__for_each_with_build_id() so it can be used elsewhere. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
68566a7cf5 |
perf tools: Fix dso_id inode generation comparison
Synthesized MMAP events have zero ino_generation, so do not compare
them to DSOs with a real ino_generation otherwise we end up with a DSO
without a build id.
Fixes:
|
||
Blake Jones
|
a6bd98c45d |
perf buildid-list: Add a "-m" option to show kernel and modules build-ids
This new option displays all of the information needed to do external BuildID-based symbolization of kernel stack traces, such as those collected by bpf_get_stackid(). For each kernel module plus the main kernel, it displays the BuildID, the start and end virtual addresses of that module's text range (rounded out to page boundaries), and the pathname of the module.

When run as a non-privileged user, the actual addresses of the modules' text ranges are not available, so the tool displays "0, <text length>" for kernel modules and "0, 0xffffffffffffffff" for the kernel itself.

Sample output:

  root# perf buildid-list -m
  cf6df852fd4da122d616153353cc8f560fd12fe0 ffffffffa5400000 ffffffffa6001e27 [kernel.kallsyms]
  1aa7209aa2acb067d66ed6cf7676d65066384d61 ffffffffc0087000 ffffffffc008b000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/sha512_generic.ko
  3857815b5bf0183697b68f8fe0ea06121644041e ffffffffc008c000 ffffffffc0098000 /lib/modules/5.15.15-1rodete2-amd64/kernel/arch/x86/crypto/sha512-ssse3.ko
  4081fde0bca2bc097cb3e9d1efcb836047d485f1 ffffffffc0099000 ffffffffc009f000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/acpi/button.ko
  1ef81ba4890552ea6b0314f9635fc43fc8cef568 ffffffffc00a4000 ffffffffc00aa000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/cryptd.ko
  cc5c985506cb240d7d082b55ed260cbb851f983e ffffffffc00af000 ffffffffc00b6000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/i2c/busses/i2c-piix4.ko
  [...]
Committer notes:

u64 formatter should be PRIx64 for printing as hex numbers, fix this:

  28     5.28 debian:experimental-x-mips    : FAIL gcc version 11.2.0 (Debian 11.2.0-18)
    builtin-buildid-list.c: In function 'buildid__map_cb':
    builtin-buildid-list.c:32:24: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |            ~~~~^                  ~~~~~~~~~~
          |                |                     |
          |                long unsigned int     u64 {aka long long unsigned int}
          |            %16llx
    builtin-buildid-list.c:32:30: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |                  ~~~~^                        ~~~~~~~~
          |                      |                           |
          |                      long unsigned int           u64 {aka long long unsigned int}
          |                  %16llx
    cc1: all warnings being treated as errors

Signed-off-by: Blake Jones <blakejones@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220629213632.3899212-1-blakejones@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
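The committer note above says a u64 should be printed with PRIx64 rather than `%lx`. A minimal sketch of that pattern, using the standard `<inttypes.h>` macro with `uint64_t` as a stand-in for perf's `u64`, is shown below; the helper name `format_range()` is hypothetical and not taken from builtin-buildid-list.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a start/end address pair as 16-column hex values.
 * PRIx64 expands to the correct length modifier for a 64-bit integer
 * on the target ("lx" on some ABIs, "llx" on others), so the same
 * format string builds cleanly on both 32-bit and 64-bit targets.
 */
static int format_range(char *buf, size_t len, uint64_t start, uint64_t end)
{
	return snprintf(buf, len, "%16" PRIx64 " %16" PRIx64, start, end);
}
```

A caller would pass a buffer of at least 34 bytes to hold two 16-digit values, the separating space and the terminating NUL.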
Arnaldo Carvalho de Melo
|
0698461ad2 |
Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.
Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:
|
||
Naveen N. Rao
|
4b335e1e0d |
perf trace: Fix SIGSEGV when processing syscall args
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
  #4  syscall__scnprintf_args <snip> at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:

	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace->syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args for raw_syscalls. However, when processing a perf.data file, we are not initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
deb44a6249 |
perf tests: Fix Convert perf time to TSC test for hybrid
The test does not always correctly determine the number of events for
hybrids, nor allow for more than 1 evsel when parsing.
Fix by iterating the events actually created and getting the correct
evsel for the events processed.
Fixes:
|
||
Adrian Hunter
|
498c7a54f1 |
perf tests: Stop Convert perf time to TSC test opening events twice
Do not call evlist__open() twice.
Fixes:
|