Commit Graph

13767 Commits

Author SHA1 Message Date
Ian Rogers
22de36ff2c perf vendor events: Remove bad ivytown uncore events
The event converter scripts at:

  https://github.com/intel/event-converter-for-linux-perf

passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:

  afd779df99

Fixes: 6220136831 ("perf vendor events: Update Intel ivytown")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:02 -03:00
Ian Rogers
2c98bacfd7 perf vendor events: Remove bad broadwellde uncore events
The event converter scripts at:

  https://github.com/intel/event-converter-for-linux-perf

passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:

  afd779df99

Fixes: ef908a1925 ("perf vendor events: Update Intel broadwellde")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:02 -03:00
Ian Rogers
b4f0466082 perf jevents: Add JEVENTS_ARCH make option
Allow the architecture built into pmu-events.c to be set on the make
command line with JEVENTS_ARCH.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:02 -03:00
Ian Rogers
46acb311c6 perf jevents: Simplify generation of C-string
Previous implementation wanted variable order and '(null)' string output
to match the C implementation. The '(null)' string output was a
quirk/bug and so there is no need to carry it forward.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:02 -03:00
Ian Rogers
e1e19d0545 perf jevents: Clean up pytype warnings
Improve type hints to clean up pytype warnings.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:02 -03:00
Roberto Sassu
dd6775f986 perf build: Remove FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting
As the building mechanism is now able to retry detection with different
combinations of linking flags, setting
FEATURE_CHECK_LDFLAGS-disassembler-four-args and
FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore,
so remove it.

Committer notes:

Use the same technique to find the set of bfd-related libraries to link as in:

  3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features")

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220719170555.2576993-3-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:01 -03:00
Claire Jensen
0c343af2a2 perf test: JSON format checking
Add field checking tests for perf stat JSON output.

Sanity checks the expected number of fields are present, that the
expected keys are present and they have the correct values.

Committer notes:

Had to fix this:

  -               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
  +               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \

Committer testing:

  [root@quaco ~]# perf test json
   90: perf stat JSON output linter                                    : Ok
  [root@quaco ~]# set -o vi
  [root@quaco ~]# perf test -v json
   90: perf stat JSON output linter                                    :
  --- start ---
  test child forked, pid 560794
  Checking json output: no args [Success]
  Checking json output: system wide [Success]
  Checking json output: system wide Checking json output: system wide no aggregation [Success]
  Checking json output: interval [Success]
  Checking json output: event [Success]
  Checking json output: per core [Success]
  Checking json output: per thread [Success]
  Checking json output: per die [Success]
  Checking json output: per node [Success]
  Checking json output: per socket [Success]
  test child finished with 0
  ---- end ----
  perf stat JSON output linter: Ok
  [root@quaco ~]#

Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:44:01 -03:00
Claire Jensen
df936cadfb perf stat: Add JSON output option
CSV output is tricky to format and column layout changes are susceptible
to breaking parsers. New JSON-formatted output has variable names to
identify fields that are consistent and informative, making the output
parseable.

CSV output example:

  1.20,msec,task-clock:u,1204272,100.00,0.697,CPUs utilized
  0,,context-switches:u,1204272,100.00,0.000,/sec
  0,,cpu-migrations:u,1204272,100.00,0.000,/sec
  70,,page-faults:u,1204272,100.00,58.126,K/sec

JSON output example:

  {"counter-value" : "3805.723968", "unit" : "msec", "event" :
  "cpu-clock", "event-runtime" : 3805731510100.00, "pcnt-running"
  : 100.00, "metric-value" : 4.007571, "metric-unit" : "CPUs utilized"}
  {"counter-value" : "6166.000000", "unit" : "", "event" :
  "context-switches", "event-runtime" : 3805723045100.00, "pcnt-running"
  : 100.00, "metric-value" : 1.620191, "metric-unit" : "K/sec"}
  {"counter-value" : "466.000000", "unit" : "", "event" :
  "cpu-migrations", "event-runtime" : 3805727613100.00, "pcnt-running"
  : 100.00, "metric-value" : 122.447136, "metric-unit" : "/sec"}
  {"counter-value" : "208.000000", "unit" : "", "event" :
  "page-faults", "event-runtime" : 3805726799100.00, "pcnt-running"
  : 100.00, "metric-value" : 54.654516, "metric-unit" : "/sec"}

Also added documentation for JSON option.

There is some tidy up of CSV code including a potential memory over run
in the os.nfields set up. To facilitate this an AGGR_MAX value is added.

Committer notes:

Fixed up using PRIu64 to format u64 values, not %lu.

Committer testing:

  ⬢[acme@toolbox perf]$ perf stat -j sleep 1
  {"counter-value" : "0.731750", "unit" : "msec", "event" : "task-clock:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000731, "metric-unit" : "CPUs utilized"}
  {"counter-value" : "0.000000", "unit" : "", "event" : "context-switches:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
  {"counter-value" : "0.000000", "unit" : "", "event" : "cpu-migrations:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 0.000000, "metric-unit" : "/sec"}
  {"counter-value" : "75.000000", "unit" : "", "event" : "page-faults:u", "event-runtime" : 731750, "pcnt-running" : 100.00, "metric-value" : 102.494021, "metric-unit" : "K/sec"}
  {"counter-value" : "578765.000000", "unit" : "", "event" : "cycles:u", "event-runtime" : 379366, "pcnt-running" : 49.00, "metric-value" : 0.790933, "metric-unit" : "GHz"}
  {"counter-value" : "1298.000000", "unit" : "", "event" : "stalled-cycles-frontend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.224271, "metric-unit" : "frontend cycles idle"}
  {"counter-value" : "21984.000000", "unit" : "", "event" : "stalled-cycles-backend:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 3.798433, "metric-unit" : "backend cycles idle"}
  {"counter-value" : "468197.000000", "unit" : "", "event" : "instructions:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 0.808959, "metric-unit" : "insn per cycle"}
  {"metric-value" : 0.046955, "metric-unit" : "stalled cycles per insn"}
  {"counter-value" : "103335.000000", "unit" : "", "event" : "branches:u", "event-runtime" : 768020, "pcnt-running" : 100.00, "metric-value" : 141.216262, "metric-unit" : "M/sec"}
  {"counter-value" : "2381.000000", "unit" : "", "event" : "branch-misses:u", "event-runtime" : 388654, "pcnt-running" : 50.00, "metric-value" : 2.304156, "metric-unit" : "of all branches"}
  ⬢[acme@toolbox perf]$

Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-10 10:43:29 -03:00
Adrián Herrera Arcila
bb8bc52e75 perf stat: Refactor __run_perf_stat() common code
This extracts common code from the branches of the forks if-then-else.

enable_counters(), which was at the beginning of both branches of the
conditional, is now unconditional; evlist__start_workload() is extracted
to a different if, which enables making the common clocking code
unconditional.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220729161244.10522-1-adrian.herrera@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-03 13:02:50 -03:00
Namhyung Kim
6d499a6b3d perf lock: Print the number of lost entries for BPF
Like the normal 'perf lock contention' output, it'd print the number of
lost entries for BPF if exists or -v option is passed.

Currently it uses BROKEN_CONTENDED stat for the lost count (due to full
stack maps).

  $ sudo perf lock con -a -b --map-nr-entries 128 sleep 5
  ...
  === output for debug===

  bad: 43, total: 14903
  bad rate: 0.29 %
  histogram of events caused bad sequence
      acquire: 0
     acquired: 0
    contended: 43
      release: 0

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 18:03:31 -03:00
Namhyung Kim
ceb13bfc01 perf lock: Add --map-nr-entries option
The --map-nr-entries option is to control number of max entries in the
perf lock contention BPF maps.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 18:02:59 -03:00
Namhyung Kim
447ec4e5fa perf lock: Introduce struct lock_contention
The lock_contention struct is to carry related fields together and to
minimize the change when we add new config options.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 18:02:10 -03:00
Arnaldo Carvalho de Melo
4ee3c4da8b perf scripting python: Do not build fail on deprecation warnings
First noticed with fedora:rawhide:

  48    11.10 fedora:rawhide                : FAIL gcc version 12.1.1 20220628 (Red Hat 12.1.1-3) (GCC)
    util/scripting-engines/trace-event-python.c: In function 'python_start_script':
    util/scripting-engines/trace-event-python.c:1899:9: error: 'PySys_SetArgv' is deprecated [-Werror=deprecated-declarations]
     1899 |         PySys_SetArgv(argc + 1, command_line);

No time now to address this warning, so don't make it an error, in time
we should either add yet more ifdefs to continue supporting older
systems or just convert to whatever new infra python put in place for
argv processing, sigh.

Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 16:32:28 -03:00
Arnaldo Carvalho de Melo
91cea6be90 genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
When genelf was introduced it tested for HAVE_LIBCRYPTO not
HAVE_LIBCRYPTO_SUPPORT, which is the define the feature test for openssl
defines, fix it.

This also adds disables the deprecation warning, someone has to fix this
to build with openssl 3.0 before the warning becomes a hard error.

Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Reported-by: 谭梓煊 <tanzixuan.me@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/YulpPqXSOG0Q4J1o@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 16:32:28 -03:00
Ian Rogers
9b7c7728f4 perf parse-events: Break out tracepoint and printing
Move print_*_events functions out of parse-events.c into a new
print-events.c. Move tracepoint code into tracepoint.c or
trace-event-info.c (sole user). This reduces the dependencies of
parse-events.c and makes it more amenable to being a library in the
future.

Remove some unnecessary definitions from parse-events.h. Fix a
checkpatch.pl warning on using unsigned rather than unsigned int.  Fix
some line length warnings too.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com
[ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 16:32:26 -03:00
Ian Rogers
32f457abb8 perf parse-events: Don't #define YY_EXTRA_TYPE
Adding a #define to side-effect a local include isn't clean, for
example, it inhibits header precompilation. YY_EXTRA_TYPE is
defined to be void* by default, so just remove.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 15:06:39 -03:00
Andres Freund
83aa012048 tools perf: Fix compilation error with new binutils
binutils changed the signature of init_disassemble_info(), which now causes
compilation failures for tools/perf/util/annotate.c, e.g. on debian
unstable.

Relevant binutils commit:

  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07

Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.

I verified that perf can still disassemble bpf programs by using bpftrace
under load, recording a perf trace, and then annotating the bpf "function"
with and without the changes. With old binutils there's no change in output
before/after this patch. When comparing the output from old binutils (2.35)
to new bintuils with the patch (upstream snapshot) there are a few output
differences, but they are unrelated to this patch. An example hunk is:

       1.15 :   55:mov    %rbp,%rdx
       0.00 :   58:add    $0xfffffffffffffff8,%rdx
       0.00 :   5c:xor    %ecx,%ecx
  -    1.03 :   5e:callq  0xffffffffe12aca3c
  +    1.03 :   5e:call   0xffffffffe12aca3c
       0.00 :   63:xor    %eax,%eax
  -    2.18 :   65:leaveq
  -    2.82 :   66:retq
  +    2.18 :   65:leave
  +    2.82 :   66:ret

Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-5-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 15:30:22 -03:00
Namhyung Kim
00b3262598 perf test: Add ARM SPE system wide test
In the past it had a problem not setting the pid/tid on the sample
correctly when system-wide mode is used.  Although it's fixed now it'd
be nice if we have a test case for it.

Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220701230932.1000495-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 14:46:30 -03:00
Jiri Olsa
5f4e821c6c perf tools: Rework prologue generation code
Some functions we use for bpf prologue generation are going to be
deprecated. This change reworks current code not to use them.

We need to replace following functions/struct:
   bpf_program__set_prep
   bpf_program__nth_fd
   struct bpf_prog_prep_result

Currently we use bpf_program__set_prep to hook perf callback before
program is loaded and provide new instructions with the prologue.

We replace this function/ality by taking instructions for specific
program, attaching prologue to them and load such new ebpf programs
with prologue using separate bpf_prog_load calls (outside libbpf
load machinery).

Before we can take and use program instructions, we need libbpf to
actually load it. This way we get the final shape of its instructions
with all relocations and verifier adjustments).

There's one glitch though.. perf kprobe program already assumes
generated prologue code with proper values in argument registers,
so loading such program directly will fail in the verifier.

That's where the fallback pre-load handler fits in and prepends
the initialization code to the program. Once such program is loaded
we take its instructions, cut off the initialization code and prepend
the prologue.

I know.. sorry ;-)

To have access to the program when loading this patch adds support to
register 'fallback' section handler to take care of perf kprobe programs.
The fallback means that it handles any section definition besides the
ones that libbpf handles.

The handler serves two purposes:
  - allows perf programs to have special arguments in section name
  - allows perf to use pre-load callback where we can attach init
    code (zeroing all argument registers) to each perf program

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220616202214.70359-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 14:45:56 -03:00
Jiri Olsa
8b1e1a0347 perf bpf: Convert legacy map definition to BTF-defined
The libbpf is switching off support for legacy map definitions [1],
which will break the perf llvm tests.

Moving the base source map definition to BTF-defined, so we need
to use -g compile option for to add debug/BTF info.

[1] https://lore.kernel.org/bpf/20220627211527.2245459-1-andrii@kernel.org/

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220704152721.352046-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 14:43:13 -03:00
Ian Rogers
6d518ac7be perf symbol: Fail to read phdr workaround
The perf jvmti agent doesn't create program headers, in this case
fallback on section headers as happened previously.

Committer notes:

To test this, from a public post by Ian:

1) download a Java workload dacapo-9.12-MR1-bach.jar from
https://sourceforge.net/projects/dacapobench/

2) build perf such as "make -C tools/perf O=/tmp/perf NO_LIBBFD=1" it
should detect Java and create /tmp/perf/libperf-jvmti.so

3) run perf with the jvmti agent:

  perf record -k 1 java -agentpath:/tmp/perf/libperf-jvmti.so -jar dacapo-9.12-MR1-bach.jar -n 10 fop

4) run perf inject:

  perf inject -i perf.data -o perf-injected.data -j

5) run perf report

  perf report -i perf-injected.data | grep org.apache.fop

With this patch reverted I see lots of symbols like:

     0.00%  java             jitted-388040-4656.so  [.] org.apache.fop.fo.FObj.bind(org.apache.fop.fo.PropertyList)

With the patch (2d86612aac ("perf symbol: Correct address for bss
symbols")) I see lots of:

  dso__load_sym_internal: failed to find program header for symbol:
  Lorg/apache/fop/fo/FObj;bind(Lorg/apache/fop/fo/PropertyList;)V
  st_value: 0x40

Fixes: 2d86612aac ("perf symbol: Correct address for bss symbols")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220731164923.691193-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:30:36 -03:00
Namhyung Kim
6fda2405f4 perf lock: Implement cpu and task filters for BPF
Add -a/--all-cpus and -C/--cpu options for cpu filtering.  Also -p/--pid
and --tid options are added for task filtering.  The short -t option is
taken for --threads already.  Tracking the command line workload is
possible as well.

  $ sudo perf lock contention -a -b sleep 1

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:28:51 -03:00
Namhyung Kim
407b36f69e perf lock: Use BPF for lock contention analysis
Add -b/--use-bpf option to use BPF to collect lock contention stats.
For simplicity it now runs system-wide and requires C-c to stop.
Upcoming changes will add the usual filtering.

  $ sudo perf lock con -b
  ^C
   contended   total wait     max wait     avg wait         type   caller

          42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
          23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
           6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
           3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
           1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
           1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
           2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
           1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
           2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
           1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:28:38 -03:00
Namhyung Kim
77d54a2cd6 perf lock: Pass machine pointer to is_lock_function()
This is a preparation for later change to expose the function externally
so that it can be used without the implicit session data.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:28:24 -03:00
Ian Rogers
9bd7021809 perf test: Add user space counter reading tests
These tests are based on test_stat_user_read in
tools/lib/perf/tests/test-evsel.c.

The tests are modified to skip if perf_event_open fails or rdpmc isn't
supported.

Committer testing:

  ⬢[acme@toolbox perf]$ perf test "mmap interface"
    4: mmap interface tests                         :
    4.1: Read samples using the mmap interface      : Skip (permissions)
  ⬢[acme@toolbox perf]$

  [root@five ~]# perf test "mmap interface"
    4: mmap interface tests                         :
    4.1: Read samples using the mmap interface      : Ok
  [root@five ~]#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220719223946.176299-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:20:18 -03:00
Ian Rogers
481fadfb10 perf test: Remove x86 rdpmc test
This test has been superseded by test_stat_user_read in:

  tools/lib/perf/tests/test-evsel.c

The updated test doesn't divide-by-0 when running time of a counter is
0. It also supports ARM64.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220719223946.176299-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:18:12 -03:00
Arnaldo Carvalho de Melo
18808564aa Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up the fixes that went upstream via acme/perf/urgent and to get
to v5.19.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 08:59:31 -03:00
Zhengjun Xing
9a0b36266f perf stat: Add topdown metrics in the default perf stat on the hybrid machine
Topdown metrics are missed in the default perf stat on the hybrid machine,
add Topdown metrics in default perf stat for hybrid systems.

Currently, we support the perf metrics Topdown for the p-core PMU in the
perf stat default, the perf metrics Topdown support for e-core PMU will be
implemented later separately. Refactor the code adds two x86 specific
functions. Widen the size of the event name column by 7 chars, so that all
metrics after the "#" become aligned again.

The perf metrics topdown feature is supported on the cpu_core of ADL. The
dedicated perf metrics counter and the fixed counter 3 are used for the
topdown events. Adding the topdown metrics doesn't trigger multiplexing.

Before:

 # ./perf  stat  -a true

 Performance counter stats for 'system wide':

             53.70 msec cpu-clock                 #   25.736 CPUs utilized
                80      context-switches          #    1.490 K/sec
                24      cpu-migrations            #  446.951 /sec
                52      page-faults               #  968.394 /sec
         2,788,555      cpu_core/cycles/          #   51.931 M/sec
           851,129      cpu_atom/cycles/          #   15.851 M/sec
         2,974,030      cpu_core/instructions/    #   55.385 M/sec
           416,919      cpu_atom/instructions/    #    7.764 M/sec
           586,136      cpu_core/branches/        #   10.916 M/sec
            79,872      cpu_atom/branches/        #    1.487 M/sec
            14,220      cpu_core/branch-misses/   #  264.819 K/sec
             7,691      cpu_atom/branch-misses/   #  143.229 K/sec

       0.002086438 seconds time elapsed

After:

 # ./perf stat  -a true

 Performance counter stats for 'system wide':

             61.39 msec cpu-clock                        #   24.874 CPUs utilized
                76      context-switches                 #    1.238 K/sec
                24      cpu-migrations                   #  390.968 /sec
                52      page-faults                      #  847.097 /sec
         2,753,695      cpu_core/cycles/                 #   44.859 M/sec
           903,899      cpu_atom/cycles/                 #   14.725 M/sec
         2,927,529      cpu_core/instructions/           #   47.690 M/sec
           428,498      cpu_atom/instructions/           #    6.980 M/sec
           581,299      cpu_core/branches/               #    9.470 M/sec
            83,409      cpu_atom/branches/               #    1.359 M/sec
            13,641      cpu_core/branch-misses/          #  222.216 K/sec
             8,008      cpu_atom/branch-misses/          #  130.453 K/sec
        14,761,308      cpu_core/slots/                  #  240.466 M/sec
         3,288,625      cpu_core/topdown-retiring/       #     22.3% retiring
         1,323,323      cpu_core/topdown-bad-spec/       #      9.0% bad speculation
         5,477,470      cpu_core/topdown-fe-bound/       #     37.1% frontend bound
         4,679,199      cpu_core/topdown-be-bound/       #     31.7% backend bound
           646,194      cpu_core/topdown-heavy-ops/      #      4.4% heavy operations       #     17.9% light operations
         1,244,999      cpu_core/topdown-br-mispredict/  #      8.4% branch mispredict      #      0.5% machine clears
         3,891,800      cpu_core/topdown-fetch-lat/      #     26.4% fetch latency          #     10.7% fetch bandwidth
         1,879,034      cpu_core/topdown-mem-bound/      #     12.7% memory bound           #     19.0% Core bound

       0.002467839 seconds time elapsed

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-6-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:43:34 -03:00
Kan Liang
cdb204ad42 perf x86 evlist: Add default hybrid events for perf stat
Provide a new solution to replace the reverted commit ac2dc29edd
("perf stat: Add default hybrid events")

For the default software attrs, nothing is changed.

For the default hardware attrs, create a new evsel for each hybrid pmu.

With the new solution, adding a new default attr will not require the
special support for the hybrid platform anymore.

Also, the "--detailed" is supported on the hybrid platform

With the patch,

  $ perf stat -a -ddd sleep 1

   Performance counter stats for 'system wide':

         32,231.06 msec cpu-clock                 #   32.056 CPUs utilized
               529      context-switches          #   16.413 /sec
                32      cpu-migrations            #    0.993 /sec
                69      page-faults               #    2.141 /sec
       176,754,151      cpu_core/cycles/          #    5.484 M/sec          (41.65%)
       161,695,280      cpu_atom/cycles/          #    5.017 M/sec          (49.92%)
        48,595,992      cpu_core/instructions/    #    1.508 M/sec          (49.98%)
        32,363,337      cpu_atom/instructions/    #    1.004 M/sec          (58.26%)
        10,088,639      cpu_core/branches/        #  313.010 K/sec          (58.31%)
         6,390,582      cpu_atom/branches/        #  198.274 K/sec          (58.26%)
           846,201      cpu_core/branch-misses/   #   26.254 K/sec          (66.65%)
           676,477      cpu_atom/branch-misses/   #   20.988 K/sec          (58.27%)
        14,290,070      cpu_core/L1-dcache-loads/ #  443.363 K/sec          (66.66%)
         9,983,532      cpu_atom/L1-dcache-loads/ #  309.749 K/sec          (58.27%)
           740,725      cpu_core/L1-dcache-load-misses/ #   22.982 K/sec    (66.66%)
   <not supported>      cpu_atom/L1-dcache-load-misses/
           480,441      cpu_core/LLC-loads/       #   14.906 K/sec          (66.67%)
           326,570      cpu_atom/LLC-loads/       #   10.132 K/sec          (58.27%)
               329      cpu_core/LLC-load-misses/ #   10.208 /sec           (66.68%)
                 0      cpu_atom/LLC-load-misses/ #    0.000 /sec           (58.32%)
   <not supported>      cpu_core/L1-icache-loads/
        21,982,491      cpu_atom/L1-icache-loads/ #  682.028 K/sec          (58.43%)
         4,493,189      cpu_core/L1-icache-load-misses/ #  139.406 K/sec    (33.34%)
         4,711,404      cpu_atom/L1-icache-load-misses/ #  146.176 K/sec    (50.08%)
        13,713,090      cpu_core/dTLB-loads/      #  425.462 K/sec          (33.34%)
         9,384,727      cpu_atom/dTLB-loads/      #  291.170 K/sec          (50.08%)
           157,387      cpu_core/dTLB-load-misses/ #    4.883 K/sec         (33.33%)
           108,328      cpu_atom/dTLB-load-misses/ #    3.361 K/sec         (50.08%)
   <not supported>      cpu_core/iTLB-loads/
   <not supported>      cpu_atom/iTLB-loads/
            37,655      cpu_core/iTLB-load-misses/ #    1.168 K/sec         (33.32%)
            61,661      cpu_atom/iTLB-load-misses/ #    1.913 K/sec         (50.03%)
   <not supported>      cpu_core/L1-dcache-prefetches/
   <not supported>      cpu_atom/L1-dcache-prefetches/
   <not supported>      cpu_core/L1-dcache-prefetch-misses/
   <not supported>      cpu_atom/L1-dcache-prefetch-misses/

         1.005466919 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-5-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:42:35 -03:00
Kan Liang
a9c1ecdabc perf evlist: Always use arch_evlist__add_default_attrs()
Current perf stat uses the evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs() to add
the Arch specific default attrs, e.g., Topdown for x86.

It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard code generic default attrs don't work.

Uses arch_evlist__add_default_attrs() to replace the
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.

Add default_null_attrs[] to indicate the arch specific attrs.
No functional change for the arch specific default attrs either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-4-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:41:59 -03:00
Kan Liang
ff4207f793 perf evsel: Add arch_evsel__hw_name()
The commit 55bcf6ef31 ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, current evsel__hw_name doesn't take the PMU
type into account. It mistakenly returns the "unknown-hardware" for the
hardware event with a specific PMU type.

Add an arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.

Currently, the extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE is only
supported by X86. Only implement the specific arch_evsel__hw_name() for
X86 in the patch.

Nothing is changed for the other archs.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-3-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:41:19 -03:00
Kan Liang
ace3e31e65 perf stat: Revert "perf stat: Add default hybrid events"
This reverts commit Fixes: ac2dc29edd ("perf stat: Add default
hybrid events")

Between this patch and the reverted patch, the commit 6c1912898e
("perf parse-events: Rename parse_events_error functions") and the
commit 07eafd4e05 ("perf parse-event: Add init and exit to
parse_event_error") clean up the parse_events_error_*() codes. The
related change is also reverted.

The reverted patch is hard to be extended to support new default events,
e.g., Topdown events, and the existing "--detailed" option on a hybrid
platform.

A new solution will be proposed in the following patch to enable the
perf stat default on a hybrid platform.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-2-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:39:51 -03:00
Thomas Richter
fb5962f81e perf test: Fix test case 95 ("Check branch stack sampling") on s390 and use same event
On linux-next tree 'perf test 95' ("Check branch stack sampling") was
added recently.

s390 does not support branch sampling at all and the test case fails
despite for checking branch support before hand.

The check for support of branching uses the software event named "dummy",
as seen in the line:

  perf record -b -o- -e dummy -B true > /dev/null 2>&1 || exit 2

However when the branch recording is actually done, a different event is
used, as seen in the line:

  perf record -o $TMPDIR/... --branch-filter any,save_type,u -- ...

The event is omitted and for "perf record" the default event is cycles,
which is not supported by s390 and this fails when executed on s390:

  # perf record --branch-filter any,save_type,u -- /tmp/__perf_test.program.iDSmQ/a.out
  Error:
  cycles: PMU Hardware or event type doesn't support branch stack sampling.
  #

Therefore fix this and use the same event cycles for testing support
and actually running the test.

Output before:

  # ./perf test -Fv 95
  95: Check branch stack sampling                                     :
  --- start ---
  Testing user branch stack sampling
  ---- end ----
  Check branch stack sampling: FAILED!
  #

Output after:

  # ./perf test -Fv 95
  95: Check branch stack sampling                                     :
  --- start ---
  ---- end ----
  Check branch stack sampling: Skip
  #

Fixes: b55878c90a ("perf test: Add test for branch stack sampling")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: German Gomez <german.gomez@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220727141439.712582-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 10:31:06 -03:00
Nick Forrington
08c1d7a159 perf vendor events arm64: Arm Cortex-A78C and X1C
Add PMU events for the Arm Cortex-A78C and Arm Cortex-X1C CPUs.

Events for Arm Cortex-A78C match those for Arm Cortex-A78.
Events for Arm Cortex-X1C match those for Arm Cortex- X1.

As such, this is just a mapfile change.

Main ID Register (MIDR) and event data is sourced from the corresponding
Arm Technical Reference Manuals:

Arm Cortex-A78C:

  https://developer.arm.com/documentation/102226/

Arm Cortex-X1C:

  https://developer.arm.com/documentation/101968/

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220610174459.615995-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:13:01 -03:00
Ian Rogers
ebcdbf7a6a perf vendor events: Update Intel snowridgex
Update to v1.20, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the snowridgex files into perf and update mapfile.csv.

Tested on a non-snowridgex with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

This change just updates the version number.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-31-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:11:29 -03:00
Ian Rogers
6b47be608b perf vendor events: Update Intel westmereex
Update to v3, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually
copy the westmereex files into perf and update mapfile.csv. This
change just aligns whitespace and updates the version number.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-30-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:11:22 -03:00
Ian Rogers
4823edd648 perf vendor events: Update Intel westmereep-sp
Update to v3, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually
copy the westmereep-sp files into perf and update
mapfile.csv. This change just aligns whitespace and updates the
version number.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-29-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:11:16 -03:00
Ian Rogers
ae2fa1ccf1 perf vendor events: Update Intel westmereep-dp
Update to v2, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually
copy the westmereep-dp files into perf and update
mapfile.csv. This change just aligns whitespace.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-28-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:11:09 -03:00
Ian Rogers
5e1dd4f24a perf vendor events: Update Intel tigerlake
Update to v1.07, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the tigerlake files into perf and update mapfile.csv.

Tested on a non-tigerlake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-27-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:10:59 -03:00
Ian Rogers
59fd7d3225 perf vendor events: Update Intel skylakex
Update to v1.28, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the skylakex files into perf and update mapfile.csv.

Tested with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 90: perf all metricgroups test                                      : Ok
 91: perf all metrics test                                           : Skip
 93: perf all PMU test                                               : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-26-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:10:51 -03:00
Ian Rogers
35d6527701 perf vendor events: Update Intel skylake
Update to v53, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the skylake files into perf and update mapfile.csv.

Tested on a non-skylake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:10:41 -03:00
Ian Rogers
89072caf14 perf vendor events: Update Intel silvermont
Update to v14, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually
copy the silvermont files into perf and update mapfile.csv. Other
than aligning whitespace this change just folds the mapfile.csv
entries for silvertmont onto one line.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:10:27 -03:00
Ian Rogers
34122105f9 perf vendor events: Update Intel sapphirerapids
Update to v1.04, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the sapphirerapids files into perf and update mapfile.csv.

Tested on a non-sapphirerapids with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:10:08 -03:00
Ian Rogers
777e131244 perf vendor events: Update Intel sandybridge
Update to v17, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the sandybridge files into perf and update mapfile.csv.

Tested on a non-sandybridge with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:09:58 -03:00
Ian Rogers
8fe33fd5d3 perf vendor events: Update Intel nehalemex
Update to v3, there are no TMA metrics for nehalemex.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the nehalemex files into perf and update mapfile.csv.

Tested on a non-nehalemex with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Note: most of this change is just sorting the keys in the json dictionaries.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:09:51 -03:00
Ian Rogers
bcc344a3bf perf vendor events: Update Intel nehalemep
Update to v3, the are no TMA metrics for nehalemep.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the nehalemep files into perf and update mapfile.csv.

Tested on a non-nehalemep with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:09:31 -03:00
Ian Rogers
1ab4ef06fa perf vendor events: Add Intel meteorlake
Events are v1.00, there are no metrics yet.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the events and metrics. Manually copy
the meteorlake files into perf and update mapfile.csv.

Tested on a non-meteorlake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:09:17 -03:00
Ian Rogers
ae7bcd600e perf vendor events: Update Intel knightslanding
Update to v9, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the knightslanding files into perf and update mapfile.csv.

Tested on a non-knightslanding with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Note: uncore-memory has become uncore-other as the topic was
determined this way in the conversion scripts. For simplicity the
scripts naming is maintained.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:09:09 -03:00
Ian Rogers
376d8b581b perf vendor events: Update Intel jaketown
Update to v21, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the jaketown files into perf and update mapfile.csv.

Tested on a non-jaketown with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:08:33 -03:00
Ian Rogers
6220136831 perf vendor events: Update Intel ivytown
Update to v21, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the ivytown files into perf and update mapfile.csv.

Tested on a non-ivytown with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:08:27 -03:00
Ian Rogers
80c14459f6 perf vendor events: Update Intel ivybridge
Update to v22, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the ivybridge files into perf and update mapfile.csv.

Tested on a non-ivybridge with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:08:09 -03:00
Ian Rogers
d214d0c261 perf vendor events: Update Intel icelakex
Update to v1.15, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the icelakex files into perf and update mapfile.csv.

Tested with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 90: perf all metricgroups test                                      : Ok
 91: perf all metrics test                                           : Skip
 93: perf all PMU test                                               : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:08:00 -03:00
Ian Rogers
a4a4353ebf perf vendor events: Update Intel icelake
Update to v1.14, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the icelake files into perf and update mapfile.csv.

Tested on a non-icelake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:50 -03:00
Ian Rogers
859fe0f4f2 perf vendor events: Update Intel haswellx
Update to v25, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the haswellx files into perf and update mapfile.csv.

Tested with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 90: perf all metricgroups test                                      : Ok
 91: perf all metrics test                                           : Failed
 93: perf all PMU test                                               : Ok

The test 91 failure is a pre-existing failure on the test system
with the metric Load_Miss_Real_Latency which is fixed by
prefixing it with --metric-no-group.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:40 -03:00
Ian Rogers
8e6389f931 perf vendor events: Update Intel haswell
Update to v31, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the haswell files into perf and update mapfile.csv.

Tested on a non-haswell with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:34 -03:00
Ian Rogers
ae54f70dd9 perf vendor events: Update goldmontplus mapfile.csv
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

Correct the version in mapfile.csv.
Event json remains at v1.01, there are no goldmontplus metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:27 -03:00
Ian Rogers
beb2db9bed perf vendor events: Update goldmont mapfile.csv
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

Modify mapfile.csv to have a missing goldmont cpuid.
Event json remains at v13, there are no goldmont metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:20 -03:00
Ian Rogers
3c9c315711 perf vendor events: Update Intel elkhartlake
Update to v1.03. Elkhartlake metrics aren't in TMA but basic metrics are
left unchanged.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the elkhartlake files into perf and update mapfile.csv.

Tested on a non-elkhartlake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:07:00 -03:00
Ian Rogers
f9d45862ec perf vendor events: Update Intel cascadelakex
Update to v1.16, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the cascadelakex files into perf and update mapfile.csv.

Tested with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 90: perf all metricgroups test                                      : Ok
 91: perf all metrics test                                           : Skip
 93: perf all PMU test                                               : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:06:29 -03:00
Ian Rogers
9709ede1a1 perf vendor events: Update bonnell mapfile.csv
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

Fold the mapfile.csv entries together with a more complex regular
expression. This will reduce the pmu-events.c table size.
The files following this change are still at v4.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:06:17 -03:00
Ian Rogers
a95ab294a5 perf vendor events: Update Intel alderlake
Update to v1.13, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the alderlake files into perf and update mapfile.csv.

Tested on a non-alderlake with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:06:06 -03:00
Ian Rogers
ef908a1925 perf vendor events: Update Intel broadwellde
Update to v7, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the broadwellde files into perf and update mapfile.csv.

Tested on a non-broadwellde with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:05:48 -03:00
Ian Rogers
1775634ea4 perf vendor events: Update Intel broadwell
Update to v26, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the broadwell files into perf and update mapfile.csv.

Tested on a non-broadwell with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:05:22 -03:00
Ian Rogers
4266081e33 perf vendor events: Update Intel broadwellx
Update to v19, the metrics are based on TMA 4.4 full.

Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py

to download and generate the latest events and metrics. Manually copy
the broadwellx files into perf and update mapfile.csv.

Tested with 'perf test':
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 90: perf all metricgroups test                                      : Ok
 91: perf all metrics test                                           : Skip
 93: perf all PMU test                                               : Ok

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-28 16:04:33 -03:00
Ian Rogers
9a24180567 perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.

Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.

Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.

Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:19:39 -03:00
Leo Yan
882528d2e7 perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed.  E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.

Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.

This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chang Rui <changruinj@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
2d86612aac perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols.  This is a common
issue on both x86 and Arm64 platforms.

Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:

  ...

  Disassembly of section .bss:

  0000000000004040 <completed.0>:
  	...

  0000000000004080 <buf1>:
  	...

  00000000000040c0 <buf2>:
  	...

  0000000000004100 <thread>:
  	...

First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.

  # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf2 0x30a8-0x30e8
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf1 0x3068-0x30a8
    ...

The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section.  The perf tool uses below formula to convert
a symbol's memory address to a file address:

  file_address = st_value - sh_addr + sh_offset
                    ^
                    ` Memory address

We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.

The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.

Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info.  A better choice for converting
memory address to file address is using the formula:

  file_address = st_value - p_vaddr + p_offset

This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.

After applying the change:

  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf2 0x30c0-0x3100
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf1 0x3080-0x30c0
    ...

Fixes: f17e04afaf ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
b226521923 perf scripts python: Let script to be python2 compliant
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.

To fix the building failure, this patch changes from the python f-string
format to traditional string format.

Fixes: 12fdd6c009 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi <toracat@elrepo.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: ElRepo <contact@elrepo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Ian Rogers
a061a8ad3f perf test: Avoid sysfs state affecting fake events
Icelake has a slots event, on my Skylakex I have CPU events in sysfs of
topdown-slots-issued and topdown-total-slots.

Legacy event parsing would try to use '-' to separate parts of an event
and so perf_pmu__parse_init sets 'slots' to be a
PMU_EVENT_SYMBOL_SUFFIX2.

As such parsing the slots event for a fake PMU fails as a
PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token.

Resolve this issue by test initializing the PMU parsing state before
every parse. This must be done every parse as the state is removes after
each parse_events.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
bedd17381b perf vendor events intel: Update event list for haswellx
Update JSON core/uncore events for haswellx to perf.

Based on HSX JSON list v24:

https://download.01.org/perfmon/HSX

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
28738de918 perf vendor events intel: Update event list for broadwellx
Update JSON core/uncore events for broadwellx to perf.

Based on BDX JSON list v19:

https://download.01.org/perfmon/BDX

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:55 -03:00
Zhengjun Xing
b43a5442d8 perf vendor events intel: Update event list for Snowridgex
More uncore events are added in the converter tool:

https://github.com/intel/event-converter-for-linux-perf

Keep both alias and the original name for the events, in case someone
already used the alias in their script.

Generate the perf events based on Snowridgex(SNR) event list v1.20:

https://download.01.org/perfmon/SNR/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
9146af4413 perf vendor events intel: Rename tremontx to snowridgex
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
6a92916de5 perf vendor events intel: Update event list for Sapphirerapids
Update JSON event list for Sapphirerapids to perf.

Based on JSON list v1.02:

https://download.01.org/perfmon/SPR/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Zhengjun Xing
5fa2481cdf perf vendor events intel: Update event list for Alderlake
Update JSON event list for Alderlake to perf.

It is a hybrid event list for both Atom and Core.

Based on JSON list v1.11:

https://download.01.org/perfmon/ADL/

Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Colin Ian King
8147f79ea5 perf inject: Fix spelling mistake "theads" -> "threads"
There is a spelling mistake in a pr_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
acfb65fe1d perf kwork: Add workqueue trace BPF support
Implements workqueue trace bpf function.

Test cases:

  # perf kwork -k workqueue lat -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)addrconf_verify_work        | 0002 |      5.856 ms |         1 |      5.856 ms |     111994.634313 s |     111994.640169 s |
    (w)vmstat_update               | 0001 |      1.247 ms |         1 |      1.247 ms |     111996.462651 s |     111996.463899 s |
    (w)neigh_periodic_work         | 0001 |      1.183 ms |         1 |      1.183 ms |     111996.462789 s |     111996.463973 s |
    (w)neigh_managed_work          | 0001 |      0.989 ms |         2 |      1.635 ms |     111996.462820 s |     111996.464455 s |
    (w)wb_workfn                   | 0000 |      0.667 ms |         1 |      0.667 ms |     111996.384273 s |     111996.384940 s |
    (w)bpf_prog_free_deferred      | 0001 |      0.495 ms |         1 |      0.495 ms |     111986.314201 s |     111986.314696 s |
    (w)mix_interrupt_randomness    | 0002 |      0.421 ms |         6 |      0.749 ms |     111995.927750 s |     111995.928499 s |
    (w)vmstat_shepherd             | 0000 |      0.374 ms |         2 |      0.385 ms |     111991.265242 s |     111991.265627 s |
    (w)e1000_watchdog              | 0002 |      0.356 ms |         5 |      0.390 ms |     111994.528380 s |     111994.528770 s |
    (w)vmstat_update               | 0000 |      0.231 ms |         2 |      0.365 ms |     111996.384407 s |     111996.384772 s |
    (w)flush_to_ldisc              | 0006 |      0.165 ms |         1 |      0.165 ms |     111995.930606 s |     111995.930771 s |
    (w)flush_to_ldisc              | 0000 |      0.094 ms |         2 |      0.095 ms |     111996.460453 s |     111996.460548 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)e1000_watchdog              | 0002 |      0.627 ms |         2 |      0.324 ms |     112002.720665 s |     112002.720989 s |
    (w)flush_to_ldisc              | 0007 |      0.598 ms |         2 |      0.534 ms |     112000.875226 s |     112000.875761 s |
    (w)wq_barrier_func             | 0007 |      0.492 ms |         1 |      0.492 ms |     112000.876981 s |     112000.877473 s |
    (w)flush_to_ldisc              | 0007 |      0.281 ms |         1 |      0.281 ms |     112005.826882 s |     112005.827163 s |
    (w)mix_interrupt_randomness    | 0002 |      0.229 ms |         3 |      0.102 ms |     112005.825671 s |     112005.825774 s |
    (w)vmstat_shepherd             | 0000 |      0.202 ms |         1 |      0.202 ms |     112001.504511 s |     112001.504713 s |
    (w)bpf_prog_free_deferred      | 0001 |      0.181 ms |         1 |      0.181 ms |     112000.883251 s |     112000.883432 s |
    (w)wb_workfn                   | 0007 |      0.130 ms |         1 |      0.130 ms |     112001.505195 s |     112001.505325 s |
    (w)vmstat_update               | 0000 |      0.053 ms |         1 |      0.053 ms |     112001.504763 s |     112001.504815 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
5a81927a40 perf kwork: Add softirq trace BPF support
Implements softirq trace bpf function.

Test cases:
Trace softirq latency without filter:

  # perf kwork -k softirq lat -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0005 |      0.281 ms |         3 |      0.338 ms |     111295.752222 s |     111295.752560 s |
    (s)RCU:9                       | 0002 |      0.262 ms |        24 |      1.400 ms |     111301.335986 s |     111301.337386 s |
    (s)SCHED:7                     | 0005 |      0.177 ms |        14 |      0.212 ms |     111295.752270 s |     111295.752481 s |
    (s)RCU:9                       | 0007 |      0.161 ms |        47 |      2.022 ms |     111295.402159 s |     111295.404181 s |
    (s)NET_RX:3                    | 0003 |      0.149 ms |        12 |      1.261 ms |     111301.192964 s |     111301.194225 s |
    (s)TIMER:1                     | 0001 |      0.105 ms |         9 |      0.198 ms |     111301.180191 s |     111301.180389 s |
    ... <SNIP> ...
    (s)NET_RX:3                    | 0002 |      0.098 ms |         6 |      0.124 ms |     111295.403760 s |     111295.403884 s |
    (s)SCHED:7                     | 0001 |      0.093 ms |        19 |      0.242 ms |     111301.180256 s |     111301.180498 s |
    (s)SCHED:7                     | 0007 |      0.078 ms |        15 |      0.188 ms |     111300.064226 s |     111300.064415 s |
    (s)SCHED:7                     | 0004 |      0.077 ms |        11 |      0.213 ms |     111301.361759 s |     111301.361973 s |
    (s)SCHED:7                     | 0000 |      0.063 ms |        33 |      0.805 ms |     111295.401811 s |     111295.402616 s |
    (s)SCHED:7                     | 0003 |      0.063 ms |        14 |      0.085 ms |     111301.192255 s |     111301.192340 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace softirq latency with cpu filter:

  # perf kwork -k softirq lat -b -C 1
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0001 |      0.178 ms |         5 |      0.572 ms |     111435.534135 s |     111435.534707 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace softirq latency with name filter:

  # perf kwork -k softirq lat -b -n SCHED
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)SCHED:7                     | 0001 |      0.295 ms |        15 |      2.183 ms |     111452.534950 s |     111452.537133 s |
    (s)SCHED:7                     | 0002 |      0.215 ms |        10 |      0.315 ms |     111460.000238 s |     111460.000553 s |
    (s)SCHED:7                     | 0005 |      0.190 ms |        29 |      0.338 ms |     111457.032538 s |     111457.032876 s |
    (s)SCHED:7                     | 0003 |      0.097 ms |        10 |      0.319 ms |     111452.434351 s |     111452.434670 s |
    (s)SCHED:7                     | 0006 |      0.089 ms |         1 |      0.089 ms |     111450.737450 s |     111450.737539 s |
    (s)SCHED:7                     | 0007 |      0.085 ms |        17 |      0.169 ms |     111452.471333 s |     111452.471502 s |
    (s)SCHED:7                     | 0004 |      0.071 ms |        15 |      0.221 ms |     111452.535252 s |     111452.535473 s |
    (s)SCHED:7                     | 0000 |      0.044 ms |        32 |      0.130 ms |     111460.001982 s |     111460.002112 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
420298aefe perf kwork: Add IRQ trace BPF support
Implements irq trace bpf function.

Test cases:
Trace irq without filter:

  # perf kwork -k irq rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     31.026 ms |       285 |      1.493 ms |     110326.049963 s |     110326.051456 s |
    eth0:10                        | 0002 |      7.875 ms |        96 |      1.429 ms |     110313.916835 s |     110313.918264 s |
    ata_piix:14                    | 0002 |      2.510 ms |        28 |      0.396 ms |     110331.367987 s |     110331.368383 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with cpu filter:

  # perf kwork -k irq rep -b -C 0
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     34.288 ms |       282 |      2.061 ms |     110358.078968 s |     110358.081029 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with name filter:

  # perf kwork -k irq rep -b -n eth0
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    eth0:10                        | 0002 |      2.184 ms |        21 |      0.572 ms |     110386.541699 s |     110386.542271 s |
   --------------------------------------------------------------------------------------------------------------------------------

Trace irq with summary:

  # perf kwork -k irq rep -b -S
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |     42.923 ms |       285 |      1.181 ms |     110418.128867 s |     110418.130049 s |
    eth0:10                        | 0002 |      2.085 ms |        20 |      0.668 ms |     110416.002935 s |     110416.003603 s |
    ata_piix:14                    | 0002 |      0.970 ms |         4 |      0.656 ms |     110424.034482 s |     110424.035138 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :       309
    Total runtime   (msec) :    45.977 (0.003% load average)
    Total time span (msec) : 17017.655
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k irq rep -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    nvme0q20:145                   | 0019 |      0.570 ms |        28 |      0.064 ms |      26966.635102 s |      26966.635167 s |
    amdgpu:162                     | 0002 |      0.568 ms |        29 |      0.068 ms |      26966.644346 s |      26966.644414 s |
    nvme0q4:129                    | 0003 |      0.565 ms |        31 |      0.037 ms |      26966.614830 s |      26966.614866 s |
    nvme0q16:141                   | 0015 |      0.205 ms |        66 |      0.012 ms |      26967.145161 s |      26967.145174 s |
    nvme0q29:154                   | 0028 |      0.154 ms |        44 |      0.014 ms |      26967.078970 s |      26967.078984 s |
    nvme0q10:135                   | 0009 |      0.134 ms |        43 |      0.011 ms |      26967.132093 s |      26967.132104 s |
    nvme0q2:127                    | 0001 |      0.132 ms |        26 |      0.011 ms |      26966.883584 s |      26966.883595 s |
    nvme0q25:150                   | 0024 |      0.127 ms |        32 |      0.014 ms |      26966.631419 s |      26966.631433 s |
    nvme0q14:139                   | 0013 |      0.110 ms |        21 |      0.017 ms |      26966.760843 s |      26966.760861 s |
    nvme0q30:155                   | 0029 |      0.102 ms |        30 |      0.022 ms |      26966.677171 s |      26966.677193 s |
    nvme0q13:138                   | 0012 |      0.088 ms |        20 |      0.015 ms |      26966.738733 s |      26966.738748 s |
    nvme0q6:131                    | 0005 |      0.087 ms |        13 |      0.020 ms |      26966.648445 s |      26966.648465 s |
    nvme0q28:153                   | 0027 |      0.066 ms |        12 |      0.015 ms |      26966.771431 s |      26966.771447 s |
    nvme0q26:151                   | 0025 |      0.060 ms |        13 |      0.012 ms |      26966.704266 s |      26966.704278 s |
    nvme0q21:146                   | 0020 |      0.054 ms |        20 |      0.011 ms |      26967.322082 s |      26967.322094 s |
    nvme0q1:126                    | 0000 |      0.046 ms |        11 |      0.013 ms |      26966.859754 s |      26966.859767 s |
    nvme0q17:142                   | 0016 |      0.046 ms |        10 |      0.011 ms |      26967.114513 s |      26967.114524 s |
    xhci_hcd:74                    | 0015 |      0.041 ms |         3 |      0.016 ms |      26967.086004 s |      26967.086020 s |
    nvme0q8:133                    | 0007 |      0.039 ms |        12 |      0.008 ms |      26966.712056 s |      26966.712063 s |
    nvme0q32:157                   | 0031 |      0.036 ms |        10 |      0.014 ms |      26966.627054 s |      26966.627068 s |
    nvme0q9:134                    | 0008 |      0.036 ms |        11 |      0.011 ms |      26967.258452 s |      26967.258462 s |
    nvme0q7:132                    | 0006 |      0.024 ms |         3 |      0.014 ms |      26966.767404 s |      26966.767418 s |
    nvme0q11:136                   | 0010 |      0.023 ms |         5 |      0.006 ms |      26966.935455 s |      26966.935461 s |
    nvme0q31:156                   | 0030 |      0.018 ms |         5 |      0.006 ms |      26966.627517 s |      26966.627524 s |
    nvme0q12:137                   | 0011 |      0.015 ms |         2 |      0.014 ms |      26966.799588 s |      26966.799602 s |
    enp5s0-rx-0:164                | 0006 |      0.009 ms |         2 |      0.005 ms |      26966.742024 s |      26966.742028 s |
    enp5s0-rx-1:165                | 0007 |      0.006 ms |         2 |      0.004 ms |      26966.939486 s |      26966.939490 s |
    enp5s0-tx-0:166                | 0008 |      0.005 ms |         1 |      0.005 ms |      26966.939484 s |      26966.939489 s |
    enp5s0-tx-1:167                | 0009 |      0.005 ms |         1 |      0.005 ms |      26966.939484 s |      26966.939489 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #t

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
daf07d2207 perf kwork: Implement BPF trace
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.

Using eBPF trace can process the data in kernel, which solves the
preceding two problems.

Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:

1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.

This commit implements the framework code and
does not add specific event support.

Test cases:

  # perf kwork rep -h

   Usage: perf kwork report [<options>]

      -b, --use-bpf         Use BPF to measure kwork runtime
      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -h

   Usage: perf kwork latency [<options>]

      -b, --use-bpf         Use BPF to measure kwork latency
      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -b
  Unsupported bpf trace class irq

  # perf kwork rep -b
  Unsupported bpf trace class irq

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
bcc8b3e88d perf kwork: Implement perf kwork timehist
Implements framework of perf kwork timehist,
to provide an analysis of kernel work events.

Test cases:

  # perf kwork tim
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        91576.060290       91576.060344  [0000]  (s)RCU:9                             0.055       0.111
        91576.061470       91576.061547  [0000]  (s)SCHED:7                           0.077       0.073
        91576.062604       91576.062697  [0001]  (s)RCU:9                             0.094       0.409
        91576.064443       91576.064517  [0002]  (s)RCU:9                             0.074       0.114
        91576.065144       91576.065211  [0000]  (s)SCHED:7                           0.067       0.058
        91576.066564       91576.066609  [0003]  (s)RCU:9                             0.045       0.110
        91576.068495       91576.068559  [0000]  (s)SCHED:7                           0.064       0.059
        91576.068900       91576.068996  [0004]  (s)RCU:9                             0.096       0.726
        91576.069364       91576.069420  [0002]  (s)RCU:9                             0.056       0.082
        91576.069649       91576.069701  [0004]  (s)RCU:9                             0.052       0.111
        91576.070147       91576.070206  [0000]  (s)SCHED:7                           0.060       0.057
        91576.073147       91576.073202  [0000]  (s)SCHED:7                           0.054       0.060
  <SNIP>

  # perf kwork tim --max-stack 2 -g
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        91576.060290       91576.060344  [0000]  (s)RCU:9                             0.055       0.111   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.061470       91576.061547  [0000]  (s)SCHED:7                           0.077       0.073   irq_exit_rcu <- sysvec_call_function_single
        91576.062604       91576.062697  [0001]  (s)RCU:9                             0.094       0.409   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.064443       91576.064517  [0002]  (s)RCU:9                             0.074       0.114   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.065144       91576.065211  [0000]  (s)SCHED:7                           0.067       0.058   irq_exit_rcu <- sysvec_call_function_single
        91576.066564       91576.066609  [0003]  (s)RCU:9                             0.045       0.110   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.068495       91576.068559  [0000]  (s)SCHED:7                           0.064       0.059   irq_exit_rcu <- sysvec_call_function_single
        91576.068900       91576.068996  [0004]  (s)RCU:9                             0.096       0.726   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.069364       91576.069420  [0002]  (s)RCU:9                             0.056       0.082   irq_exit_rcu <- sysvec_apic_timer_interrupt
        91576.069649       91576.069701  [0004]  (s)RCU:9                             0.052       0.111   irq_exit_rcu <- sysvec_apic_timer_interrupt
  <SNIP>

Committer testing:

  # perf kwork -k workqueue timehist | head -40
   Runtime start      Runtime end        Cpu     Kwork name                      Runtime     Delaytime
                                                 (TYPE)NAME:NUM                  (msec)      (msec)
   -----------------  -----------------  ------  ------------------------------  ----------  ----------
        26520.211825       26520.211832  [0019]  (w)free_work                         0.007       0.004
        26520.212929       26520.212934  [0020]  (w)free_work                         0.005       0.004
        26520.213226       26520.213228  [0014]  (w)kfree_rcu_work                    0.002       0.004
        26520.214057       26520.214061  [0021]  (w)free_work                         0.004       0.004
        26520.221239       26520.221241  [0007]  (w)kfree_rcu_work                    0.002       0.009
        26520.223232       26520.223238  [0013]  (w)psi_avgs_work                     0.005       0.006
        26520.230057       26520.230060  [0020]  (w)free_work                         0.003       0.003
        26520.270428       26520.270434  [0015]  (w)free_work                         0.006       0.004
        26520.270546       26520.270550  [0014]  (w)free_work                         0.004       0.003
        26520.281626       26520.281629  [0015]  (w)free_work                         0.003       0.002
        26520.287225       26520.287230  [0012]  (w)psi_avgs_work                     0.005       0.008
        26520.287231       26520.287235  [0001]  (w)psi_avgs_work                     0.004       0.011
        26520.287236       26520.287239  [0001]  (w)psi_avgs_work                     0.003       0.012
        26520.329488       26520.329492  [0024]  (w)free_work                         0.004       0.004
        26520.330600       26520.330605  [0007]  (w)free_work                         0.005       0.004
        26520.334218       26520.334218  [0007]  (w)kfree_rcu_monitor                 0.001       0.002
        26520.335220       26520.335221  [0005]  (w)kfree_rcu_monitor                 0.001       0.004
        26520.343980       26520.343985  [0007]  (w)free_work                         0.005       0.002
        26520.345093       26520.345097  [0006]  (w)free_work                         0.004       0.003
        26520.351233       26520.351238  [0027]  (w)psi_avgs_work                     0.005       0.008
        26520.353228       26520.353229  [0007]  (w)kfree_rcu_work                    0.001       0.002
        26520.353229       26520.353231  [0005]  (w)kfree_rcu_work                    0.001       0.006
        26520.382381       26520.382383  [0006]  (w)free_work                         0.003       0.002
        26520.386547       26520.386548  [0006]  (w)free_work                         0.002       0.001
        26520.391243       26520.391245  [0015]  (w)console_callback                  0.002       0.016
        26520.415369       26520.415621  [0027]  (w)btrfs_work_helper                 0.252
        26520.415351       26520.416174  [0002]  (w)btrfs_work_helper                 0.823       0.037
        26520.415343       26520.416304  [0031]  (w)btrfs_work_helper                 0.961
        26520.415335       26520.417078  [0001]  (w)btrfs_work_helper                 1.743
        26520.415250       26520.417564  [0002]  (w)wb_workfn                         2.314
        26520.424777       26520.424787  [0002]  (w)btrfs_work_helper                 0.010
        26520.424788       26520.424798  [0002]  (w)btrfs_work_helper                 0.010
        26520.424790       26520.424805  [0001]  (w)btrfs_work_helper                 0.016       0.016
        26520.424801       26520.424807  [0002]  (w)btrfs_work_helper                 0.006
        26520.424809       26520.424831  [0002]  (w)btrfs_work_helper                 0.022       0.030
        26520.424824       26520.424835  [0027]  (w)btrfs_work_helper                 0.011
        26520.424809       26520.424867  [0001]  (w)btrfs_work_helper                 0.059       0.032
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-14-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
53e49e32ae perf kwork: Add workqueue latency support
Implements workqueue latency function.

Test cases:

  # perf kwork -k workqueue lat

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)vmstat_update               | 0001 |      5.004 ms |         1 |      5.004 ms |      44001.745646 s |      44001.750650 s |
    (w)vmstat_update               | 0006 |      1.773 ms |         1 |      1.773 ms |      44000.830840 s |      44000.832613 s |
    (w)vmstat_shepherd             | 0000 |      0.992 ms |         8 |      2.474 ms |      44007.717845 s |      44007.720318 s |
    (w)vmstat_update               | 0000 |      0.974 ms |         5 |      2.624 ms |      44004.785970 s |      44004.788594 s |
    (w)e1000_watchdog              | 0002 |      0.687 ms |         5 |      2.632 ms |      44005.009334 s |      44005.011966 s |
    (w)vmstat_update               | 0002 |      0.307 ms |         1 |      0.307 ms |      44004.817395 s |      44004.817702 s |
    (w)vmstat_update               | 0004 |      0.296 ms |         1 |      0.296 ms |      43997.913677 s |      43997.913973 s |
    (w)mix_interrupt_randomness    | 0000 |      0.283 ms |       285 |      3.724 ms |      44006.790889 s |      44006.794613 s |
    (w)neigh_managed_work          | 0001 |      0.271 ms |         1 |      0.271 ms |      43997.665542 s |      43997.665813 s |
    (w)vmstat_update               | 0005 |      0.261 ms |         1 |      0.261 ms |      44007.820542 s |      44007.820803 s |
    (w)neigh_managed_work          | 0004 |      0.220 ms |         1 |      0.220 ms |      44002.953287 s |      44002.953507 s |
    (w)neigh_periodic_work         | 0004 |      0.217 ms |         1 |      0.217 ms |      43999.929718 s |      43999.929935 s |
    (w)mix_interrupt_randomness    | 0002 |      0.199 ms |         5 |      0.310 ms |      44005.012316 s |      44005.012625 s |
    (w)vmstat_update               | 0003 |      0.199 ms |         4 |      0.307 ms |      44005.714391 s |      44005.714699 s |
    (w)gc_worker                   | 0001 |      0.071 ms |       173 |      1.128 ms |      44002.062579 s |      44002.063707 s |
   --------------------------------------------------------------------------------------------------------------------------------
    INFO: 0.020% skipped events (17 including 10 raise, 7 entry, 0 exit)

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-13-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
19807bba5a perf kwork: Add softirq latency support
Implements softirq latency function.

Test cases:

  # perf kwork -k softirq lat

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0006 |      1.048 ms |         1 |      1.048 ms |      44000.829759 s |      44000.830807 s |
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)SCHED:7                     | 0000 |      0.158 ms |       677 |      2.639 ms |      44004.785716 s |      44004.788355 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
    (s)TIMER:1                     | 0005 |      0.128 ms |         1 |      0.128 ms |      44007.820346 s |      44007.820474 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -C 1,2

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)NET_RX:3                    | 0002 |      0.106 ms |         5 |      0.163 ms |      44005.012255 s |      44005.012418 s |
    (s)TIMER:1                     | 0002 |      0.084 ms |         9 |      0.114 ms |      44005.009168 s |      44005.009282 s |
    (s)SCHED:7                     | 0001 |      0.049 ms |       655 |      0.837 ms |      44005.707998 s |      44005.708835 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -n RCU

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)RCU:9                       | 0004 |      0.237 ms |        26 |      0.792 ms |      43997.683018 s |      43997.683810 s |
    (s)RCU:9                       | 0007 |      0.217 ms |       140 |      1.335 ms |      43997.671080 s |      43997.672415 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat -s count,avg -n RCU

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)RCU:9                       | 0003 |      0.129 ms |      2193 |      1.507 ms |      44006.010208 s |      44006.011715 s |
    (s)RCU:9                       | 0001 |      0.216 ms |      1619 |      3.659 ms |      43997.662069 s |      43997.665727 s |
    (s)RCU:9                       | 0002 |      0.141 ms |       932 |      1.662 ms |      44005.010206 s |      44005.011868 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)RCU:9                       | 0007 |      0.217 ms |       140 |      1.335 ms |      43997.671080 s |      43997.672415 s |
    (s)RCU:9                       | 0005 |      0.245 ms |        88 |      1.866 ms |      43997.683105 s |      43997.684971 s |
    (s)RCU:9                       | 0004 |      0.237 ms |        26 |      0.792 ms |      43997.683018 s |      43997.683810 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k softirq lat --time 43997,

    Kwork Name                     | Cpu  | Avg delay     | Count     | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0006 |      1.048 ms |         1 |      1.048 ms |      44000.829759 s |      44000.830807 s |
    (s)TIMER:1                     | 0001 |      1.008 ms |         4 |      3.434 ms |      43997.662069 s |      43997.665503 s |
    (s)RCU:9                       | 0006 |      0.675 ms |         7 |      1.328 ms |      43997.670304 s |      43997.671632 s |
    (s)RCU:9                       | 0000 |      0.414 ms |       701 |      3.996 ms |      43997.661170 s |      43997.665167 s |
    (s)TIMER:1                     | 0004 |      0.083 ms |        21 |      0.127 ms |      44004.969171 s |      44004.969298 s |
    ... <SNIP> ...
    (s)SCHED:7                     | 0005 |      0.050 ms |         4 |      0.086 ms |      43997.684852 s |      43997.684938 s |
    (s)SCHED:7                     | 0001 |      0.049 ms |       655 |      0.837 ms |      44005.707998 s |      44005.708835 s |
    (s)SCHED:7                     | 0007 |      0.044 ms |       171 |      0.077 ms |      43997.943265 s |      43997.943342 s |
    (s)SCHED:7                     | 0002 |      0.040 ms |      1731 |      0.211 ms |      44005.009237 s |      44005.009447 s |
   --------------------------------------------------------------------------------------------------------------------------------

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-12-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
ad3d9f7a92 perf kwork: Implement perf kwork latency
Implements framework of perf kwork latency, which is used to report time
properties such as delay time and frequency.

Test cases:

  # perf kwork lat -h

   Usage: perf kwork latency [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat -C 199
  Requested CPU 199 too large. Consider raising MAX_NR_CPUS
  Invalid cpu bitmap

  # perf kwork lat -i perf_no_exist.data
  failed to open perf_no_exist.data: No such file or directory

  # perf kwork lat -s avg1
    Error: Unknown --sort key: `avg1'

   Usage: perf kwork latency [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): avg, max, count
          --time <str>      Time span for analysis (start,stop)

  # perf kwork lat --time FFFF,
  Invalid time span

  # perf kwork lat

    Kwork Name                     | Cpu  | Avg delay     | Count    | Max delay     | Max delay start     | Max delay end       |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------
    INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit)

Since there are no latency-enabled events, the output is empty.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
8dbc3c8689 perf kwork: Add workqueue report support
Implements workqueue report function.

Test cases:

  # perf kwork -k workqueue rep

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)gc_worker                   | 0001 |   1912.389 ms |       173 |     12.896 ms |      44002.050787 s |      44002.063683 s |
    (w)mix_interrupt_randomness    | 0000 |     24.308 ms |       285 |      3.349 ms |      44004.784908 s |      44004.788257 s |
    (w)e1000_watchdog              | 0002 |      5.332 ms |         5 |      2.059 ms |      44000.914366 s |      44000.916424 s |
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_shepherd             | 0000 |      0.964 ms |         8 |      0.195 ms |      43997.986453 s |      43997.986648 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)mix_interrupt_randomness    | 0002 |      0.114 ms |         5 |      0.037 ms |      44005.012625 s |      44005.012662 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)neigh_periodic_work         | 0004 |      0.039 ms |         1 |      0.039 ms |      43999.929935 s |      43999.929974 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)neigh_managed_work          | 0001 |      0.036 ms |         1 |      0.036 ms |      43997.665813 s |      43997.665849 s |
    (w)neigh_managed_work          | 0004 |      0.036 ms |         1 |      0.036 ms |      44002.953507 s |      44002.953543 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)gc_worker                   | 0001 |   1912.389 ms |       173 |     12.896 ms |      44002.050787 s |      44002.063683 s |
    (w)mix_interrupt_randomness    | 0000 |     24.308 ms |       285 |      3.349 ms |      44004.784908 s |      44004.788257 s |
    (w)e1000_watchdog              | 0002 |      5.332 ms |         5 |      2.059 ms |      44000.914366 s |      44000.916424 s |
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_shepherd             | 0000 |      0.964 ms |         8 |      0.195 ms |      43997.986453 s |      43997.986648 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)mix_interrupt_randomness    | 0002 |      0.114 ms |         5 |      0.037 ms |      44005.012625 s |      44005.012662 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)neigh_periodic_work         | 0004 |      0.039 ms |         1 |      0.039 ms |      43999.929935 s |      43999.929974 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)neigh_managed_work          | 0001 |      0.036 ms |         1 |      0.036 ms |      43997.665813 s |      43997.665849 s |
    (w)neigh_managed_work          | 0004 |      0.036 ms |         1 |      0.036 ms |      44002.953507 s |      44002.953543 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :       500
    Total runtime   (msec) :  1945.085 (0.192% load average)
    Total time span (msec) : 10155.026
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork -k workqueue rep -n vmstat_update

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)vmstat_update               | 0005 |      0.989 ms |         2 |      0.953 ms |      43997.986991 s |      43997.987944 s |
    (w)vmstat_update               | 0003 |      0.306 ms |         6 |      0.077 ms |      44004.689543 s |      44004.689620 s |
    (w)vmstat_update               | 0000 |      0.196 ms |         5 |      0.049 ms |      44005.713732 s |      44005.713781 s |
    (w)vmstat_update               | 0001 |      0.162 ms |         2 |      0.130 ms |      44000.192034 s |      44000.192164 s |
    (w)vmstat_update               | 0002 |      0.084 ms |         2 |      0.043 ms |      44004.817702 s |      44004.817745 s |
    (w)vmstat_update               | 0006 |      0.067 ms |         2 |      0.041 ms |      43997.987214 s |      43997.987254 s |
    (w)vmstat_update               | 0007 |      0.037 ms |         1 |      0.037 ms |      43997.988969 s |      43997.989006 s |
    (w)vmstat_update               | 0004 |      0.027 ms |         1 |      0.027 ms |      43997.913973 s |      43997.914000 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k workqueue rep -C 1 | head -20

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (w)commit_work                 | 0001 |     25.896 ms |         2 |     13.200 ms |      26522.906700 s |      26522.919900 s |
    (w)commit_work                 | 0001 |     13.316 ms |         1 |     13.316 ms |      26522.573246 s |      26522.586562 s |
    (w)commit_work                 | 0001 |     13.177 ms |         1 |     13.177 ms |      26522.673406 s |      26522.686583 s |
    (w)commit_work                 | 0001 |     12.630 ms |         1 |     12.630 ms |      26522.123921 s |      26522.136551 s |
    (w)btrfs_work_helper           | 0001 |      3.544 ms |         1 |      3.544 ms |      26529.131296 s |      26529.134840 s |
    (w)btrfs_work_helper           | 0001 |      3.330 ms |         1 |      3.330 ms |      26529.137698 s |      26529.141028 s |
    (w)btrfs_work_helper           | 0001 |      2.855 ms |         1 |      2.855 ms |      26529.134842 s |      26529.137697 s |
    (w)btrfs_work_helper           | 0001 |      2.757 ms |         1 |      2.757 ms |      26529.124086 s |      26529.126843 s |
    (w)btrfs_work_helper           | 0001 |      2.182 ms |         1 |      2.182 ms |      26529.141030 s |      26529.143212 s |
    (w)btrfs_work_helper           | 0001 |      1.743 ms |         1 |      1.743 ms |      26520.415335 s |      26520.417078 s |
    (w)btrfs_work_helper           | 0001 |      1.499 ms |         1 |      1.499 ms |      26529.127774 s |      26529.129272 s |
    (w)btrfs_work_helper           | 0001 |      1.446 ms |         1 |      1.446 ms |      26529.129848 s |      26529.131294 s |
    (w)btrfs_work_helper           | 0001 |      1.373 ms |         1 |      1.373 ms |      26523.808270 s |      26523.809643 s |
    (w)wb_workfn                   | 0001 |      1.165 ms |         2 |      0.763 ms |      26527.071056 s |      26527.071819 s |
    (w)btrfs_work_helper           | 0001 |      0.926 ms |         1 |      0.926 ms |      26529.126846 s |      26529.127771 s |
    (w)btrfs_work_helper           | 0001 |      0.571 ms |         1 |      0.571 ms |      26529.129275 s |      26529.129846 s |
    (w)wb_workfn                   | 0001 |      0.525 ms |         1 |      0.525 ms |      26522.975151 s |      26522.975676 s |
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-10-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
4c14819169 perf kwork: Add softirq report support
Implements softirq kwork report function.

Test cases:

  # perf kwork -k softirq rep

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0004 |      0.830 ms |        26 |      0.058 ms |      43997.666418 s |      43997.666476 s |
    (s)TIMER:1                     | 0001 |      0.471 ms |         4 |      0.158 ms |      44007.834694 s |      44007.834852 s |
    (s)RCU:9                       | 0006 |      0.220 ms |         7 |      0.048 ms |      44004.833764 s |      44004.833812 s |
    (s)NET_RX:3                    | 0002 |      0.164 ms |         5 |      0.049 ms |      44005.012418 s |      44005.012466 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #
  # perf kwork -k softirq rep -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0000 |     63.631 ms |       680 |      2.690 ms |      44006.721976 s |      44006.724666 s |
    ... <SNIP> ...
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    (s)RCU:9                       | 0006 |      0.220 ms |         7 |      0.048 ms |      44004.833764 s |      44004.833812 s |
    (s)NET_RX:3                    | 0002 |      0.164 ms |         5 |      0.049 ms |      44005.012418 s |      44005.012466 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :     12748
    Total runtime   (msec) :   661.433 (0.065% load average)
    Total time span (msec) : 10176.441
   --------------------------------------------------------------------------------------------------------------------------------

  #
  # perf kwork -k softirq rep -s count,max

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    (s)TIMER:1                     | 0003 |    181.387 ms |      2476 |      1.240 ms |      44004.787960 s |      44004.789201 s |
    (s)RCU:9                       | 0003 |     91.573 ms |      2193 |      0.650 ms |      44004.790258 s |      44004.790908 s |
    (s)SCHED:7                     | 0002 |     50.039 ms |      1731 |      0.074 ms |      44005.009447 s |      44005.009521 s |
    (s)RCU:9                       | 0001 |     78.960 ms |      1619 |      1.195 ms |      44001.496553 s |      44001.497749 s |
    (s)SCHED:7                     | 0003 |     55.962 ms |      1255 |      0.954 ms |      44004.812008 s |      44004.812962 s |
    ... <SNIP> ...
    (s)RCU:9                       | 0002 |     35.241 ms |       932 |      0.407 ms |      44005.009541 s |      44005.009949 s |
    (s)RCU:9                       | 0000 |     45.710 ms |       702 |      1.144 ms |      44004.787023 s |      44004.788167 s |
    (s)SCHED:7                     | 0006 |      0.080 ms |         2 |      0.044 ms |      43997.826145 s |      43997.826189 s |
    (s)TIMER:1                     | 0005 |      0.164 ms |         1 |      0.164 ms |      44007.820474 s |      44007.820638 s |
    (s)TIMER:1                     | 0006 |      0.087 ms |         1 |      0.087 ms |      44000.830807 s |      44000.830894 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork -k softirq report -C 2 -s count,max

  Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
 --------------------------------------------------------------------------------------------------------------------------------
  (s)SCHED:7                     | 0002 |      0.980 ms |       159 |      0.024 ms |      26035.571037 s |      26035.571061 s |
  (s)RCU:9                       | 0002 |      0.124 ms |        88 |      0.021 ms |      26035.177050 s |      26035.177071 s |
  (s)TIMER:1                     | 0002 |      0.122 ms |        56 |      0.007 ms |      26035.468045 s |      26035.468052 s |
 --------------------------------------------------------------------------------------------------------------------------------

  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-9-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
94348520c6 perf kwork: Add irq report support
Implements irq kwork report function.

Test cases:

  # perf kwork record -- sleep 10
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 6.134 MB perf.data ]

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 2

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 3

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -i perf.data

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -s max,freq

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |   1167.501 ms |     18284 |      1.096 ms |      44004.464905 s |      44004.466001 s |
    eth0:10                        | 0002 |      0.185 ms |         5 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :     18289
    Total runtime   (msec) :  1167.686 (0.115% load average)
    Total time span (msec) : 10159.155
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report --time 44005,

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    virtio0-requests:25            | 0000 |    402.173 ms |      4695 |      0.981 ms |      44007.831992 s |      44007.832973 s |
    eth0:10                        | 0002 |      0.089 ms |         2 |      0.058 ms |      44005.012222 s |      44005.012280 s |
   --------------------------------------------------------------------------------------------------------------------------------

Committer testing:

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
    nvme0q5:130                    | 0004 |      1.101 ms |        49 |      0.051 ms |      26035.056403 s |      26035.056455 s |
    amdgpu:162                     | 0002 |      0.176 ms |         9 |      0.046 ms |      26035.268020 s |      26035.268066 s |
    nvme0q24:149                   | 0023 |      0.161 ms |        55 |      0.009 ms |      26035.655280 s |      26035.655288 s |
    nvme0q20:145                   | 0019 |      0.090 ms |        33 |      0.014 ms |      26035.939018 s |      26035.939032 s |
    nvme0q31:156                   | 0030 |      0.075 ms |        21 |      0.010 ms |      26035.052237 s |      26035.052247 s |
    nvme0q8:133                    | 0007 |      0.062 ms |        12 |      0.021 ms |      26035.416840 s |      26035.416861 s |
    nvme0q6:131                    | 0005 |      0.054 ms |        22 |      0.010 ms |      26035.199919 s |      26035.199929 s |
    nvme0q19:144                   | 0018 |      0.052 ms |        14 |      0.010 ms |      26035.110615 s |      26035.110625 s |
    nvme0q7:132                    | 0006 |      0.049 ms |        13 |      0.007 ms |      26035.125180 s |      26035.125187 s |
    nvme0q18:143                   | 0017 |      0.033 ms |        14 |      0.007 ms |      26035.169698 s |      26035.169705 s |
    nvme0q17:142                   | 0016 |      0.013 ms |         1 |      0.013 ms |      26035.565147 s |      26035.565160 s |
    enp5s0-rx-0:164                | 0006 |      0.004 ms |         4 |      0.002 ms |      26035.928882 s |      26035.928884 s |
    enp5s0-tx-0:166                | 0008 |      0.003 ms |         3 |      0.002 ms |      26035.870923 s |      26035.870925 s |
   --------------------------------------------------------------------------------------------------------------------------------

  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-8-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:31:54 -03:00
Yang Jihong
f98919ec4f perf kwork: Implement 'report' subcommand
Implements framework of 'perf kwork report', which is used to report
time properties such as run time and frequency:

Test cases:

  # perf kwork

   Usage: perf kwork [<options>] {record|report}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, etc)
      -v, --verbose         be more verbose (show symbol address, etc)

  # perf kwork report -h

   Usage: perf kwork report [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork report

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -S

    Kwork Name                     | Cpu  | Total Runtime | Count     | Max runtime   | Max runtime start   | Max runtime end     |
   --------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------------------------------------------------------------------------------------
    Total count            :         0
    Total runtime   (msec) :     0.000 (0.000% load average)
    Total time span (msec) :     0.000
   --------------------------------------------------------------------------------------------------------------------------------

  # perf kwork report -C 0,100
  Requested CPU 100 too large. Consider raising MAX_NR_CPUS
  Invalid cpu bitmap

  # perf kwork report -s runtime1
    Error: Unknown --sort key: `runtime1'

   Usage: perf kwork report [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): runtime, max, count
      -S, --with-summary    Show summary with statistics
          --time <str>      Time span for analysis (start,stop)

  # perf kwork report -i perf_no_exist.data
  failed to open perf_no_exist.data: No such file or directory

  # perf kwork report --time 00FFF,
  Invalid time span

Since there are no report supported events, the output is empty.

Briefly describe the data structure:

1. "class" indicates event type. For example, irq and softiq correspond
to different types.

2. "cluster" refers to a specific event corresponding to a type. For
example, RCU and TIMER in softirq correspond to different clusters,
which contains three types of events: raise, entry, and exit.

3. "atom" includes time of each sample and sample of the previous phase.
(For example, exit corresponds to entry, which is used for timehist.)

Committer notes:

- Add {} for multiline if blocks.

- report_print_work() should either return that ret variable that
  accounts how many bytes were printed or stop accounting and be void.
  Do the former for now to avoid this:

builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable]
        int ret = 0;
            ^
1 error generated.

  When building with:

  ⬢[acme@toolbox perf]$ clang --version
  clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076)

Also:

  -       if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
  +       if (dst_type < KWORK_TRACE_MAX) {

Several versions of clang and at least this gcc:

   3    51.40 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is
          always true [-Werror,-Wtautological-compare]
            if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {

As the first entry in a enum is zero.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:30:05 -03:00
Yang Jihong
97179d9d08 perf kwork: Add workqueue kwork record support
Record workqueue events workqueue:workqueue_activate_work,
workqueue:workqueue_execute_start & workqueue:workqueue_execute_end

Tese cases:
Record all events:

  # perf kwork record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.857 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  irq:irq_handler_entry
  irq:irq_handler_exit
  irq:softirq_raise
  irq:softirq_entry
  irq:softirq_exit
  workqueue:workqueue_activate_work
  workqueue:workqueue_execute_start
  workqueue:workqueue_execute_end
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Record workqueue events:

  # perf kwork -k workqueue record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.081 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  workqueue:workqueue_activate_work
  workqueue:workqueue_execute_start
  workqueue:workqueue_execute_end
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Committer testing:

  # perf kwork record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 3.430 MB perf.data (24130 samples) ]
  # perf evlist -v
  irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_activate_work: type: 2, size: 128, config: 0x106, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_execute_start: type: 2, size: 128, config: 0x105, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  workqueue:workqueue_execute_end: type: 2, size: 128, config: 0x104, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  # perf script | grep workqueue | head
           swapper     0 [018] 26035.043289: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368
   kworker/18:2-ev 70440 [018] 26035.043293: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work
   kworker/18:2-ev 70440 [018] 26035.043301:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work
           swapper     0 [021] 26035.044704: workqueue:workqueue_activate_work: work struct 0xffff8b8ffef6e368
   kworker/21:0-ev 4080535 [021] 26035.044709: workqueue:workqueue_execute_start: work struct 0xffff8b8ffef6e368: function free_work
   kworker/21:0-ev 4080535 [021] 26035.044716:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffef6e368: function free_work
           swapper     0 [018] 26035.045230: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368
   kworker/18:2-ev 70440 [018] 26035.045232: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work
   kworker/18:2-ev 70440 [018] 26035.045235:   workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work
           swapper     0 [001] 26035.052046: workqueue:workqueue_activate_work: work struct 0xffff8b8108901590
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-5-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:02:04 -03:00
Yang Jihong
e643932190 perf kwork: Add softirq kwork record support
Record softirq events irq:softirq_raise, irq:softirq_entry &
irq:softirq_exit.

Test cases:
Record all events:

  # perf kwork record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.897 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  irq:irq_handler_entry
  irq:irq_handler_exit
  irq:softirq_raise
  irq:softirq_entry
  irq:softirq_exit
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Record softirq events:

  # perf kwork -k softirq record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.141 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  irq:softirq_raise
  irq:softirq_entry
  irq:softirq_exit
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Committer testing:

  # perf kwork record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 3.078 MB perf.data (17433 samples) ]
  # perf evlist -v
  irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1
  dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  # perf script | head
      migration/12    73 [012] 25884.940992:     irq:softirq_raise: vec=9 [action=RCU]
      migration/12    73 [012] 25884.940994:     irq:softirq_entry: vec=9 [action=RCU]
      migration/12    73 [012] 25884.940995:      irq:softirq_exit: vec=9 [action=RCU]
           swapper     0 [004] 25884.940995:     irq:softirq_raise: vec=9 [action=RCU]
           swapper     0 [004] 25884.940998:     irq:softirq_entry: vec=9 [action=RCU]
           swapper     0 [004] 25884.940999:      irq:softirq_exit: vec=9 [action=RCU]
               cc1 71212 [021] 25884.941990:     irq:softirq_raise: vec=9 [action=RCU]
           swapper     0 [004] 25884.941991:     irq:softirq_raise: vec=9 [action=RCU]
               cc1 71212 [021] 25884.941992:     irq:softirq_raise: vec=7 [action=SCHED]
         perf-exec 71208 [013] 25884.941992:     irq:softirq_raise: vec=9 [action=RCU]
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-4-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:01:54 -03:00
Yang Jihong
4f8ae962f0 perf kwork: Add irq kwork record support
Record interrupt events irq:irq_handler_entry & irq_handler_exit

Test cases:

 # perf kwork record -o perf_kwork.date -- sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.556 MB perf_kwork.date ]
  #
  # perf evlist -i perf_kwork.date
  irq:irq_handler_entry
  irq:irq_handler_exit
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:01:29 -03:00
Yang Jihong
0f70d8e9db perf kwork: New tool to trace time properties of kernel work (such as softirq, and workqueue)
The 'perf kwork' tool is used to trace time properties of kernel work
(such as irq, softirq, and workqueue), including runtime, latency, and
timehist, using the infrastructure in the perf tools to allow tracing
extra targets.

This is the first commit to reuse the 'perf record' framework code to
implement a simple record function, kwork is not supported currently.

Test cases:

  # perf

   usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

   The most commonly used perf commands are:
  <SNIP>
     iostat          Show I/O performance metrics
     kallsyms        Searches running kernel for symbols
     kmem            Tool to trace/measure kernel memory properties
     kvm             Tool to trace/measure kvm guest os
     kwork           Tool to trace/measure kernel work properties (latencies)
     list            List all symbolic event types
     lock            Analyze lock events
     mem             Profile memory accesses
     record          Run a command and record its profile into perf.data
  <SNIP>
   See 'perf help COMMAND' for more information on a specific command.

  # perf kwork

   Usage: perf kwork [<options>] {record}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -k, --kwork <kwork>   list of kwork to profile
      -v, --verbose         be more verbose (show symbol address, etc)

  # perf kwork record -- sleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.787 MB perf.data ]

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-2-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-26 16:01:24 -03:00
Arnaldo Carvalho de Melo
ade5353950 perf data: Add missing unistd.h header needed for pid_t
Noticed when processing 'perf kwork' that includes util/data.h without,
by luck, having included unistd.h indirectly to get the pid_t typedef.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 18:10:43 -03:00
Namhyung Kim
1ab55323c5 perf lock: Support -t option for 'contention' subcommand
Like perf lock report, it can report lock contention stat of each task.

  $ perf lock contention -t
   contended   total wait     max wait     avg wait          pid   comm

           5    945.20 us    902.08 us    189.04 us       316167   EventManager_De
          33     98.17 us      6.78 us      2.97 us       766063   kworker/0:1-get
           7     92.47 us     61.26 us     13.21 us       316170   EventManager_De
          14     76.31 us     12.87 us      5.45 us        12949   timedcall
          24     76.15 us     12.27 us      3.17 us       767992   sched-pipe
          15     75.62 us     11.93 us      5.04 us        15127   switchto-defaul
          24     71.84 us      5.59 us      2.99 us       629168   kworker/u513:2-
          17     67.41 us      7.94 us      3.96 us        13504   coroner-
           1     59.56 us     59.56 us     59.56 us       316165   EventManager_De
          14     56.21 us      6.89 us      4.01 us            0   swapper

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 17:58:42 -03:00
Namhyung Kim
79079f21f5 perf lock: Add -k and -F options to 'contention' subcommand
Like perf lock report, add -k/--key and -F/--field options to control
output formatting and sorting.  Note that it has slightly different
default options as some fields are not available and to optimize the
screen space.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 17:58:14 -03:00
Namhyung Kim
528b9cab3b perf lock: Add 'contention' subcommand
The 'perf lock contention' processes the lock contention events and
displays the result like perf lock report.  Right now, there's not
much difference between the two but the lock contention specific
features will come soon.

  $ perf lock contention
   contended   total wait     max wait     avg wait         type   caller

         238      1.41 ms     29.20 us      5.94 us     spinlock   update_blocked_averages+0x4c
           1    902.08 us    902.08 us    902.08 us      rwsem:R   do_user_addr_fault+0x1dd
          81    330.30 us     17.24 us      4.08 us     spinlock   _nohz_idle_balance+0x172
           2     89.54 us     61.26 us     44.77 us     spinlock   do_anonymous_page+0x16d
          24     78.36 us     12.27 us      3.27 us        mutex   pipe_read+0x56
           2     71.58 us     59.56 us     35.79 us     spinlock   __handle_mm_fault+0x6aa
           6     25.68 us      6.89 us      4.28 us     spinlock   do_idle+0x28d
           1     18.46 us     18.46 us     18.46 us      rtmutex   exec_fw_cmd+0x21b
           3     15.25 us      6.26 us      5.08 us     spinlock   tick_do_update_jiffies64+0x2c

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 17:55:51 -03:00
Namhyung Kim
f9c695a211 perf lock: Add lock aggregation enum
Introduce the aggr_mode variable to prepare a later code change.

The default is LOCK_AGGR_ADDR which aggregates the result for the lock
instances.

When -t/--threads option is given, it'd be set to LOCK_AGGR_TASK.  The
LOCK_AGGR_CALLER is for the contention analysis and it'd aggregate the
stat by comparing the callstacks.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 17:54:23 -03:00
Namhyung Kim
fb87158bab perf lock: Add flags field in the lock_stat
For lock contention tracepoint analysis, it needs to keep the flags.
As nr_readlock and nr_trylock fields are not used for it, let's make
it a union.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 17:52:22 -03:00
Ian Rogers
6923397cb7 perf test: Add test for #system_tsc_freq in metrics
The value should be non-zero on Intel while zero on everything else.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 12:29:07 -03:00
Ian Rogers
1276ade6a5 perf tsc: Add cpuinfo fall back for arch_get_tsc_freq()
The CPUID method of arch_get_tsc_freq fails for older Intel processors,
such as Skylake. Compute using /proc/cpuinfo.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 12:29:07 -03:00
Kan Liang
bc2373a58a perf tsc: Add arch TSC frequency information
The TSC frequency information is required for the event metrics with the
literal, system_tsc_freq. For the newer Intel platform, the TSC
frequency information can be retrieved from the CPUID leaf 0x15.  If the
TSC frequency information isn't present the /proc/cpuinfo approach is
used.

Refactor cpuid() for this use. Note, the previous stack pushing/popping
approach was broken on x86-64 that has stack red zones that would be
clobbered.

Committer testing:

Before:

  $ perf record sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  $ perf report --header-only |& grep cpuid
  # cpuid : AuthenticAMD,25,33,0
  $

After the patch:

  $ perf record sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ]
  $ perf report --header-only |& grep cpuid
  # cpuid : AuthenticAMD,25,33,0
  $

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-25 12:28:00 -03:00
Namhyung Kim
9fe9b252c7 perf lock: Fix a copy-n-paste bug
It should be lock_text_end instead of _start.

Fixes: 0d2997f750 ("perf lock: Look up callchain for the contended locks")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220721043644.153718-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-21 15:58:12 -03:00
Arnaldo Carvalho de Melo
41d0914d86 perf python: Ignore unused command line arguments when building with clang
Noticed after switching to python3 by default on some older fedora
releases:

  35    38.20 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
    error: command 'clang' failed with exit status 1

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 15:14:39 -03:00
Arnaldo Carvalho de Melo
f077c77699 perf build: Avoid defining _FORTIFY_SOURCE multiple times
One in perf's CFLAGS and the other in the distro python binding
scripts.

So if use the usual technique of first -D_FORTIFY_SOURCE then -D it.

Noticed with:

opensuse tumbleweed:

  gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 13:21:45 -03:00
Thomas Richter
87abe344cd perf test: Fix test case 83 ('perf stat CSV output linter') on s390
Perf test case 83: perf stat CSV output linter might fail
on s390.
The reason for this is the output of the command

 ./perf stat -x, -A -a --no-merge true

which depends on a .config file setting. When CONFIG_SCHED_TOPOLOGY
is set, the output of above perf command is

   CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized

When CONFIG_SCHED_TOPOLOGY is *NOT* set the output of above perf
command is

   0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized

Fix the test case to accept both output formats.

Output before:
 # perf test 83
 83: perf stat CSV output linter       : FAILED!
 #

Output after:
 # ./perf test 83
 83: perf stat CSV output linter       : Ok
 #

Fixes: ec906102e5 ("perf test: Fix "perf stat CSV output linter" test on s390")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220720123419.220953-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 13:20:33 -03:00
Jason Wang
2c91cd88f5 perf cs-etm: Fix duplicated 'the' in comment
The double `the' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220716044040.43123-1-wangborong@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:11 -03:00
Jason Wang
c69d33ebfa perf probe: Fix duplicated 'the' in comment
The double `the' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Link: http://lore.kernel.org/lkml/20220716043957.42829-1-wangborong@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:11 -03:00
Arnaldo Carvalho de Melo
63a4354ae7 perf scripting perl: Ignore some warnings to keep building with perl headers
On gcc 12 we started seeing this:

  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:2999,
                   from util/scripting-engines/trace-event-perl.c:35:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_is_utf8_valid_partial_char_flags':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/handy.h:125:23: error: cast from function call of type 'STRLEN' {aka 'long unsigned int'} to non-matching type '_Bool' [-Werror=bad-function-cast]
    125 | #define cBOOL(cbool) ((bool) (cbool))
        |                       ^
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:2363:12: note: in expansion of macro 'cBOOL'
   2363 |     return cBOOL(is_utf8_char_helper_(s0, e, flags));
        |            ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7242:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_cop_file_avn':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:3489:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
   3489 |     const char *file = CopFILE(cop);
        |     ^~~~~
  In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7243:
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h: In function 'Perl_newSV_type':
  /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h:376:5: error: enumeration value 'SVt_LAST' not handled in switch [-Werror=switch-enum]
    376 |     switch (type) {
        |     ^~~~~~

So disable those warnings to keep building with perl devel headers.

Noticed, among other distros, on opensuse tumbleweed:

gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:11 -03:00
Ian Rogers
ee87a0841a perf python: Avoid deprecation warning on distutils
Fix the following DeprecationWarning:

  tools/perf/util/setup.py:31: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives

Note: the setuptools module may need installing, for example:

  $ sudo apt install python-setuptools

Reviewer comments:

James said:

Tested it with python 2.7 and 3.8 by running "make install-python_ext PYTHON=..."

Committer notes:

Tested with:

 $ make -k BUILD_BPF_SKEL=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin ; perf test python

 $ make -k BUILD_BPF_SKEL=1 O=/tmp/build/perf -C tools/perf install-bin ; perf test python

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220615014206.26651-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:11 -03:00
Ian Rogers
557cc18ee7 perf gtk: Only support --gtk if compiled in
If HAVE_GTK2_SUPPORT isn't defined then --gtk can't succeed, don't
support it as a command line option in this case.

v2. Is a rebase. Patch appears to have been missed in:
https://lore.kernel.org/lkml/Ygu40djM1MqAfkcF@kernel.org/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: xaizek <xaizek@posteo.net>
Link: https://lore.kernel.org/r/20220707203836.345918-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:11 -03:00
Adrian Hunter
2f1d6b41e2 perf intel-pt: Add documentation for tracing guest machine user space
Now it is possible to decode a host Intel PT trace including guest machine
user space, add documentation for the steps needed to do it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-36-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:07 -03:00
Adrian Hunter
98759cca84 perf intel-pt: Use guest pid/tid etc in guest samples
When decoding with guest sideband information, for VMX non-root (NR)
i.e. guest events, replace the host (hypervisor) pid/tid with guest values,
and provide also the new machine_pid and vcpu values.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-35-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:04 -03:00
Adrian Hunter
61cd9135d0 perf intel-pt: Add machine_pid and vcpu to auxtrace_error
When decoding with guest sideband information, for VMX non-root (NR)
i.e. guest errors, replace the host (hypervisor) pid/tid with guest values,
and provide also the new machine_pid and vcpu values.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-34-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:09:01 -03:00
Adrian Hunter
71658de4dd perf intel-pt: Determine guest thread from guest sideband
Prior to decoding, determine what guest thread, if any, is running.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-33-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:57 -03:00
Adrian Hunter
7d1f65b504 perf intel-pt: Disable sync switch with guest sideband
The sync_switch facility attempts to better synchronize context switches
with the Intel PT trace, however it is not designed for guest machine
context switches, so disable it when guest sideband is detected.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-32-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:53 -03:00
Adrian Hunter
0bb82cf518 perf intel-pt: Track guest context switches
Use guest context switch events to keep track of which guest thread is
running on a particular guest machine and VCPU.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-31-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:49 -03:00
Adrian Hunter
12374a1622 perf intel-pt: Add some more logging to intel_pt_walk_next_insn()
To aid debugging, add some more logging to intel_pt_walk_next_insn().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-30-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:46 -03:00
Adrian Hunter
7c0b20d13f perf intel-pt: Remove guest_machine_pid
Remove guest_machine_pid because it is not needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-29-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:42 -03:00
Adrian Hunter
f9de2f0fd3 perf tools: Add perf_event__is_guest()
Add a helper function to determine if an event is a guest event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-28-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:42 -03:00
Adrian Hunter
f42bbbf2e9 perf tools: Handle injected guest kernel mmap event
If a kernel mmap event was recorded inside a guest and injected into a host
perf.data file, then it will match a host mmap_name not a guest mmap_name,
see machine__set_mmap_name(). So try matching a host mmap_name in that
case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-27-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
eef8e06eeb perf machine: Use realloc_array_as_needed() in machine__set_current_tid()
Prepare machine__set_current_tid() for use with guest machines that do
not currently have a machine->env->nr_cpus_avail value by making use of
realloc_array_as_needed().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-26-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
97406a7e4f perf inject: Add support for injecting guest sideband events
Inject events from a perf.data file recorded in a virtual machine into
a perf.data file recorded on the host at the same time.

Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are
injected.  Additionally, the guest kcore_dir is copied as kcore_dir__
appended to the machine PID.

This is non-trivial because:
 o It is not possible to process 2 sessions simultaneously so instead
 events are first written to a temporary file.
 o To avoid conflict, guest sample IDs are replaced with new unused sample
 IDs.
 o Guest event's CPU is changed to be the host CPU because it is more
 useful for reporting and analysis.
 o Sample ID is mapped to machine PID which is recorded with VCPU in the
 id index. This is important to allow guest events to be related to the
 guest machine and VCPU.
 o Timestamps must be converted.
 o Events are inserted to obey finished-round ordering.

The anticipated use-case is:
 - start recording sideband events in a guest machine
 - start recording an AUX area trace on the host which can trace also the
 guest (e.g. Intel PT)
 - run test case on the guest
 - stop recording on the host
 - stop recording on the guest
 - copy the guest perf.data file to the host
 - inject the guest perf.data file sideband events into the host perf.data
 file using perf inject
 - the resulting perf.data file can now be used

Subsequent patches provide Intel PT support for this.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
10d3470022 perf tools: Add reallocarray_as_needed()
Add helper reallocarray_as_needed() to reallocate an array to a larger
size and initialize the extra entries to an arbitrary value.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
a5367ecb53 perf tools: Automatically use guest kcore_dir if present
When registering a guest machine using machine_pid from the id index,
check perf.data for a matching kcore_dir subdirectory and set the
kallsyms file name accordingly. If set, use it to find the machine's
kernel symbols and object code (from kcore).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-23-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
65691e9ff0 perf tools: Make has_kcore_dir() work also for guest kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Make has_kcore_dir()
return true also if there is a guest machine kcore_dir.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-22-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:34 -03:00
Adrian Hunter
386e0d83d3 perf tools: Remove also guest kcore_dir with host kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Remove these also when
removing kcore_dir.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:30 -03:00
Adrian Hunter
13a133b255 perf script python: intel-pt-events: Add machine_pid and vcpu
Add machine_pid and vcpu to the intel-pt-events.py script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:25 -03:00
Adrian Hunter
6de306b7a5 perf script python: Add machine_pid and vcpu
Add machine_pid and vcpu to python sample events and context switch events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:21 -03:00
Adrian Hunter
7151c1d178 perf auxtrace: Add machine_pid and vcpu to auxtrace_error
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing
fmt member is used to identify the new format.

The new members make it possible to easily differentiate errors from guest
machines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:17 -03:00
Adrian Hunter
2273e46b98 perf dlfilter: Add machine_pid and vcpu
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be
used to determine if the values are present, however machine_pid is zero if
unused in any case. vcpu should be ignored if machine_pid is zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:13 -03:00
Adrian Hunter
e28fb159f1 perf script: Add machine_pid and vcpu
Add fields machine_pid and vcpu. These are displayed only if machine_pid is
non-zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:13 -03:00
Adrian Hunter
6350490995 perf session: Use sample->machine_pid to find guest machine
If machine_pid is set, use it to find the guest machine.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:08 -03:00
Adrian Hunter
3461b65da7 perf tools: Add machine_pid and vcpu to perf_sample
When parsing a sample with a sample ID, copy machine_pid and vcpu from
perf_sample_id to perf_sample.

Note, machine_pid will be zero when unused, so only a non-zero value
represents a guest machine. vcpu should be ignored if machine_pid is zero.

Note also, machine_pid is used with events that have come from injecting a
guest perf.data file, however guest events recorded on the host (i.e. using
perf kvm) have the (QEMU) hypervisor process pid to identify them - refer
machines__find_for_cpumode().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:04 -03:00
Adrian Hunter
797efbc523 perf tools: Add guest_cpu to hypervisor threads
It is possible to know which guest machine was running at a point in time
based on the PID of the currently running host thread. That is, perf
identifies guest machines by the PID of the hypervisor.

To determine the guest CPU, put it on the hypervisor (QEMU) thread for
that VCPU.

This is done when processing the id_index which provides the necessary
information.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:04 -03:00
Adrian Hunter
ff7a78c210 perf session: Create guest machines from id_index
Now that id_index has machine_pid, use it to create guest machines.
Create the guest machines with an idle thread because guest events
for "swapper" will be possible.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
b47bb18661 perf tools: Add machine_pid and vcpu to id_index
When injecting events from a guest perf.data file, the events will have
separate sample ID numbers. These ID numbers can then be used to determine
which machine an event belongs to. To facilitate that, add machine_pid and
vcpu to id_index records. For backward compatibility, these are added at
the end of the record, and the length of the record is used to determine
if they are present or not.

Note, this is needed because the events from a guest perf.data file contain
the pid/tid of the process running at that time inside the VM not the
pid/tid of the (QEMU) hypervisor thread. So a way is needed to relate
guest events back to the guest machine and VCPU, and using sample ID
numbers for that is relatively simple and convenient.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
c1fd5b7d8a perf buildid-cache: Do not require purge files to also be in the file system
realname() returns NULL if the file is not in the file system, but we can
still remove it from the build ID cache in that case, so continue and
attempt the purge with the name provided.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
15fe03621d perf buildid-cache: Add guestmount'd files to the build ID cache
When the guestmount option is used, a guest machine's file system mount
point is recorded in machine->root_dir.

perf already iterates guest machines when adding files to the build ID
cache, but does not take machine->root_dir into account.

Use machine->root_dir to find files for guest build IDs, and add them to
the build ID cache using the "proper" name i.e. relative to the guest root
directory not the host root directory.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:53 -03:00
Adrian Hunter
57190e38b0 perf script: Add --dump-unsorted-raw-trace option
When reviewing the results of perf inject, it is useful to be able to see
the events in the order they appear in the file.

So add --dump-unsorted-raw-trace option to do an unsorted dump.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
1ee94463e9 perf tools: Add perf_event__synthesize_id_sample()
Add perf_event__synthesize_id_sample() to enable the synthesis of
ID samples.

This is needed by perf inject. When injecting events from a guest perf.data
file, there is a possibility that the sample ID numbers conflict. In that
case, perf_event__synthesize_id_sample() can be used to re-write the ID
sample.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
0a64de04c9 perf tools: Factor out evsel__id_hdr_size()
Factor out evsel__id_hdr_size() so it can be reused.

This is needed by perf inject. When injecting events from a guest perf.data
file, there is a possibility that the sample ID numbers conflict. To
re-write an ID sample, the old one needs to be removed first, which means
determining how big it is with evsel__id_hdr_size() and then subtracting
that from the event size.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
eddc6e3f66 perf tools: Export perf_event__process_finished_round()
Export perf_event__process_finished_round() so it can be used elsewhere.

This is needed in perf inject to obey finished-round ordering.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:37 -03:00
Adrian Hunter
f8bcf1e223 perf ordered_events: Add ordered_events__last_flush_time()
Allow callers to get the ordered_events last flush timestamp.

This is needed in perf inject to obey finished-round ordering when
injecting additional events (e.g. from a guest perf.data file) with
timestamps. Any additional events that have timestamps before the last
flush time must be injected before the corresponding FINISHED_ROUND event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:30 -03:00
Adrian Hunter
163dac34d7 perf tools: Export dsos__for_each_with_build_id()
Export dsos__for_each_with_build_id() so it can be used elsewhere.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:02 -03:00
Adrian Hunter
68566a7cf5 perf tools: Fix dso_id inode generation comparison
Synthesized MMAP events have zero ino_generation, so do not compare
them to DSOs with a real ino_generation otherwise we end up with a DSO
without a build id.

Fixes: 0e3149f86b ("perf dso: Move dso_id from 'struct map' to 'struct dso'")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220711093218.10967-2-adrian.hunter@intel.com
[ Added clarification to the comment from Ian + more detailed explanation from Adrian ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-19 16:19:00 -03:00
Blake Jones
a6bd98c45d perf buildid-list: Add a "-m" option to show kernel and modules build-ids
This new option displays all of the information needed to do external
BuildID-based symbolization of kernel stack traces, such as those collected
by bpf_get_stackid().

For each kernel module plus the main kernel, it displays the BuildID,
the start and end virtual addresses of that module's text range (rounded
out to page boundaries), and the pathname of the module.

When run as a non-privileged user, the actual addresses of the modules'
text ranges are not available, so the tools displays "0, <text length>" for
kernel modules and "0, 0xffffffffffffffff" for the kernel itself.

Sample output:

  root# perf buildid-list -m
  cf6df852fd4da122d616153353cc8f560fd12fe0 ffffffffa5400000 ffffffffa6001e27 [kernel.kallsyms]
  1aa7209aa2acb067d66ed6cf7676d65066384d61 ffffffffc0087000 ffffffffc008b000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/sha512_generic.ko
  3857815b5bf0183697b68f8fe0ea06121644041e ffffffffc008c000 ffffffffc0098000 /lib/modules/5.15.15-1rodete2-amd64/kernel/arch/x86/crypto/sha512-ssse3.ko
  4081fde0bca2bc097cb3e9d1efcb836047d485f1 ffffffffc0099000 ffffffffc009f000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/acpi/button.ko
  1ef81ba4890552ea6b0314f9635fc43fc8cef568 ffffffffc00a4000 ffffffffc00aa000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/cryptd.ko
  cc5c985506cb240d7d082b55ed260cbb851f983e ffffffffc00af000 ffffffffc00b6000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/i2c/busses/i2c-piix4.ko
  [...]

Committer notes:

u64 formatter should be PRIx64 for printing as hex numbers, fix this:

  28     5.28 debian:experimental-x-mips    : FAIL gcc version 11.2.0 (Debian 11.2.0-18)
    builtin-buildid-list.c: In function 'buildid__map_cb':
    builtin-buildid-list.c:32:24: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 |         printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |                    ~~~~^                  ~~~~~~~~~~
          |                        |                     |
          |                        long unsigned int     u64 {aka long long unsigned int}
          |                    %16llx
    builtin-buildid-list.c:32:30: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 |         printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |                          ~~~~^                        ~~~~~~~~
          |                              |                           |
          |                              long unsigned int           u64 {aka long long unsigned int}
          |                          %16llx
    cc1: all warnings being treated as errors

Signed-off-by: Blake Jones <blakejones@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220629213632.3899212-1-blakejones@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18 16:35:34 -03:00
Arnaldo Carvalho de Melo
0698461ad2 Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.

Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:

  49c692b7df ("perf offcpu: Accept allowed sample types only")

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18 10:36:11 -03:00
Naveen N. Rao
4b335e1e0d perf trace: Fix SIGSEGV when processing syscall args
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
  #4  syscall__scnprintf_args <snip> at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace->syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:59:52 -03:00
Adrian Hunter
deb44a6249 perf tests: Fix Convert perf time to TSC test for hybrid
The test does not always correctly determine the number of events for
hybrids, nor allow for more than 1 evsel when parsing.

Fix by iterating the events actually created and getting the correct
evsel for the events processed.

Fixes: d9da6f70eb ("perf tests: Support 'Convert perf time to TSC' test for hybrid")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:57:07 -03:00
Adrian Hunter
498c7a54f1 perf tests: Stop Convert perf time to TSC test opening events twice
Do not call evlist__open() twice.

Fixes: 5bb017d4b9 ("perf test: Fix error message for test case 71 on s390, where it is not supported")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:56:58 -03:00