Commit Graph

6296 Commits

Author SHA1 Message Date
Andi Kleen
f948339290 perf stat: Add support for metrics in interval mode
Now that we can modify the metrics printout functions easily, it's
straight forward to support metric printing for interval mode.  All that
is needed is to print the time stamp on every new line.  Pass the prefix
into the context and print it out.

v2: Move wrong hunk to here.

Committer note:

Before:

  [root@jouet ~]# perf stat -I 1000 -e instructions,cycles sleep 1
  #           time    counts unit events
       1.000168216   538,913      instructions
       1.000168216   748,765      cycles
       1.000660048   153,741      instructions
       1.000660048   214,066      cycles

After:

  # perf stat -I 1000 -e instructions,cycles sleep 1
  #           time    counts unit events
       1.000215928   519,620      instructions              #    0.69  insn per cycle
       1.000215928   752,003      cycles
       1.000946033   148,502      instructions              #    0.33  insn per cycle
       1.000946033   160,104      cycles

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:13:01 -03:00
Andi Kleen
140aeadc1f perf stat: Abstract stat metrics printing
Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.

This will allow adding metrics to CSV mode and also using them for other
purposes.

The computation of padding is now done in the central callback, instead
of every metric doing it manually.  This makes it easier to add new
metrics.

v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet.  Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:13:00 -03:00
Jiri Olsa
720e98b5fa perf tools: Add perf data cache feature
Storing CPU cache details under perf data. It's stored as new
HEADER_CACHE feature and it's displayed under header info with -I
option:

  $ perf report --header-only -I
  ...
  # CPU cache info:
  #  L1 Data                 32K [0-1]
  #  L1 Instruction          32K [0-1]
  #  L1 Data                 32K [2-3]
  #  L1 Instruction          32K [2-3]
  #  L2 Unified             256K [0-1]
  #  L2 Unified             256K [2-3]
  #  L3 Unified            4096K [0-3]
  ...

All distinct caches are stored/displayed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160216150143.GA7119@krava.brq.redhat.com
[ Fixed leak on process_caches(), s/cache_level/cpu_cache_level/g ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:13:00 -03:00
Jiri Olsa
dd629cc097 perf tools: Initialize libapi debug output
Setting libapi debug output functions to use perf functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:59 -03:00
Arnaldo Carvalho de Melo
bedbdd4297 perf debug: Rename __eprintf(va_list args) to veprintf
Adhering to the naming convention used when va_args is in a printf like
function, e.g. stdio.h.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b5l3wt77ct28dcnriguxtvn6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:58 -03:00
Jiri Olsa
607bfbd7ff tools lib api fs: Adopt filename__read_str from perf
We already moved similar functions in here, also it'll be useful for
sysfs__read_str addition in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:56 -03:00
Stephane Eranian
d646ae0a73 perf jvmti: Add check for java alternatives cmd in Makefile
This patch modifies the jvmti makefile to check if the
/usr/sbin/java-update-alternatives utility is present.  If so, then use
it, if not then use the altenatives command.

This helps handle the difference between Ubuntu and Fedora Linux
distributions.

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455604661-9357-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-16 17:12:46 -03:00
Arnaldo Carvalho de Melo
1ad826bad5 perf tests: Fix build on older systems where 'signal' is reserved
fixing the following problems, for instance, on RHEL6.7:

    CC       /tmp/build/perf/tests/bp_signal.o
  cc1: warnings being treated as errors
  tests/bp_signal.c: In function ‘__event’:
  tests/bp_signal.c:106: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  tests/bp_signal.c: In function ‘bp_event’:
  tests/bp_signal.c:144: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  tests/bp_signal.c: In function ‘wp_event’:
  tests/bp_signal.c:149: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  mv: cannot stat `/tmp/build/perf/tests/.bp_signal.o.tmp': No such file or directory
  make[3]: *** [/tmp/build/perf/tests/bp_signal.o] Error 1
  make[2]: *** [tests] Error 2
  make[1]: *** [/tmp/build/perf/perf-in.o] Error 2
  make[1]: *** Waiting for unfinished jobs....

Reported-by: Vinson Lee <vlee@freedesktop.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Fixes: 8fd34e1cce ("perf test: Improve bp_signal")
Link: http://lkml.kernel.org/n/tip-wlpx6tik1b0jirlkw64bv400@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-15 17:33:26 -03:00
Wang Nan
5141d7350d perf data: Fix releasing event_class
A new patch in libbabeltrace [1] reveals a object leak problem in
'perf data' CTF support: perf code never releases the event_class
which is allocated in add_event() and stored in evsel's private field.

If libbabeltrace has the above patch applied, leaking event_class
prevents the writer from being destroyed and flushing metadata. For
example:

  $ perf record ls
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
  $ perf data convert --to-ctf ./out.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
  $ cat ./out.ctf/metadata
  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata

The correct result should be:
  ...
  $ cat ./out.ctf/metadata
  /* CTF 1.8 */

  trace {
  [SNIP]

  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata

The full story is:

Patch [1] of babeltrace redesigns its reference counting scheme. In that
patch:

 * writer <- trace (bt_ctf_writer_create)
 * trace <- stream_class (bt_ctf_trace_add_stream_class)
 * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
 ('<-' means 'is a parent of')

Holding of event_class causes reference count of corresponding 'writer'
to increase through parent chain. Perf expects that 'writer' is released
(so metadata is flushed) through bt_ctf_writer_put() in
ctf_writer__cleanup(). However, since it never releases event_class, the
reference of 'writer' won't be dropped, so bt_ctf_writer_put() won't
lead to the release of writer.

Before this CTF patch, !(writer <- trace). Even with event_class leaking,
the writer ends up being released.

[1] e6a8e8e474

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:27:48 -03:00
Arnaldo Carvalho de Melo
2146afc6b4 perf tools: Rename parse_events__free_terms() to parse_events_terms__delete()
To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:09:17 -03:00
Wang Nan
d20a5f2b27 perf tools: Free the terms list_head in parse_events__free_terms()
Fixing a leak, since code calling parse_events__free_terms() expect it
to free the list_head too.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 17:01:17 -03:00
Arnaldo Carvalho de Melo
682dc24c2a perf tools: Use perf_event_terms__purge() for non-malloced terms
In these two cases, a 'perf test' entry and in the PMU code the
list_head is on the stack, so we can't use perf_event__free_terms()
(soon to be renamed to perf_event_terms__delete()), because it will
free the list_head as well.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-i956ryjhz97gnnqe8iqe7m7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:53:22 -03:00
Arnaldo Carvalho de Melo
fc0a2c1d59 perf tools: Introduce parse_events_terms__purge()
Purges 'struct parse_event_term' entries from a list_head.

Some users need this because they don't allocate space for the list
head, it maybe on the stack or embedded into some other struct.

Next patch will convert users that need just purging and then the
perf_events__free_terms() routine will free the list head as well,
finally being renamed to perf_events_terms__delete().

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-4w3zl4ifcl0ed0j4bu3tckqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:53:19 -03:00
Wang Nan
a8adfceb38 perf tools: Unlink entries from terms list
We were just freeing them, better unlink and init its nodes to catch
bugs faster if we keep dangling references to them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 16:51:15 -03:00
Arnaldo Carvalho de Melo
89fee70943 perf hists: Do column alignment on the format iterator
We were doing column alignment in the format function for each cell,
returning a string padded with spaces so that when the next column is
printed the cursor is at its column alignment.

This ends up needlessly printing trailing spaces, do it at the format
iterator, that is where we know if it is needed, i.e. if there is more
columns to be printed.

This eliminates the need for triming lines when doing a dump using 'P'
in the TUI browser and also produces far saner results with things like
piping 'perf report' to 'less'.

Right now only the formatters for sym->name and the 'locked' column
(perf mem report), that are the ones that end up at the end of lines
in the default 'perf report', 'perf top' and 'perf mem report' tools,
the others will be done in a subsequent patch.

In the end the 'width' parameter for the formatters now mean, in
'printf' terms, the 'precision', where before it was the field 'width'.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 12:52:25 -03:00
Arnaldo Carvalho de Melo
37d9bb580a perf tools: Add comment explaining the repsep_snprintf function
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 12:52:20 -03:00
Taeung Song
b416e204f8 perf python scripting: Append examples to err msg about audit-libs-python
To print syscall names, the audit-libs-python package is required.. If
not installed, it prints this error string:

    # perf script syscall-counts
    Install the audit-libs-python package to get syscall names.

But the package name is different in Ubuntu, mention that in the error
message, similar to a error message of util/trace-event-scripting.c:

    # perf script syscall-counts
    Install the audit-libs-python package to get syscall names.
    For example:
      # apt-get install python-audit (Ubuntu)
      # yum install audit-libs-python (Fedora)
      etc.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455018790-13425-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 11:30:27 -03:00
Zubair Lutfullah Kakakhel
37b4e2020a perf build: Add EXTRA_LDFLAGS option to makefile
To compile for little-endian systems, you need to pass -EL to CC and LD.

EXTRA_CFLAGS works to pass -EL to CC.
Add EXTRA_LDFLAGS to pass -EL to LD.

Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455024818-15842-1-git-send-email-Zubair.Kakakhel@imgtec.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 11:30:20 -03:00
Wang Nan
e7ee404757 perf symbols: Fix symbols searching for module in buildid-cache
Before this patch, if a sample is triggered inside a module not in
/lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
report' will still be unable to find the correct symbol.  For example:

  # rm -rf ~/.debug/
  # perf buildid-cache -a ./mymodule.ko
  # perf probe -m ./mymodule.ko -a get_mymodule_val
  Added new event:
    probe:get_mymodule_val (on get_mymodule_val in mymodule)

  You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

  # perf record -e probe:get_mymodule_val cat /proc/mymodule
  mymodule:3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

  # perf report --stdio
  [SNIP]
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

  # perf report -vvvv --stdio
  dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
  symbol__new: get_mymodule_val 0x70-0x8a
  [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when its file is found in some well know directories. All
files loaded from buildid-cache are treated as user programs. Following
dso__load_sym() set map->pgoff incorrectly.

This patch gives kernel modules in buildid-cache a chance to adjust
value of kmod. After dso__load() get the type of symbols, if it is
buildid, check the last 3 chars of original filename against '.ko', and
adjust the value of kmod if the file is a kernel module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 10:54:47 -03:00
Taeung Song
c7ac24178c perf config: Add '--system' and '--user' options to select which config file is used
The '--system' option means $(sysconfdir)/perfconfig and '--user' means
$HOME/.perfconfig. If none is used, both system and user config file are
read.  E.g.:

    # perf config [<file-option>] [options]

    With an specific config file:

    # perf config --user | --system

    or both user and system config file:

    # perf config

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455126685-32367-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-12 10:54:46 -03:00
Stephane Eranian
598b7c6919 perf jit: add source line info support
This patch adds source line information support to perf for jitted code.

The source line info must be emitted by the runtime, such as JVMTI.

Perf injects extract the source line info from the jitdump file and adds
the corresponding .debug_lines section in the ELF image generated for
each jitted function.

The source line enables matching any address in the profile with a
source file and line number.

The improvement is visible in perf annotate with the source code
displayed alongside the assembly code.

The dwarf code leverages the support from OProfile which is also
released under GPLv2.  Copyright 2007 OProfile authors.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 12:33:09 -03:00
Stephane Eranian
209045adc2 perf tools: add JVMTI agent library
This is a standalone JVMTI library to help  profile Java jitted code with perf
record/perf report. The library is not installed or compiled automatically by
perf Makefile. It is not used directly by perf. It is arch agnostic and has
been tested on X86 and ARM. It needs to be used with a Java runtime, such as
OpenJDK, as follows:

  $ java -agentpath:libjvmti.so .......

See the "Committer Notes" below on how to build it.

When used this way, java will generate a jitdump binary file in
$HOME/.debug/java/jit/java-jit-*

This binary dump file contains information to help symbolize and
annotate jitted code.

The jitdump information must be injected into the perf.data file
using:

  $ perf inject --jit -i perf.data -o perf.data.jitted

This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.

Then, to visualize the function or asm profile, simply use the regular
perf commands:

  $ perf report -i perf.data.jitted

or

  $ perf annotate -i perf.data.jitted

JVMTI agent code adapted from the OProfile's opagent code.

This version of the JVMTI agent is using the CLOCK_MONOTONIC as the time
source to timestamp jit samples. To correlate with perf_events samples,
it needs to run on kernel 4.0.0-rc5+ or later with the following commit
from Peter Zijlstra:

  34f439278c ("perf: Add per event clockid support")

With this patch recording jitted code is done as follows:

   $ perf record -k mono -- java -agentpath:libjvmti.so .......

 --------------------------------------------------------------------------

Committer Notes:

Extended testing instructions:

  $ cd tools/perf/jvmti/
  $ dnf install java-devel
  $ make

Then, create some simple java stuff to record some samples:

  $ cat hello.java
  public class hello {
	public static void main(String[] args) {
                 System.out.println("Hello, World");
       	}
  }
  $ javac hello.java
  $ java hello
  Hello, World
  $

And then record it using this jvmti thing:

  $ perf record -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hello
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jit-1908.dump
  Hello, World
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (268 samples) ]
  $

Now lets insert the PERF_RECORD_MMAP2 records to point jitted mmaps to
files created by the agent:

  $ perf inject --jit -i perf.data -o perf.data.jitted

And finally see that it did its job:

  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | tail -5
  79197149129422 0xfe10 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428bd60(0x80) @ 0x40 fd:02 1840554 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-283.so
  79197149235701 0xfeb0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428ba60(0x180) @ 0x40 fd:02 1840555 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-284.so
  79197149250558 0xff50 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b860(0x180) @ 0x40 fd:02 1840556 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-285.so
  79197149714746 0xfff0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b660(0x180) @ 0x40 fd:02 1840557 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-286.so
  79197149806558 0x10090 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b460(0x180) @ 0x40 fd:02 1840558 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-287.so
  $

So:

  $ perf report -D -i perf.data | grep PERF_RECORD_MMAP2 | wc -l
  Failed to open /tmp/perf-1908.map, continuing without symbols
  21
  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | wc -l
  307
  $ echo $((307 - 21))
  286
  $

286 extra PERF_RECORD_MMAP2 records.

All for thise tiny, with just one function, ELF files:

  $ file /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so
  /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), corrupted program header size, BuildID[sha1]=ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f, not stripped
  $ readelf -sw /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so

  Symbol table '.symtab' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000040     9 FUNC    LOCAL  DEFAULT    1 atomic_cmpxchg_long
  $

Inserted into the build-id cache:

  $ ls -la ~/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f
  lrwxrwxrwx. 1 acme acme 111 Feb  5 11:30 /home/acme/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f -> ../../home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so/ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f

Note: check why 'file' reports that 'corrupted program header size'.

With a stupid java hog to do some profiling:

$ cat hog.java
  public class hog {
	private static double do_something_else(int i) {
		double total = 0;
		while (i > 0) {
			total += Math.log(i--);
		}
		return total;
	}
	private static double do_something(int i) {
		double total = 0;
		while (i > 0) {
			total += Math.sqrt(i--) + do_something_else(i / 100);
		}
		return total;
	}
	public static void main(String[] args) {
		System.out.println(String.format("%s=%f & %f", args[0],
				   do_something(Integer.parseInt(args[0])),
				   do_something_else(Integer.parseInt(args[1]))));
	}
  }
  $ javac hog.java
  $ perf record -F 10000 -g -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hog 100000 2345000
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XX4sqd14/jit-8670.dump
  100000=291561592.669602 & 32050989.778714
  [ perf record: Woken up 6 times to write data ]
  [ perf record: Captured and wrote 1.536 MB perf.data (12538 samples) ]
  $ perf inject --jit -i perf.data -o perf.data.jitted

Looking at the 'perf report' TUI, at one expanded callchain leading
to the jitted code:

  $ perf report --no-children -i perf.data.jitted

Samples: 12K of event 'cycles:pp', Event count (approx.): 3829569932
  Overhead  Comm  Shared Object       Symbol
-   93.38%  java  jitted-8670-291.so  [.] class hog.do_something_else(int)
     class hog.do_something_else(int)
   - Interpreter
      - 75.86% call_stub
           JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread
      - 17.52% JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-4-git-send-email-eranian@google.com
[ Made it build on fedora23, added some build/usage instructions ]
[ Check if filename != NULL in compiled_method_load_cb, fixing segfault ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 12:26:31 -03:00
Stephane Eranian
9b07e27f88 perf inject: Add jitdump mmap injection support
This patch adds a --jit/-j option to perf inject.

This options injects MMAP records into the perf.data file to cover the
jitted code mmaps. It also emits ELF images for each function in the
jidump file.  Those images are created where the jitdump file is.  The
MMAP records point to that location as well.

Typical flow:

  $ perf record -k mono -- java -agentpath:libpjvmti.so java_class
  $ perf inject --jit -i perf.data -o perf.data.jitted
  $ perf report -i perf.data.jitted

Note that jitdump.h support is not limited to Java, it works with any
jitted environment modified to emit the jitdump file format, include
those where code can be jitted multiple times and moved around.

The jitdump.h format is adapted from the Oprofile project.

The genelf.c (ELF binary generation) depends on MD5 hash encoding for
the buildid. To enable this, libssl-dev must be installed. If not, then
genelf.c defaults to using urandom to generate the buildid, which is not
ideal.  The Makefile auto-detects the presence on libssl-dev.

This version mmaps the jitdump file to create a marker MMAP record in
the perf.data file. The marker is used to detect jitdump and cause perf
inject to inject the jitted mmaps and generate ELF images for jitted
functions.

In V8, the following fixes and changes were made among other things:

  -  the jidump header format include a new flags field to be used
     to carry information about the configuration of the runtime agent.
     Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file
    at which the code resides.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic
    objects to match the file offset.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - JIT MMAP injection does not obey finished_round semantics. JIT MMAP injection injects all
    MMAP events in one go, so it does not obey finished_round semantics, so drop the
    finished_round events from the output perf.data file.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 09:46:45 -03:00
Arnaldo Carvalho de Melo
921f3fadbc perf inject: Make sure mmap records are ordered when injecting build_ids
To make sure the mmap records are ordered correctly and so that the
correct especially due to jitted code mmaps.

We cannot generate the buildid hit list and inject the jit mmaps (will
come right after this patch) in at the same time for now.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 09:46:45 -03:00
Stephane Eranian
8ee4646038 perf build: Add libcrypto feature detection
Will be used to generate build-ids in the jitdump code.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ tools/perf/Makefile.perf comment about NO_LIBCRYPTO and added it to tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 09:46:45 -03:00
Stephane Eranian
e9c4bcdd34 perf symbols: add Java demangling support
Add Java function descriptor demangling support.  Something bfd cannot
do.

Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of
functions.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 09:46:45 -03:00
Marcin Ślusarz
89fee59b50 perf tools: handle spaces in file names obtained from /proc/pid/maps
Steam frequently puts game binaries in folders with spaces.

Note: "(deleted)" markers are now treated as part of the file name.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 6064803313 ("perf tools: Use sscanf for parsing /proc/pid/maps")
Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-05 09:39:56 -03:00
Arnaldo Carvalho de Melo
be9e499111 perf build tests: Do parallell builds with 'build-test'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jhmnf9g7y9ryqcjql00unk5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 15:57:00 -03:00
Jiri Olsa
3e2751d916 perf tools: Fix parallel build including 'clean' target
Do not parallelize 'clean' with other targets, figure out if it is
present and do it first, then the other targets.

Noticed with:

  tools/perf> make -j24 clean all

   LD       arch/libperf-in.o
   LD       plugin_xen-in.o
 arch//libperf-in.o: file not recognized: File truncated
 make[3]: *** [arch/libperf-in.o] Error 1
 make[2]: *** [arch] Error 2
 make[2]: *** Waiting for unfinished jobs....
   AR       libapi.a

Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kb0qs29zbz7hxn32mc5zbsoz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 15:53:16 -03:00
Taeung Song
a9edec3ce2 perf config: Document 'record.build-id' variable in man page
Explain 'record.build-id' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-9-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:49:16 -03:00
Taeung Song
57f0dafe6a perf config: Document 'kmem.default' variable in man page
Explain 'kmem.default' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-8-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:47:39 -03:00
Taeung Song
ab2e08e8ba perf config: Document 'pager.<subcommand>' variables in man page
Explain 'pager.<subcommand>' variables.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-7-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:47:25 -03:00
Taeung Song
08b75b409e perf config: Document 'man.viewer' variable in man page
Explain 'man.viewer' variable and how to add new man viewer tools.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-6-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:46:54 -03:00
Taeung Song
0b04c84087 perf config: Document 'top.children' variable in man page
Explain 'top.children' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-5-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:46:12 -03:00
Taeung Song
806cb95bb6 perf config: Document variables for 'report' section in man page
Explain 'report' section's variables:

  'percent-limit', 'queue-size' and 'children'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-4-git-send-email-treeze.taeung@gmail.com
[ Fix some grammar issues, add some more info ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:44:25 -03:00
Taeung Song
56c94dc56f perf config: Document variables for 'call-graph' section in man page
Explain 'call-graph' section and its variables:

  'record-mode', 'dump-size', 'print-type', 'order', 'sort-key',
  'threshold' and 'print-limit'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:37:32 -03:00
Taeung Song
67f43c0097 perf config: Document 'ui.show-headers' variable in man page
This option controls display of column headers (like 'Overhead' and
'Symbol') in 'report' and 'top'. If this option is false, they are
hidden.  This option is only applied to TUI.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:36:13 -03:00
Arnaldo Carvalho de Melo
3c7a152b0d perf build tests: Move the feature related vars to the front of the make cmdline
So that we do less visual searching on the 'make build-test' output to
see the feature related variables:

After:

  $ make -C tools/perf build-test
  <SNIP>
           make_no_newt_O: cd . && make NO_NEWT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.dz55IX DESTDIR=/tmp/tmp.X29xxo
              make_tags_O: cd . && make tags FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.6ecLh8 DESTDIR=/tmp/tmp.6vIla578Ho
  make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.SVPM2G DESTDIR=/tmp/tmp.C0oAam

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dx4krgzqa566v1pedrbrcchi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:27:51 -03:00
Arnaldo Carvalho de Melo
5978531b29 perf build tests: Elide "-f Makefile" from make invokation
Since this is the name that 'make' will look for if no explicit -f file
is passed.

This in turn makes the output of 'build-test' more compact:

Before:

  $ perf stat make -C tools/perf build-test
  <SNIP>
  cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
          make_no_libaudit_O: cd . && make -f Makefile  O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
  <SNIP>

After:

  $ perf stat make -C tools/perf build-test
  <SNIP>
          make_no_libaudit_O: cd . && make  O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
  <SNIP>

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m440lb8dkfsywsyah0htif6t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-04 11:27:50 -03:00
Ingo Molnar
d3aaf09f88 perf/core improvements and fixes:
New features:
 
 - Add 'L' hotkey to dynamicly set the percent threshold for histogram
   entries and callchains, i.e. dynamicly do what the --percent-limit
   command line option to 'top' and 'report' does. (Namhyung Kim)
 
 Infrastructure:
 
 - Per hists field and sort lists, that will be used, for instance,
   in the c2c tool (Jiri Olsa)
 
 Documentation:
 
 - Update documentation of --sort and --perf-limit options
   for 'perf report' (Namhyung Kim)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWsiK5AAoJENZQFvNTUqpADJoQAIVBlW29iw+qshg0bWM+1xw/
 RBKgTCLT2VPK9wE00qIQOmZjf4ZRHLWDBVDf8iB36UMhY4q4vUaWYRJu8dZ/IoPx
 AQPg5JFe8INd2voDHo6E8vCdS58Mk7NWRdWs9dctVHGa7LjdDdLhYNLcghIps8Pb
 4cSLltmaa/XGxBvu526g6TRSaoaCIVO1W0wxNvoc3XFipI7zqtgCasLr5Hb6RxdD
 BUC7WYuJ5n96EqR0vjeV7ajv9m65+RaLFIIa6vjUIEdAB+3sDIpfiMpkIprAzoze
 TcsGpSz5+5yAbnezgIoHaRvFWzd/hune62sZmCySd1E3+5Vb8JbUo4WxNTAlTwdS
 XbwY1VyLYDnKFVqQSAElFbeXad8kbkd06tzLgUOTfoeHWKMp5qxfH/myLNjF42pK
 d5mQ5TnN/ESsj3D/3TcgBzmIOGIS5R2evKsIIwip4rSGAP901U+RRBMDE87vn0M2
 TsqrkKPZgQcYTA9FKlWIPCdT0Y+yYr0jvgRabN9rQMCp5r7Wb7hrpwHGMPp16hbT
 1mnMm8N+moIMv1Wp0R5C75bls2uFaOiS3Y7XyORjqfwHMfzLUqPWPSmDwtSrbE8a
 FTDGInJubTJRXdjTE/0+d+zZbpiEqTybfjOce3AcB2KBMfSJaQRvLcvAei21GMWC
 hrBhLY6ngbgd7jD1RTHg
 =foQI
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

 - Add 'L' hotkey to dynamicly set the percent threshold for histogram
   entries and callchains, i.e. dynamicly do what the --percent-limit
   command line option to 'top' and 'report' does. (Namhyung Kim)

Infrastructure changes:

 - Per hists field and sort lists, that will be used, for instance,
   in the c2c tool (Jiri Olsa)

Documentation changes:

 - Update documentation of --sort and --perf-limit options
   for 'perf report' (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-04 08:58:01 +01:00
Ingo Molnar
b83ea91f08 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-04 08:57:44 +01:00
Ingo Molnar
580df49eed perf/urgent fix:
- Fix 'perf stat' interval output values (Jiri Olsa)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWsoP/AAoJENZQFvNTUqpAuDsQAIj/MjfVV5l44QhWmbuBXbaL
 jIhoXqRT9ZK7LFWS+zJdTKxTR03BkfSkfp0TMsXJ9WPAT8xmoUe+0sNR9PWQm/vh
 zjaqTAaFbyhYgYJBDmTceEMNoRiPzdb622l0phWuoGshG+3LKa0sWDR1W2ETUv+u
 ihITgtyQpYSa9DH/wwzhwiSUaHux7nNQgM7ZwLb1WWo8U+/pRFGxn+mAZ/RGwV1F
 hmmxonqrcI/BGU6HtpAQL7nM9DFmfysx++32FSlYg3nwUKPXqYN6rc3z5eMxzVCg
 2q3+5jHPx3CtfE/VUntyjGOOSSTmio1M08EllDhV0D/muz30piqdF8YigVAHQ7m2
 FtJ4NL9vjiQ3VZ28CmxAOT8Af2bTBysCD9WHRHvGhSOpW9wNS/pzTh5Z13ImxMeI
 L3wHPaypJZY+xY+c61jYUj4gJ2bHSSLMUj7LK1nhV6XQ8cQW6Fc+9X7gM0KRwWXL
 Oa1miTkLkI/x0KlQLQvPa1cwwaWdZKIjYjpp10jsrSHlDnPJNGVdmSWNoTRscNz/
 Pn9+cabxWllJMhXlcmzb5yzDmFFMpxR8UMVajH0eaMBIJdYhBq9LVJyFMdL9U4mT
 2XOEX1I48Ix+k0M8kqwZPgCC7X4h/N0frCsbGKeuv9nacRvGt8F75BjH42fSVZQQ
 ihfcFcK6LSNpUGeHlgkS
 =yHrl
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

  - Fix 'perf stat' interval output values (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-04 08:56:54 +01:00
Ingo Molnar
9a969403c3 perf/urgent fixes:
- tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
   with a very specific combination: Machine with Intel PT (e.g. Broadwell),
   kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
   tracefs (for instance !root users), making the fallback from
   perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
   reading its info from tracefs, fix it. (Adrian Hunter)
 
 - Fix segfault in intel pt, by making it follow the 'struct thread' lifetime cycle
   checking expectations, noticed for instance, when processing perf.data files with
   Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)
 
 - Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
   fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWsjJzAAoJENZQFvNTUqpAF2EP/2UJGz/mi7O3wu90YO7BnglD
 WizS/Y4fTLDfoz9+hwiUscHMvBOpAYMRbGX73AyHCP5qPsv1fRyW5jPCyCqpvDWB
 TU86CX0t9CEXAj3mKTwShqIXiY9Hmf0lOwxAxY+Y5I12utirqHzZreilBhNvHStz
 ESYXpTKsDNdU08Zu7nLmKqIlFLnRvY+7sL55+rgcw7DWaIcivpAF8b8RX7iJyoJk
 fL7dkebXDtZvQpBZ4A8TniACjqebfpg1BSiZ7c9NDIs7YMB+2VPDzXrySP2Oq3q6
 u8rZtwn8/0idZ5Es2LWU68QXJL0Z6q7p74BZ+/IO1jSTviegu8CQTfIHRfyx+ur4
 IZroUuEPDz9tFw7q8tUt/D48Qbh7rOIFBYUHtbq9e0g1WfW2g0NzP/EseNQxkica
 uZdfn98cHZyeGiNLhRAjqTwGmZTlV7EoNh6282i7PwyJ9J5nOs36f3Tuo1bekVp+
 qtugNbE2xebwwCiBSAHbsQcIrKnyL+bcgSrDzKAP5kBz9r58TQad+CUNQ/IHyhdr
 q66RYEy3cdmotcPKtK5jxNMYoSoJzlGEpX3FKXZNHkpRIkNp1vZA6MlSIiHOo/A8
 eUg6O55XBRJLYdZ/Q2Vb1t1X83g2789o4tgW0tOUtwBwCW7AIQq7w2m08aUCbQf5
 2/HaTdhyxMx6ra34dzS1
 =0cG5
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
   with a very specific combination: Machine with Intel PT (e.g. Broadwell),
   kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
   tracefs (for instance !root users), making the fallback from
   perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
   reading its info from tracefs, fix it. (Adrian Hunter)

 - Fix segfault in intel PT, by making it follow the 'struct thread' lifetime cycle
   checking expectations, noticed for instance, when processing perf.data files with
   Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)

 - Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
   fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-04 08:55:00 +01:00
Jiri Olsa
51fd2df1e8 perf stat: Fix interval output values
We broke interval data displays with commit:

  3f416f22d1 ("perf stat: Do not clean event's private stats")

This commit removed stats cleaning, which is important for '-r' option
to carry counters data over the whole run. But it's necessary to clean
it for interval mode, otherwise the displayed value is avg of all
previous values.

Before:
  $ perf stat -e cycles -a -I 1000 record
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791         91,519,906      cycles

Now:
  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

Notice the second value being bigger (91,.. < 107,..).

This could be easily verified by using perf script which displays raw
stat data:

  $ perf script
  CPU  THREAD       VAL         ENA         RUN        TIME EVENT
    0      -1  23855779  1000209530  1000209530  1000240796 cycles
    1      -1  33340397  1000224964  1000224964  1000240796 cycles
    2      -1  15835415  1000226695  1000226695  1000240796 cycles
    3      -1   2184696  1000228245  1000228245  1000240796 cycles
    0      -1  97014312  2000514533  2000514533  2000512791 cycles
    1      -1  46121497  2000543795  2000543795  2000512791 cycles
    2      -1  32269530  2000543566  2000543566  2000512791 cycles
    3      -1   7634472  2000544108  2000544108  2000512791 cycles

The sum of the first 4 values is the first interval aggregated value:

  23855779 + 33340397 + 15835415 + 2184696 = 75,216,287

The sum of the second 4 values minus first value is the second interval
aggregated value:

  97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454485436-20639-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 19:39:52 -03:00
Namhyung Kim
b62e8dfcda perf hists browser: Add 'L' hotkey to change percent limit
Add 'L' key action to change the percent limit applied to both of hist
entries and callchains.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:22 -03:00
Namhyung Kim
1ba2fc6de4 perf report: Update documention of --percent-limit option
The --percent-limit option was changed to be applied to callchains as
well as to hist entries recently, but it missed to update the doc.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:21 -03:00
Namhyung Kim
c6f5f6b662 perf report: Update documentation of --sort option
The description of the memory sort key (used by --mem-mode) was
misplaced.  Move it under the --sort option so that it can be referenced
properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:21 -03:00
Jiri Olsa
aa6f50af82 perf hists: Introduce hists__for_each_sort_list macro
With the hist object having the perf_hpp_list we can now iterate sort
format entries based in the hists object. Adding
hists__for_each_sort_list macro to do that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-27-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:20 -03:00
Jiri Olsa
f0786af536 perf hists: Introduce hists__for_each_format macro
With the hist object having the perf_hpp_list we can now iterate output
format entries based in the hists object. Adding hists__for_each_format
macro to do that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-26-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:19 -03:00
Jiri Olsa
5b65855e20 perf tools: Add hpp_list into struct hists object
Adding hpp_list into struct hists object.

Initializing struct hists_evsel hists object to carry global
perf_hpp_list list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-25-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:18 -03:00
Jiri Olsa
43e0a68f13 perf hists: Add struct perf_hpp_list argument to helper functions
Adding struct perf_hpp_list argument to following helper functions:

  void perf_hpp__setup_output_field(struct perf_hpp_list *list);
  void perf_hpp__reset_output_field(struct perf_hpp_list *list);
  void perf_hpp__append_sort_keys(struct perf_hpp_list *list);

so they could be used on hists's hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-24-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:17 -03:00
Jiri Olsa
1a8ebd243a perf hists: Introduce perf_hpp_list__for_each_sort_list_safe macro
Introducing perf_hpp_list__for_each_sort_list_safe macro
to iterate perf_hpp_list object's sort entries safely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-23-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:16 -03:00
Jiri Olsa
d29a497090 perf hists: Introduce perf_hpp_list__for_each_sort_list macro
Introducing perf_hpp_list__for_each_sort_list macro to iterate
perf_hpp_list object's sort entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-22-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:15 -03:00
Jiri Olsa
7a1799e0a2 perf hists: Introduce perf_hpp_list__for_each_format_safe macro
Introducing perf_hpp_list__for_each_format_safe macro to iterate
perf_hpp_list object's output entries safely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-21-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:14 -03:00
Jiri Olsa
cf094045d7 perf hists: Introduce perf_hpp_list__for_each_format macro
Introducing perf_hpp_list__for_each_format macro to iterate
perf_hpp_list object's output entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-20-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:14 -03:00
Jiri Olsa
07600027fb perf hists: Pass perf_hpp_list all the way through setup_output_list
Passing perf_hpp_list all the way through setup_output_list so the
output entry could be added on the arbitrary list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-19-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:13 -03:00
Jiri Olsa
ebdd98e030 perf hists: Add perf_hpp_list register helpers
Adding 2 perf_hpp_list register helpers:

  perf_hpp_list__column_register()
  perf_hpp_list__register_sort_field()

to be called within existing helpers:

  perf_hpp__column_register()
  perf_hpp__register_sort_field()

to register format entries within global perf_hpp_list object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-17-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:12 -03:00
Jiri Olsa
94b3dc3865 perf hists: Introduce perf_hpp_list__init function
Introducing perf_hpp_list__init function to have an easy way to
initialize perf_hpp_list struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-16-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:11 -03:00
Jiri Olsa
7c31e10266 perf hists: Introduce struct perf_hpp_list
Gather output and sort lists under struct perf_hpp_list, so we could
have multiple instancies of sort/output format entries.

Replacing current perf_hpp__list and perf_hpp__sort_list lists with
single perf_hpp_list instance.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-15-git-send-email-jolsa@kernel.org
[ Renamed fields to .{fields,sorts} as suggested by Namhyung and acked by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:10 -03:00
Jiri Olsa
6d3375efeb perf hists: Separate output fields parsing into setup_output_list function
Separating output fields parsing into setup_output_list function, so
it's separated from field_order string setup and could be reused later
in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:09 -03:00
Jiri Olsa
2fbaa39079 perf hists: Separate sort fields parsing into setup_sort_list function
Separating sort fields parsing into setup_sort_list function, so it's
separated from sort_order string setup and could be reused later in
following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:09 -03:00
Jiri Olsa
564132f311 perf hists: Properly release format fields
With multiple list holding format entries, we need the support properly
releasing format output/sort fields.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:08 -03:00
Jiri Olsa
12cb4397fb perf hists: Remove perf_hpp__column_(disable|enable)
Those functions are no longer needed. They operate over perf_hpp__format
array which is now used only as template for dynamic entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:08 -03:00
Jiri Olsa
1945c3e734 perf hists: Allocate output sort field
Currently we use static output fields, because we have single global
list of all sort/output fields.

We will add hists specific sort and output lists in following patches,
so we need all format entries to be dynamically allocated. Adding
support to allocate output sort field.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:07 -03:00
Arnaldo Carvalho de Melo
3ee60c3b18 perf top: Move UI initialization ahead of sort setup
The ui initialization changes hpp format callbacks, based on the used
browser. Thus we need this init being processed before setup_sorting.

Replica of a patch by Jiri for 'perf report'.

Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:03 -03:00
Jiri Olsa
9887804d01 perf report: Move UI initialization ahead of sort setup
The ui initialization changes hpp format callbacks, based on the used
browser. Thus we need this init being processed before setup_sorting.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:03 -03:00
Jiri Olsa
3f931f2c42 perf hists: Make hpp setup function generic
Now that we have the 'equal' method implemented for hpp format entries
we can ease up the logic in the following functions and make them
generic wrt comparing format entries:

  perf_hpp__setup_output_field
  perf_hpp__append_sort_keys

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:02 -03:00
Jiri Olsa
c0020efa07 perf hists: Add 'hpp__equal' callback function
Adding 'hpp__equal' callback function to compare hpp output format
entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 12:24:01 -03:00
Jiri Olsa
97358084b9 perf hists: Add 'equal' method to perf_hpp_fmt struct
To easily compare format entries and make it available for all kinds of
format entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 11:13:12 -03:00
Jiri Olsa
2e8b79e706 perf hists: Use struct perf_hpp_fmt::idx in perf_hpp__reset_width
We are going to add dynamic hpp format fields, so we need to make the
'len' change for the format itself, not in the perf_hpp__format
template.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 11:13:12 -03:00
Jiri Olsa
b21a763edd perf hists: Add _idx fields into struct perf_hpp_fmt
Currently there's no way of comparing hpp format entries, which is
needed in following patches.

Adding _idx fields into struct perf_hpp_fmt to recognize and be able to
compare hpp format entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 11:13:11 -03:00
Jiri Olsa
452ce03b1e perf hists: Introduce perf_evsel__output_resort function
Adding evsel specific function to sort hists_evsel based hists. The
hists__output_resort can be now used to sort common hists object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 11:13:11 -03:00
Jiri Olsa
01441af5df perf hists: Factor output_resort from hists__output_resort
Currently hists__output_resort() depends on hists based on hists_evsel
struct, but we need to be able to sort common hists as well.

Cutting out the sorting base sorting code into output_resort
function, so it can be reused in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-03 11:13:11 -03:00
Ingo Molnar
8eb22c984e perf/core callchain fixes and improvements
User visible:
 
 - Make --percent-limit apply to callchains also and fix some bugs
   related to --percent-limit (Namhyung Kim)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWr9ItAAoJENZQFvNTUqpAmDcP/A0DqDIsttDMrJtsTb+DeMEt
 HaOjOZT9usZW8khAkEsk62i6lg/ewVqajalzFkfByG0gAJ5u/Xi1YQo64MzXpLjV
 DZ0Nlqujhb6PowO9eRPra6UAEiq+88Gzn+y+XzYqVsPVLAK/d8Ck9ALWo33gIBhc
 uq32fpp79zrCgfq8pOhvWMaMmRqmpyUiwCjiFCgUs1FD2NjdwGWSfH6XqxVdojVv
 /s1agYu+E9WJ74Df2upoIUxiFcG4+aT6Y4li3N1XaATrWoiqrkSyp1uVwOZ9H4i0
 9OyIhDzR0aar8z0aVJJmccqfGpC9LLWaf5YkYqK6A8vI0x5FyCu4TieeKCMJ5k7S
 1AO2E6FGsQ/vOJx/LvVGrEAmUog/kZ8q4OmudpmGBcHJ9PGHpnUg/6uAij2Nwyxo
 68oL4kgZFTrC5Cxdr1W+8Z/4Z9piNzArs2SSr5PfHWyzAB35WEKXCwoDy1uQ2q7d
 XIUa+6Gvldc5iRjrulY8YCqwhltfx9LiCWdOYmEpS2BGIeWzTQIinYNzVwCTP7Av
 tsLKaGx4/O5iZf1yuMaOXx9nXK6N87gb9il8sSQD2AZVPIkBTkE5mKYycqXblqUV
 wFH4oZ4QKTPnbwV2gOHsjOKABhsm6Jop8vpgKZtF3May5K9lNx6Ivq4KyqA+uSht
 BpYuVeCKwKHyT2uwSQf4
 =3p/S
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-3' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core callchain fixes and improvements from Arnaldo Carvalho de Melo <acme@redhat.com:

User visible changes:

  - Make --percent-limit apply to callchains also and fix some bugs
    related to --percent-limit (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-03 11:02:37 +01:00
Ingo Molnar
d072b89d3b New features:
- Port 'perf kvm stat' to PowerPC (Hemant Kumar)
 
 Infrastructure:
 
 - Use the 'feature-dump' target to do the feature checks just once and then
   add code to reuse that in the tests/make makefile, speeding up the
   'make -C tools/perf build-test' target (Wang Nan)
 
 - Reduce the number of tests the 'build-test' target do to those that don't
   pollute the source tree (Arnaldo Carvalho de Melo)
 
 - Improve the output of the build tests a bit by aligning the name of the
   tests, more can be done to filter out uninteresting info in the output
   (Arnaldo Carvalho de Melo)
 
 - Add perf_evlist pointer to *info_priv_size(), more prep work for
   supporting the coresight architecture (Mathieu Poirier)
 
 - Improve the 'perf test bp_signal' test (Wang Nan)
 
 - Check environment before starting the BPF 'perf test', so that we can just
   'Skip' older kernels instead of 'FAIL'ing them (Wang Nan)
 
 - Fix cpumode of synthesized buildid event (Wang Nan)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWq9VUAAoJENZQFvNTUqpAi0QP/2i7itD/P9wkGLsPb+HbNlX+
 umTxbhdKkyLc9WI8hGfdXXtdHcdkHUAmuZ/DzMbOcnGUE0mJdL6dphslsm1VFslP
 Q9sAj43BjWddEKfka1ylos1u/nDhpdpRX7bkRaepA9Zl0P0BSPXv+S28GO3jttxX
 uadXN9K7Amsa8tibKicxgLTUhZH05lmhPO00xGHuhQ6EQHcaw8VDYUlA+Wrh+NIa
 jIVnRE5q/hBwOyFQR/1gal8N5w2vO0vCglQmGQTEDjgQVMf/cSZChUlVqtxcDxcu
 FIDE42+jAnbESmVkBHq2n8ZvNxHOVlG6hTqZOqeiqs+tyfw7fYnGf+tkFPgBIEXP
 hB/hwgCJVBbbYo5hzT12eBz7UeWwn1ljqTpTnBrCaOl05MwvN4bMAMFVBXPLQHtm
 47AsyaOXEli9RaRwgdcYGVUhqIPTa2Ql2vPRb1PmQ3ugBqqLyUpYOox8WUYQv2g9
 sd61KMoXxUiuNsoq0ZXXkjWBeEBz2joRQYrlBQ0tZR8m06UA8FXLUXFopAUZKHGh
 7w8BTXRRCc9lEm/pWfHjVykObRlHew0qcDihybtMVsNGpUQzqKh7A8b2DmMvmRrJ
 BmnUBQA8kFiE4BJSdOdqwH8PpDRYpTCg0a6cyK4RDlm7isX2ho40edstspEO1N4n
 BUG1zE5SIPC1o1MSFxBn
 =CFBX
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf tooling changes from Arnaldo Carvalho de Melo:

New features:

 - Port 'perf kvm stat' to PowerPC (Hemant Kumar)

Infrastructure changes:

 - Use the 'feature-dump' target to do the feature checks just once and then
   add code to reuse that in the tests/make makefile, speeding up the
   'make -C tools/perf build-test' target (Wang Nan)

 - Reduce the number of tests the 'build-test' target do to those that don't
   pollute the source tree (Arnaldo Carvalho de Melo)

 - Improve the output of the build tests a bit by aligning the name of the
   tests, more can be done to filter out uninteresting info in the output
   (Arnaldo Carvalho de Melo)

 - Add perf_evlist pointer to *info_priv_size(), more prep work for
   supporting the coresight architecture (Mathieu Poirier)

 - Improve the 'perf test bp_signal' test (Wang Nan)

 - Check environment before starting the BPF 'perf test', so that we can just
   'Skip' older kernels instead of 'FAIL'ing them (Wang Nan)

 - Fix cpumode of synthesized buildid event (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-03 11:00:29 +01:00
Ingo Molnar
402e8db5c1 perf/core improvements and fixes:
User visible:
 
 - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows",
   as it controls just the that UI element in the annotate browser (Taeung Song)
 
 - Avoid trying to read ELF symtabs from device files, noticed while doing
   memory profiling work (Jiri Olsa)
 
 - Improve context detection when offering options in the hists browser,
   i.e.  some options don't make sense when the browser is not working with
   a perf.data file ('perf top' mode), only in 'perf report' mode, like
   scripting (Namhyung Kim)
 
 Infrastructure:
 
 - Elliminate duplication in the hists browser filter functions, getting the
   common part into a function that receives callbacks for filtering by
   DSO, thread, etc (Namhyung Kim)
 
 - Fix misleadingly indented assignment, found using
   gcc6 -Wmisleading-indentation (Markus Trippelsdorf)
 
 - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that
   detects such problems and then fixing the problem, so that the test now
   passes (Wang Nan)
 
 - More improvements to the build infrastructure to allow reusing the
   feature detection facilities (Wang Nan)
 
 - Auto initialize the globals needed by cpu__max_{cpu,node}() routines
   (Arnaldo Carvalho de Melo)
 
 Documentation:
 
 - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings)
 
 - Document a bunch more ~/.perfconfig knobs (Taeung Song)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWp8VgAAoJENZQFvNTUqpA8gAQAJY4pLDDeK6rZAqAY5fD//JJ
 aETuF0icXErkboef5usFk0MGVzK8HOWFOD5YnVAPXsGRqTHZ9ix3xw5sBFDaZKrP
 zagySidfHxrPDvtSW6doCjtg571dFaEHWUL48kT8ZpH9vwGDs42Gl/hjEY2P91zK
 uNktNoHvbHMUOxoMIp9zyCcV5WEWTog8RwCp53QrxxNrLYIT40wpADQIvuKNgqEP
 wIQyC2pgLv9ra27fXThauDes+a/TWLfURtxoeGgiDaIFmOi2t5VeN8D+DxXskKIB
 GtYF7Wxk5U+gELsAo5cZKS5Hyf13LqmwL4Jy/Th5jWaObyNXU2ZwnB3zXxZ3Dmvu
 keiOY8EmoOoKqOhjUVfsdvsVy0tNhObIJYhlqyOfQg+EqizR0PVlkDxWvODEKkkA
 T+dWXm183aXwCsHKM0EhAPgsVAJ/U9+lQjHro/lPq/i54oOogL/aBsVvUjNKo6Od
 m6q2ezgFZRHuPMLmOYhJaxtpvOirQkxORZZx2wgzgs5AsJly+ydoR3ETdhAD76Sg
 QGSKdTCziDA8KM0Vul6mjoqNlASpUM9cN6uLlv4c26pmf1krwleILMFqzaoYV3iE
 3y/ebiRyj2luwKSXELNjcs/7GzCfN3h8sjP6AQ+q0fuWH3zU6+J9oKi+KevYBr8J
 fFEX6MNxdxOY92mXDPZa
 =cuZc
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

 - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows",
   as it controls just the that UI element in the annotate browser (Taeung Song)

 - Avoid trying to read ELF symtabs from device files, noticed while doing
   memory profiling work (Jiri Olsa)

 - Improve context detection when offering options in the hists browser,
   i.e.  some options don't make sense when the browser is not working with
   a perf.data file ('perf top' mode), only in 'perf report' mode, like
   scripting (Namhyung Kim)

Infrastructure changes:

 - Elliminate duplication in the hists browser filter functions, getting the
   common part into a function that receives callbacks for filtering by
   DSO, thread, etc. (Namhyung Kim)

 - Fix misleadingly indented assignment, found using
   gcc6 -Wmisleading-indentation (Markus Trippelsdorf)

 - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that
   detects such problems and then fixing the problem, so that the test now
   passes (Wang Nan)

 - More improvements to the build infrastructure to allow reusing the
   feature detection facilities (Wang Nan)

 - Auto initialize the globals needed by cpu__max_{cpu,node}() routines
   (Arnaldo Carvalho de Melo)

Documentation changes:

 - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings)

 - Document a bunch more ~/.perfconfig knobs (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-03 10:58:46 +01:00
Hemant Kumar
270bde1e76 perf probe: Search both .eh_frame and .debug_frame sections for probe location
'perf probe' through debuginfo__find_probes() in util/probe-finder.c
checks for the functions' frame descriptions in either .eh_frame section
of an ELF or the .debug_frame.

The check is based on whether either one of these sections is present.
Depending on distro, toolchain defaults, architetcutre, build flags,
etc., CFI might be found in either .eh_frame and/or .debug_frame.
Sometimes, it may happen that, .eh_frame, even if present, may not be
complete and may miss some descriptions.

Therefore, to be sure, to find the CFI covering an address we will
always have to investigate both if available.

For e.g., in powerpc, this may happen:
  $ gcc -g bin.c -o bin

  $ objdump --dwarf ./bin
  <1><145>: Abbrev Number: 7 (DW_TAG_subprogram)
     <146> DW_AT_external   : 1
     <146> DW_AT_name       : (indirect string, offset: 0x9e): main
     <14a> DW_AT_decl_file  : 1
     <14b> DW_AT_decl_line  : 39
     <14c> DW_AT_prototyped : 1
     <14c> DW_AT_type       : <0x57>
     <150> DW_AT_low_pc     : 0x100007b8

If the .eh_frame and .debug_frame are checked for the same binary, we
will find that, .eh_frame (although present) doesn't contain a
description for "main" function.

But, .debug_frame has a description:

  000000d8 00000024 00000000 FDE cie=00000000 pc=100007b8..10000838
    DW_CFA_advance_loc: 16 to 100007c8
    DW_CFA_def_cfa_offset: 144
    DW_CFA_offset_extended_sf: r65 at cfa+16
  ...

Due to this (since, perf checks whether .eh_frame is present and goes on
searching for that address inside that frame), perf is unable to process
the probes:

  # perf probe -x ./bin main
    Failed to get call frame on 0x100007b8
    Error: Failed to add events.

To avoid this issue, we need to check both the sections (.eh_frame and
.debug_frame), which is done in this patch.

Note that, we can always force everything into both .eh_frame and
.debug_frame by:

  $ gcc bin.c -fasynchronous-unwind-tables  -fno-dwarf2-cfi-asm -g -o bin

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1454426806-13974-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-02 13:30:16 -03:00
Adrian Hunter
3a4acda1ec perf tools: Fix thread lifetime related segfaut in intel_pt
intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread
that eventually needs to be freed by the last thread__put() on it, when
its refcount hits zero, which may happen in
intel_pt_process_auxtrace_info() error handling path and triggers the
following segfault, which would happen as well at intel_pt_free, when
tools using this intel_pt codebase frees up resources:

  # perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  0  a  anaconda-ks.cfg  bin   perf.data	perf.data.old  perf-f23-bringup.todo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.217 MB perf.data ]
  #
  # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
  Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
  intel_pt_synth_events: failed to synthesize 'instructions' event type
  Segmentation fault (core dumped)
  #

The problem is: there's a union in 'struct thread' combines a list_head
and a rb_node. The standard life cycle of a thread is: init rb_node in
the constructor, insert it into machine->threads rbtree using rb_node,
move it to machine->dead_threads using list_head, clean in the last
thread__put: list_del_init(&thread->node).

In the above command, it clean a thread before adding it into list,
causes the above segfault.

Since pt->unknown_thread will never live in an rbtree, initialize its
list node so that when list_del_init() is done on it we don't segfault.

After this patch:

  # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
  Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
  intel_pt_synth_events: failed to synthesize 'instructions' event type
  0x248 [0x88]: failed to process type: 70
  #

Reported-by: Tong Zhang <ztong@vt.edu>
Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/1454296865-19749-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-02 12:51:11 -03:00
Namhyung Kim
3848c23b19 perf report: Don't show blank lines if entry has no callchain
When all callchains of a hist entry is percent-limited, do not add a
blank line at the end.  It makes the entry look like it doesn't have
callchains.

Reported-and-Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160128122454.GA27446@danjae.kornet
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:51:09 -03:00
Namhyung Kim
59c624e239 perf hists browser: Fix percent display in callchains
When there's only a single callchain, perf doesn't print its percentage
in front of the symbols.  This is because it assumes that the percentage
is same as parents.  But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown.  In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.

For example, let's see the following callchain.

  $ perf report --no-children --percent-limit 0.01 --tui
  ...
  -    0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - 0.04% mmap_region
             do_mmap_pgoff
           - vm_mmap_pgoff
              + 0.02% sys_mmap_pgoff
              + 0.02% vm_mmap
           + 0.02% mprotect_fixup

Current code omits the percent if 'mmap_region' becomes the only node
when percent limit is set to 0.03%, its percent is not 0.06% but users
will assume it incorrectly.

Before:

  $ perf report --no-children --percent-limit 0.03 --tui
  ...
     0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - mmap_region
          do_mmap_pgoff
          vm_mmap_pgoff

After:

  $ perf report --no-children --percent-limit 0.03 --tui
  ...
     0.06%  sleep    [kernel.vmlinux]    [k] kmem_cache_alloc_trace
       kmem_cache_alloc_trace
     - perf_event_mmap
        - 0.04% mmap_region
             do_mmap_pgoff
             vm_mmap_pgoff

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:46:27 -03:00
Namhyung Kim
5eca104eee perf hists browser: Pass parent_total to callchain print functions
Pass parent node's total period to callchain print functions.  This info
is needed by later patch to determine whether it can omit percent or not
correctly.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:45:42 -03:00
Namhyung Kim
0c841c6c16 perf hists browser: Fix dump to show correct callchain style
The commit 8c430a3486 ("perf hists browser: Support folded
callchains") missed to update hist_browser__dump() so it always shows
graph-style callchains regardless of current setting.

To fix that, factor out callchain printing code and rename the existing
function which prints graph-style callchain.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8c430a3486 ("perf hists browser: Support folded callchains")
Link: http://lkml.kernel.org/r/1453909257-26015-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:44:30 -03:00
Namhyung Kim
7ed5d6e28a perf report: Fix percent display in callchains on --stdio
When there's only a single callchain, perf doesn't print its percentage
in front of the symbols.  This is because it assumes that the percentage
is same as parents.  But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown.  In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.

For example, let's see the following callchain.

  $ perf report -s comm --percent-limit 0.01 --stdio
  ...
     9.95%  swapper
            |
            |--7.57%--intel_idle
            |          cpuidle_enter_state
            |          cpuidle_enter
            |          call_cpuidle
            |          cpu_startup_entry
            |          |
            |          |--4.89%--start_secondary
            |          |
            |           --2.68%--rest_init
            |                     start_kernel
            |                     x86_64_start_reservations
            |                     x86_64_start_kernel
	    |
	    |--0.15%--__schedule
	    |          |
	    |          |--0.13%--schedule
	    |          |          schedule_preempt_disable
	    |          |          cpu_startup_entry
            |          |          |
            |          |          |--0.09%--start_secondary
            |          |          |
            |          |           --0.04%--rest_init
            |          |                     start_kernel
            |          |                     x86_64_start_reservations
            |          |                     x86_64_start_kernel
            |          |
            |           --0.01%--schedule_preempt_disabled
            |                     cpu_startup_entry
  ...

Current code omits the percent if 'intel_idle' becomes the only node
when percent limit is set to 0.5%, its percent is not 9.95% but users
will assume it incorrectly.

Before:

  $ perf report --percent-limit 0.5 --stdio
  ...
     9.95%  swapper
            |
            ---intel_idle
               cpuidle_enter_state
               cpuidle_enter
               call_cpuidle
               cpu_startup_entry
               |
               |--4.89%--start_secondary
               |
                --2.68%--rest_init
                          start_kernel
                          x86_64_start_reservations
                          x86_64_start_kernel

After:

  $ perf report --percent-limit 0.5 --stdio
  ...
     9.95%  swapper
            |
             --7.57%--intel_idle
                       cpuidle_enter_state
                       cpuidle_enter
                       call_cpuidle
                       cpu_startup_entry
                       |
                       |--4.89%--start_secondary
                       |
                        --2.68%--rest_init
                                  start_kernel
                                  x86_64_start_reservations
                                  x86_64_start_kernel

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:41:32 -03:00
Namhyung Kim
54d27b3119 perf callchain: Pass parent_samples to __callchain__fprintf_graph()
Pass hist entry's period to graph callchain print function.  This info
is needed by later patch to determine whether it can omit percentage of
top-level node or not.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:39:52 -03:00
Namhyung Kim
7e597d327e perf report: Get rid of hist_entry__callchain_fprintf()
It's just a wrapper function to align the start position ofcallchains to
'comm' of each thread if it's a first sort key.  But it doesn't not work
with tracepoint events and also with upcoming hierarchy view.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:20:31 -03:00
Namhyung Kim
2665b4528d perf report: Apply --percent-limit to callchains also
Currently --percent-limit option only works for hist entries.  However
it'd be better to have same effect to callchains as well

Requested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 17:12:00 -03:00
Namhyung Kim
0f58474ec8 perf hists: Update hists' total period when adding entries
Currently the hist entry addition path doesn't update total_period of
hists and it's calculated during 'resort' path.  But the resort path
needs to know the total period before doing its job because it's used
for calculating percent limit of callchains in hist entries.

So this patch update the total period during the addition path.  It
makes the percent limit of callchains working (again).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 16:45:44 -03:00
Namhyung Kim
744070e0e4 perf hists: Fix min callchain hits calculation
The total period should be get using hists__total_period() since it
takes filtered entries into account.  In addition, if callchain mode is
'fractal', the total period should be the entry's period.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 16:42:31 -03:00
Adrian Hunter
ec183d22cc perf tools: tracepoint_error() can receive e=NULL, robustify it
Fixes segmentation fault using, for instance:

  (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".

 Program received signal SIGSEGV, Segmentation fault.
  0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  (gdb) bt
  #0  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  #1  0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:433
  #2  0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:498
  #3  0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:936
  #4  0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
  #5  0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
  #6  0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
  #7  0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
  #8  0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
  #9  0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
  #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
  #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
  #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
  #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
  #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
  #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
(gdb)

Intel PT attempts to find the sched:sched_switch tracepoint but that seg
faults if tracefs is not readable, because the error reporting structure
is null, as errors are not reported when automatically adding
tracepoints.  Fix by checking before using.

Committer note:

This doesn't take place in a kernel that supports
perf_event_attr.context_switch, that is the default way that will be
used for tracking context switches, only in older kernels, like 4.2, in
a machine with Intel PT (e.g. Broadwell) for non-priviledged users.

Further info from a similar patch by Wang:

The error is in tracepoint_error: it assumes the 'e' parameter is valid.

However, there are many situation a parse_event() can be called without
parse_events_error. See result of

  $ grep 'parse_events(.*NULL)' ./tools/perf/ -r'

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tong Zhang <ztong@vt.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 196581717d ("perf tools: Enhance parsing events tracepoint error output")
Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-01 11:51:15 -03:00
Linus Torvalds
29d14f0835 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "This is much bigger than typical fixes, but Peter found a category of
  races that spurred more fixes and more debugging enhancements.  Work
  started before the merge window, but got finished only now.

  Aside of that this contains the usual small fixes to perf and tools.
  Nothing particular exciting"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  perf: Remove/simplify lockdep annotation
  perf: Synchronously clean up child events
  perf: Untangle 'owner' confusion
  perf: Add flags argument to perf_remove_from_context()
  perf: Clean up sync_child_event()
  perf: Robustify event->owner usage and SMP ordering
  perf: Fix STATE_EXIT usage
  perf: Update locking order
  perf: Remove __free_event()
  perf/bpf: Convert perf_event_array to use struct file
  perf: Fix NULL deref
  perf/x86: De-obfuscate code
  perf/x86: Fix uninitialized value usage
  perf: Fix race in perf_event_exit_task_context()
  perf: Fix orphan hole
  perf stat: Do not clean event's private stats
  perf hists: Fix HISTC_MEM_DCACHELINE width setting
  perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
  perf tests: Remove wrong semicolon in while loop in CQM test
  perf: Synchronously free aux pages in case of allocation failure
  ...
2016-01-31 15:38:27 -08:00
Arnaldo Carvalho de Melo
814568db64 perf build: Align the names of the build tests:
$ make -C tools/perf build-test
  make[1]: Entering directory `/home/acme/git/linux/tools/perf'
                 make_pure_O: cd . && make -f Makefile  O=/tmp/tmp.mPx0Cmik3f DESTDIR=/tmp/tmp.U0SUmVbtJm
            make_clean_all_O: cd . && make -f Makefile  O=/tmp/tmp.Yl5UzhTU7T DESTDIR=/tmp/tmp.fop1E4jdER clean all
                make_debug_O: cd . && make -f Makefile  O=/tmp/tmp.pMn2ozBoXC DESTDIR=/tmp/tmp.azxhDp5sEp DEBUG=1
           make_no_libperl_O: cd . && make -f Makefile  O=/tmp/tmp.qJPiINMtA7 DESTDIR=/tmp/tmp.KNMrLeGDxZ NO_LIBPERL=1
  <SNIP>

More needs to be done to make it more compact, i.e. elide the '-f Makefile',
remove that 'cd . &&', move the DESTDIR= and O= to the end, as they don't
convey that much information besides the fact that they are being set to some
random directory just for this build, move the meat, i.e. the meaningful
feature disabling bits to the start, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wir3w3o4f1nmbgcxgnx8cj9c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:51:04 -03:00
Hemant Kumar
78e6c39b23 perf kvm/powerpc: Add support for HCALL reasons
Powerpc provides hcall events that also provides insights into guest
behaviour. Enhance perf kvm stat to record and analyze hcall events.

 - To trace hcall events :
  perf kvm stat record

 - To show the results :
  perf kvm stat report --event=hcall

The result shows the number of hypervisor calls from the guest grouped
by their respective reasons displayed with the frequency.

This patch makes use of two additional tracepoints
"kvm_hv:kvm_hcall_enter" and "kvm_hv:kvm_hcall_exit". To map the hcall
codes to their respective names, it needs a mapping. Such mapping is
added in this patch in book3s_hcalls.h.

 # pgrep qemu
A sample output :
19378
60515

2 VMs running.

 # perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]

 # perf kvm stat report -p 60515 --event=hcall

Analyze events for all VMs, all VCPUs:

    HCALL-EVENT Samples Samples% Time% MinTime MaxTime  AvgTime

          H_IPI     822  66.08% 88.10% 0.63us  11.38us 2.05us (+- 1.42%)
     H_SEND_CRQ     144  11.58%  3.77% 0.41us   0.88us 0.50us (+- 1.47%)
   H_VIO_SIGNAL     118   9.49%  2.86% 0.37us   0.83us 0.47us (+- 1.43%)
H_PUT_TERM_CHAR      76   6.11%  2.07% 0.37us   0.90us 0.52us (+- 2.43%)
H_GET_TERM_CHAR      74   5.95%  2.23% 0.37us   1.70us 0.58us (+- 4.77%)
         H_RTAS       6   0.48%  0.85% 1.10us   9.25us 2.70us (+-48.57%)
      H_PERFMON       4   0.32%  0.12% 0.41us   0.96us 0.59us (+-20.92%)

Total Samples:1244, Total events handled time:1916.69us.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott  Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-4-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:49:54 -03:00
Hemant Kumar
066d3593e1 perf kvm/powerpc: Port perf kvm stat to powerpc
perf kvm can be used to analyze guest exit reasons. This support already
exists in x86. Hence, porting it to powerpc.

 - To trace KVM events :
  perf kvm stat record
  If many guests are running, we can track for a specific guest by using
  --pid as in : perf kvm stat record --pid <pid>

 - To see the results :
  perf kvm stat report

The result shows the number of exits (from the guest context to
host/hypervisor context) grouped by their respective exit reasons with
their frequency.

Since, different powerpc machines have different KVM tracepoints, this
patch discovers the available tracepoints dynamically and accordingly
looks for them. If any single tracepoint is not present, this support
won't be enabled for reporting. To record, this will fail if any of the
events we are looking to record isn't available.  Right now, its only
supported on PowerPC Book3S_HV architectures.

To analyze the different exits, group them and present them (in a slight
descriptive way) to the user, we need a mapping between the "exit code"
(dumped in the kvm_guest_exit tracepoint data) and to its related
Interrupt vector description (exit reason). This patch adds this mapping
in book3s_hv_exits.h.

It records on two available KVM tracepoints for book3s_hv:

"kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter".

Here is a sample o/p:
 # pgrep qemu
19378
60515

2 Guests are running on the host.

 # perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]

 # perf kvm stat report -p 60515

Analyze events for pid(s) 60515, all VCPUs:

     VM-EXIT Samples Samples% Time% MinTime    MaxTime  Avg time

       SYSCALL  9141  63.67%  7.49% 1.26us   5782.39us    9.87us (+- 6.46%)
H_DATA_STORAGE  4114  28.66%  5.07% 1.72us   4597.68us   14.84us (+-20.06%)
HV_DECREMENTER   418   2.91%  4.26% 0.70us  30002.22us  122.58us (+-70.29%)
      EXTERNAL   392   2.73%  0.06% 0.64us    104.10us    1.94us (+-18.83%)
RETURN_TO_HOST   287   2.00% 83.11% 1.53us 124240.15us 3486.52us (+-16.81%)
H_INST_STORAGE     5   0.03%  0.00% 1.88us      3.73us    2.39us (+-14.20%)

Total Samples:14357, Total events handled time:1203918.42us.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott  Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-3-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:49:54 -03:00
Hemant Kumar
48deaa74fc perf kvm/{x86,s390}: Remove const from kvm_events_tp
This patch removes the "const" qualifier from kvm_events_tp declaration
to account for the fact that some architectures may need to update this
variable dynamically. For instance, powerpc will need to update this
variable dynamically depending on the machine type.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott  Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-2-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:49:53 -03:00
Hemant Kumar
162607ea20 perf kvm/{x86,s390}: Remove dependency on uapi/kvm_perf.h
Its better to remove the dependency on uapi/kvm_perf.h to allow dynamic
discovery of kvm events (if its needed). To do this, some extern
variables have been introduced with which we can keep the generic
functions generic.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott  Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:49:48 -03:00
Wang Nan
d2db9a98c3 perf record: Use OPT_BOOLEAN_SET for buildid cache related options
'perf record' knows whether buildid cache is enabled (via
--no-no-buildid-cache) deliberately. Buildid cache can be turned off in
some situations.

Output switching support needs this feature to turn off buildid cache
by default.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-33-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:39:07 -03:00
Wang Nan
37b20151ef perf tools: Move timestamp creation to util
Timestamp generation becomes a public available helper. Which will
be used by 'perf record', help it output to split output file based
on time.

For example:

 perf.data.2015122620363710
 perf.data.2015122620364092
 perf.data.2015122620365423
 ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-27-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:30:06 -03:00
Wang Nan
8fd34e1cce perf test: Improve bp_signal
Will Deacon [1] has some question on patch [2]. This patch improves
test__bp_signal so we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result:

 On x86_64 and ARM64 (result are similar with patch [2] on ARM64):

  # ./perf test -v signal
  17: Test breakpoint overflow signal handler                  :
  --- start ---
  test child forked, pid 10213
  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
  test child finished with 0
  ---- end ----
  Test breakpoint overflow signal handler: Ok

So at least 2 cases Will doubted are handled correctly.

[1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
[2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:28:46 -03:00
Wang Nan
6a7d550e8b perf test: Check environment before start real BPF test
Copying perf to old kernel system results:

  # perf test bpf
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : FAILED!
  37.2: Test BPF prologue generation                           : Skip

However, in case when kernel doesn't support a test case it should
return 'Skip', 'FAILED!' should be reserved for kernel tests for when
the kernel supports a feature that then fails to work as advertised.

This patch checks environment before real testcase.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:25:43 -03:00
Wang Nan
fd786fac78 perf buildid: Fix cpumode of buildid event
There is a nasty confusion that, for kernel module, dso->kernel is not
necessary to be DSO_TYPE_KERNEL or DSO_TYPE_GUEST_KERNEL.  These two
enums are for vmlinux. See thread [1]. We tried to fix this part but it
is costy.

Code machine__write_buildid_table() is another unfortunate function fall
into this trap that, when issuing buildid event for a kernel module,
cpumode it gives to the event is PERF_RECORD_MISC_USER, not
PERF_RECORD_MISC_KERNEL.

However, even with this bug, most of the time it doesn't causes real
problem. I find this issue when trying to use a perf before commit
3d39ac5386 ("perf machine: No need to have two DSOs lists") to parse a
perf.data generated by newest perf.

[1] https://lkml.org/lkml/2015/9/21/908

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454089251-203152-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:16:25 -03:00
Mathieu Poirier
14a05e13a0 perf auxtrace: Add perf_evlist pointer to *info_priv_size()
On some architecture the size of the private header may be dependent on
the number of tracers used in the session.  As such adding a "struct
perf_evlist *" parameter, which should contain all the required
information.

Also adjusting the existing client of the interface to take the new
parameter into account.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Chunyan Zhang <zhang.chunyan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Mike Leach <mike.leach@arm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabin@rab.in>
Cc: Tor Jeremiassen <tor@ti.com>
Link: http://lkml.kernel.org/r/1452807977-8069-22-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 17:14:30 -03:00
Arnaldo Carvalho de Melo
a639a62390 perf tools: Speed up build-tests by reducing the number of builds tested
The 'tools/perf/test/make' makefile has in its default, 'all' target
builds that will pollute the source code directory, i.e. that will not
use O= variable.

The 'build-test' should be run as often as possible, preferrably after
each non strictly non-code commit, so speed it up by selecting just
the O= targets.

Furthermore it tests both the Makefile.perf file, that is normally
driven by the main Makefile, and the Makefile, reduce the time in half
by having just MK=Makefile, the most usual, tested by 'build-test'.

Please run:

  make -C tools/perf -f tests/make

from time to time for testing also the in-place build tests.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jrt9utscsiqkmjy3ccufostd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 16:57:38 -03:00
Wang Nan
79191c89a0 perf build: Use feature dump file for build-test
To prevent the feature check tests to run repeately, one time per
'tests/make' target/test, this patch utilizes the previously introduced
'feature-dump' make target and FEATURES_DUMP variable, making sure that
the feature checkers run only once when doing build-test for normal test
cases.

However, since standard users doesn't reuse features dump result, we'd
better give an option to check their behaviors. The above feature
should be used to make build-test faster only. Only utilize it for
build-test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454068269-235999-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 14:47:52 -03:00
Wang Nan
5a155bb77a perf build: Remove all condition feature check {C,LD}FLAGS
'make feature-dump' should give a stable result, so even 'NO_SOMETHING=1'
is given (for babeltrace, if LIBBABELTRACE=1 is not given), we should
try to detect those feature and {C,LD}FLAGS. Build or not should be
controled independent.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1454047050-204993-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-29 14:47:52 -03:00
Arnaldo Carvalho de Melo
5ac76283b3 perf cpumap: Auto initialize cpu__max_{node,cpu}
Since it was always checking if the initialization was done, use that
branch to do the initialization if not done already.

With this we reduce the number of exported globals from these files.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160125212955.GG22501@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 16:08:36 -03:00
Namhyung Kim
b1baae8919 perf hists browser: Skip scripting when perf.data file not available
The script and data-switch context menu are only meaningful when it
deals with a data file.  So add a check so that it cannot be shown when
perf-top is run.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453555902-18401-4-git-send-email-namhyung@kernel.org
[ Use goto skip_scripting instead of two is_report_browser() tests ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 16:08:12 -03:00
Wang Nan
c053a1506f perf build: Select all feature checkers for feature-dump
Set FEATURE_TESTS to 'all' so all possible feature checkers are
executed. Without this setting the output feature dump file miss some
feature, for example, liberity. Select all checker so we won't get an
incomplete feature dump file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1453715801-7732-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 13:53:10 -03:00
Wang Nan
7b6982ce4b perf test: Add libbpf relocation checker
There's a bug in LLVM that it can generate unneeded relocation
information. See [1] and [2]. Libbpf should check the target section of
a relocation symbol.

This patch adds a testcase which references a global variable (BPF
doesn't support global variables). Before fixing libbpf, the new test
case can be loaded into kernel, the global variable acts like the first
map. It is incorrect.

Result:

  # ~/perf test BPF
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : FAILED!

  # ~/perf test -v BPF
  ...
  libbpf: loading object '[bpf_relocation_test]' from buffer
  libbpf: section .strtab, size 126, link 0, flags 0, type=3
  libbpf: section .text, size 0, link 0, flags 6, type=1
  libbpf: section .data, size 0, link 0, flags 3, type=1
  libbpf: section .bss, size 0, link 0, flags 3, type=8
  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
  libbpf: found program func=sys_write
  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
  libbpf: section maps, size 16, link 0, flags 3, type=1
  libbpf: maps in [bpf_relocation_test]: 16 bytes
  libbpf: section license, size 4, link 0, flags 3, type=1
  libbpf: license of [bpf_relocation_test] is GPL
  libbpf: section version, size 4, link 0, flags 3, type=1
  libbpf: kernel version of [bpf_relocation_test] is 40400
  libbpf: section .symtab, size 144, link 1, flags 0, type=2
  libbpf: map 0 is "my_table"
  libbpf: collecting relocating info for: 'func=sys_write'
  libbpf: relocation: insn_idx=7
  Success unexpectedly: libbpf error when dealing with relocation
  test child finished with -1
  ---- end ----
  Test BPF filter subtest 2: FAILED!

[1] https://llvm.org/bugs/show_bug.cgi?id=26243
[2] https://patchwork.ozlabs.org/patch/571385/

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 12:10:55 -03:00
Arnaldo Carvalho de Melo
ab414dcda8 perf test: Fixup aliases checking in the 'vmlinux matches kallsyms' test
There are cases where looking at just the next and prev entries is not
enough, like with:

  $ readelf -sW /usr/lib/debug/lib/modules/4.3.3-301.fc23.x86_64/vmlinux | grep ffffffff81065ec0
   4979: ffffffff81065ec0 53 FUNC  LOCAL  DEFAULT 1 try_to_free_pud_page
   4980: ffffffff81065ec0 53 FUNC  LOCAL  DEFAULT 1 try_to_free_pte_page
   4981: ffffffff81065ec0 53 FUNC  LOCAL  DEFAULT 1 try_to_free_pmd_page

So just search by name to see if the symbol is in kallsyms.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jj1vlljg7ol4i713l60rt5ai@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:52 -03:00
Arnaldo Carvalho de Melo
8acd3da03c perf machine: Introduce machine__find_kernel_symbol_by_name()
To be used in the 'vmlinux matches kallsyms' 'perf test'  entry.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m56g1853lz2c6nhnqxibq4jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:51 -03:00
Namhyung Kim
4056132e10 perf hists browser: Offer non-symbol specific menu options for --sort without 'sym'
Now that we check more strictly what each of the menu entries need, we
can stop bailing out when 'sym' is not in the --sort order, instead we
let each option be added if what it needs is present.

This way, for instance, we can run scripts on all samples, see DSO map
details when 'dso' is in the --sort provided, etc.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:51 -03:00
Namhyung Kim
d9695d9f93 perf hists browser: Be a bit more strict about presenting CPU socket zoom
For consistency with the other sort order checks.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch, moved check to add_socket_opt() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:50 -03:00
Namhyung Kim
b1447a54f5 perf hists browser: Offer 'Zoom into DSO'/'Map details' only when sort order has 'dso'
We can't offer a zoom into DSO when a bucket (struct hist_entry) may
have samples for more than one DSO, i.e. when 'dso' is not part of
the sort order, ditto for 'Map details', fix it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch, moved check to add_{dso,map}_opt() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:50 -03:00
Namhyung Kim
c221acb0f9 perf hists browser: Only offer symbol scripting when a symbol is under the cursor
When this feature was introduced a check was made if there was a
resolved symbol under the cursor, it got lost in commit ea7cd59233
("perf hists browser: Split popup menu actions - part 2"), reinstate it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ea7cd59233 ("perf hists browser: Split popup menu actions - part 2")
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:49 -03:00
Namhyung Kim
2eafd410e6 perf hists browser: Only 'Zoom into thread' only when sort order has 'pid'
We can't offer a zoom into thread when a bucket (struct hist_entry) may
have samples for more than one thread, i.e. when 'pid' is not part of
the sort order, fix it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:49 -03:00
Namhyung Kim
cfd92dadc5 perf sort: Provide a way to find out if per-thread bucketing is in place
Now the UI browsers will be able to offer thread related operations only
if the thread is part of the sort order in use, i.e. if hist_entry stats
are all for a single thread.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a  larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:49 -03:00
Taeung Song
485311d978 perf config: Document 'hist.percentage' variable in man page
Explain 'hist.percentage' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-7-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:48 -03:00
Taeung Song
3b97629d13 perf config: Document variables for 'annotate' section in man page
Explain 'annotate' section and its variables.

'hide_src_code', 'use_offset', 'jump_arrows',
'show_linenr', 'show_nr_jump' and 'show_total_period'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-5-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:48 -03:00
Taeung Song
2733525b8c perf config: Document 'buildid.dir' variable in man page
Explain 'buildid.dir' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-4-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:47 -03:00
Taeung Song
3fa9f40718 perf config: Document variables for 'tui' and 'gtk' sections in man page
Explain 'tui' and 'gtk' sections and these variables.

'top', 'report' and 'annotate'

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:47 -03:00
Taeung Song
89debf1787 perf config: Document variables for 'colors' section in man page
Explain 'colors' section and its variables, used for The variables for
customizing the colors used in the output for the 'report', 'top' and
'annotate' in the TUI, those are:

'top', 'medium', 'normal', 'selected',
'jump_arrows', 'addr' and 'root'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:46 -03:00
Taeung Song
78ce08dfbd perf annotate: Rename 'colors.code' to 'colors.jump_arrows'
USe 'jump_arrows' config name instead of 'code' on 'colors' section.
'colors.code' config is only for jump arrows on assembly code listings
i.e.

    │     ┌──jmp    1333
    │     │  xchg   %ax,%ax
    │     │  mov    %r15,%r10
    │     └─→cmp    %r15,%r14

But this config name seems unfit.

 'jump_arrows' is more descriptive than 'code'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452240971-25418-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:46 -03:00
Ben Hutchings
3379e0c3ef perf tools: Document the perf sysctls
perf_event_paranoid was only documented in source code and a perf error
message.  Copy the documentation from the error message to
Documentation/sysctl/kernel.txt.

perf_cpu_time_max_percent was already documented but missing from the
list at the top, so add it there.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/20160119213515.GG2637@decadent.org.uk
[ Remove reference to external Documentation file, provide info inline, as before ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:45 -03:00
Namhyung Kim
1f7c254132 perf hists: Cleanup filtering functions
The hists__filter_by_xxx functions share same logic with different
filters.  Factor out the common code into the hists__filter_by_type.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453252521-24398-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:45 -03:00
Namhyung Kim
c84a5d1671 perf hists: Remove parent filter check in DSO filter function
The --exclude-other option sets HIST_FILTER__PARENT bit and it's only
set when a hist entry was created.  DSO filters don't change this so no
need to have the check in hists__filter_by_dso() IMHO.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453252521-24398-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:44 -03:00
Jiri Olsa
86a2cf3123 perf stat: Making several helper functions static
There's no need for the following functions to be global:

  perf_evsel__reset_stat_priv
  perf_evsel__alloc_stat_priv
  perf_evsel__free_stat_priv
  perf_evsel__alloc_prev_raw_counts
  perf_evsel__free_prev_raw_counts
  perf_evsel__alloc_stats

They all ended up in util/stat.c, and they no longer need to be called
from outside this object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453290995-18485-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:43 -03:00
Jiri Olsa
403567217d perf symbols: Do not read symbols/data from device files
With mem sampling we could get data source within mapped device file.
Processing such sample would block during report phase on trying to read
the device file.

Chacking for device files and skip the processing if it's detected.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453290995-18485-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:43 -03:00
Markus Trippelsdorf
d85ce830ee perf pmu: Fix misleadingly indented assignment (whitespace)
One line in perf_pmu__parse_unit() is indented wrongly, leading to a
warning (=> error) from gcc 6:

  util/pmu.c:156:3: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]

    sret = read(fd, alias->unit, UNIT_MAX_LEN);
    ^~~~

  util/pmu.c:153:2: note: ...this 'if' clause, but it is not
    if (fd == -1)
    ^~

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 410136f5dd ("tools/perf/stat: Add event unit and scale support")
Link: http://lkml.kernel.org/r/20151214154440.GC1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:52:42 -03:00
Jiri Olsa
3f416f22d1 perf stat: Do not clean event's private stats
Mel reported stddev reporting was broken due to following commit:

	106a94a0f8 ("perf stat: Introduce read_counters function")

This commit merged interval and overall counters reading into single
read_counters function.

The old interval code cleaned the stddev data for some reason (it's
never displayed in interval mode) and the mentioned commit kept on
cleaning the stddev data in merged function, which resulted in the
stddev not being displayed.

Removing the wrong stddev data cleanup init_stats call.

Reported-and-Tested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # v4.2+
Fixes: 106a94a0f8 ("perf stat: Introduce read_counters function")
Link: http://lkml.kernel.org/r/1453290995-18485-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:15:11 -03:00
Jiri Olsa
0805909f59 perf hists: Fix HISTC_MEM_DCACHELINE width setting
Set correct width for unresolved mem_dcacheline addr.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 9b32ba71ba ("perf tools: Add dcacheline sort")
Link: http://lkml.kernel.org/r/1453290995-18485-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:14:55 -03:00
Markus Trippelsdorf
d4913cbd05 perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
The issue was pointed out by gcc-6's -Wmisleading-indentation.

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: c97cf42219 ("perf top: Live TUI Annotation")
Link: http://lkml.kernel.org/r/20151214154403.GB1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:14:25 -03:00
Markus Trippelsdorf
cf89813a5b perf tests: Remove wrong semicolon in while loop in CQM test
The while loop was spinning. Fix by removing a semicolon.

The issue was pointed out by gcc-6's -Wmisleading-indentation.

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 035827e9f2 ("perf tests: Add Intel CQM test")
Link: http://lkml.kernel.org/r/20151214154335.GA1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-26 11:14:06 -03:00
Linus Torvalds
048ccca8c1 Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in
   ib_device struct
 - Move iopoll out of block and into lib, rename to irqpoll, and use
   in several places in the rdma stack as our new completion queue
   polling library mechanism.  Update the other block drivers that
   already used iopoll to use the new mechanism too.
 - Replace the per-entry GID table locks with a single GID table lock
 - IPoIB multicast cleanup
 - Cleanups to the IB MR facility
 - Add support for 64bit extended IB counters
 - Fix for netlink oops while parsing RDMA nl messages
 - RoCEv2 support for the core IB code
 - mlx4 RoCEv2 support
 - mlx5 RoCEv2 support
 - Cross Channel support for mlx5
 - Timestamp support for mlx5
 - Atomic support for mlx5
 - Raw QP support for mlx5
 - MAINTAINERS update for mlx4/mlx5
 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates
 - Add support for remote invalidate to the iSER driver (pushed through the
   RDMA tree due to dependencies, acknowledged by nab)
 - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies,
   acknowledged by Bruce)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoSygAAoJELgmozMOVy/dDjsP/2vbTda2MvQfkfkGEZBQdJSg
 095RN0gQgCJdg78lAl8yuaK8r4VN/7uefpDtFdudH1I/Pei7X0wxN9R1UzFNG4KR
 AD53lz92IVPs15328SbPR2kvNWISR9aBFQo3rlElq3Grqlp0EMn2Ou1vtu87rekF
 aMllxr8Nl0uZhP+eWusOsYpJUUtwirLgRnrAyfqo2UxZh/TMIroT0TCx1KXjVcAg
 dhDARiZAdu3OgSc6OsWqmH+DELEq6dFVA5F+DDBGAb8bFZqlJc7cuMHWInwNsNXT
 so4bnEQ835alTbsdYtqs5DUNS8heJTAJP4Uz0ehkTh/uNCcvnKeUTw1c2P/lXI1k
 7s33gMM+0FXj0swMBw0kKwAF2d9Hhus9UAN7NwjBuOyHcjGRd5q7SAnfWkvKx000
 s9jVW19slb2I38gB58nhjOh8s+vXUArgxnV1+kTia1+bJSR5swvVoWRicRXdF0vh
 TvLX/BjbSIU73g1TnnLNYoBTV3ybFKQ6bVdQW7fzSTDs54dsI1vvdHXi3bYZCpnL
 HVwQTZRfEzkvb0AdKbcvf8p/TlaAHem3ODqtO1eHvO4if1QJBSn+SptTEeJVYYdK
 n4B3l/dMoBH4JXJUmEHB9jwAvYOpv/YLAFIvdL7NFwbqGNsC3nfXFcmkVORB1W3B
 KEMcM2we4bz+uyKMjEAD
 =5oO7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "Initial roundup of 4.5 merge window patches

   - Remove usage of ib_query_device and instead store attributes in
     ib_device struct

   - Move iopoll out of block and into lib, rename to irqpoll, and use
     in several places in the rdma stack as our new completion queue
     polling library mechanism.  Update the other block drivers that
     already used iopoll to use the new mechanism too.

   - Replace the per-entry GID table locks with a single GID table lock

   - IPoIB multicast cleanup

   - Cleanups to the IB MR facility

   - Add support for 64bit extended IB counters

   - Fix for netlink oops while parsing RDMA nl messages

   - RoCEv2 support for the core IB code

   - mlx4 RoCEv2 support

   - mlx5 RoCEv2 support

   - Cross Channel support for mlx5

   - Timestamp support for mlx5

   - Atomic support for mlx5

   - Raw QP support for mlx5

   - MAINTAINERS update for mlx4/mlx5

   - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates

   - Add support for remote invalidate to the iSER driver (pushed
     through the RDMA tree due to dependencies, acknowledged by nab)

   - Update to NFSoRDMA (pushed through the RDMA tree due to
     dependencies, acknowledged by Bruce)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
  IB/mlx5: Unify CQ create flags check
  IB/mlx5: Expose Raw Packet QP to user space consumers
  {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
  IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
  IB/mlx5: Add Raw Packet QP query functionality
  IB/mlx5: Add create and destroy functionality for Raw Packet QP
  IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
  IB/mlx5: Allocate a Transport Domain for each ucontext
  net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
  net/mlx5_core: Add RQ and SQ event handling
  net/mlx5_core: Export transport objects
  IB/mlx5: Expose CQE version to user-space
  IB/mlx5: Add CQE version 1 support to user QPs and SRQs
  IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
  IB/sa: Fix netlink local service GFP crash
  IB/srpt: Remove redundant wc array
  IB/qib: Improve ipoib UD performance
  IB/mlx4: Advertise RoCE v2 support
  IB/mlx4: Create and use another QP1 for RoCEv2
  IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
  ...
2016-01-23 18:45:06 -08:00
Jiri Olsa
96b9e70b8e perf build: Introduce FEATURES_DUMP make variable
Introducing FEATURES_DUMP make variable to provide features
detection dump file and bypass the feature detection.

The intention is to use this during build tests to skip
repeated features detection, like:

Get feature dump static build into /tmp/fd file:
  $ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ OFF ]

  SNIP

  FEATURE-DUMP file copied into /tmp/fd

Use /tmp/fd to build perf:
  $ make FEATURES_DUMP=/tmp/fd LDFLAGS=-static

  $ file perf
  perf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for ...

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Jiri Olsa
b8e52be00c perf build: Add feature-dump target
To provide FEATURE-DUMP into $(FEATURE_DUMP_COPY) if defined, with no
further action.

Get feature dump of the current build:
  $ make feature-dump
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]

  FEATURE-DUMP file available in FEATURE-DUMP

Get feature dump static build into /tmp/fd file:

  $ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ OFF ]

  SNIP

  FEATURE-DUMP file copied into /tmp/fd

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Wang Nan
c15e758c4b perf build: Pass O option to kernel makefile in build-test
Kernel makefile only follows an 'O' option passed from command line
explicitely. In build-test with 'O' option set, kernel makefile
contaminate kernel source directory. Build test also fail if we don't
create output directory manually.

K_O_OPT is added and passed to kernel makefile if 'O' is passed
to build-test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:32:00 -03:00
Wang Nan
68824de19a perf build: Test correct path of perf in build-test
If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
will fail because perf resides in a different directory. Fix this by
computing PERF_OUT according to 'O' and test correct output files.
For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
instead because the path is different from others ($(O)/perf vs
 $(O)/tools/perf).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Wang Nan
eb807730c0 perf build: Pass O option to Makefile.perf in build-test
Unlike tools/perf/Makefile, tools/perf/Makefile.perf obey 'O' option
when it is passed through cmdline only, due to code in
tools/scripts/Makefile.include:

 ifneq ($(O),)
 ifeq ($(origin O), command line)
 	...
 	ABSOLUTE_O := $(shell cd $(O) ; pwd)
 	OUTPUT := $(ABSOLUTE_O)/$(if $(subdir),$(subdir)/)
 endif
 endif

This patch passes 'O' to Makefile.perf through cmdline explicitly
to make it follow O variable during build-test.

'make clean' should have identical 'O' option with 'make'. If not,
config-clean may error.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Wang Nan
7be43dfb1e perf build: Set parallel making options build-test
'make build-test' is painful because of time consuming. In a full test,
all test cases are built twice with tools/perf/Makefile and
tools/perf/Makefile.perf. 'Makefile' automatically computes parallel
options for make, but 'Makefile.perf' not, so all test cases is built
with one job. It is very slow.

This patch adds '-j' options to Makefile.perf testing. It computes
parallel building options like what tools/perf/Makefile does, and pass
'-j' option to Makefile.perf test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452687442-6186-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:59 -03:00
Ben Hutchings
40c4a0f92a perf symbols: Fix reading of build-id from vDSO
We need to use the long name (the filename) when reading the build-id
from a DSO.  Using the short name doesn't work for (at least) vDSOs.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160113172301.GT28542@decadent.org.uk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:58 -03:00
Ravi Bangoria
3caeaa5627 perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data
While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.

Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc35
("perf session: Remove wrappers to machines__find").

This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.

This patch is generated from acme perf/core branch.

Below I've mention an example that demonstrate the behaviour before and
after applying patch.

Before applying patch:
[Note: One needs to run guest before recording data in host]

  ravi@ravi-bangoria:~$ ./perf kvm record -a
  Warning:
  5903 unprocessable samples recorded.
  Do you have a KVM guest running and not using 'perf kvm'?
  [ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]

  ravi@ravi-bangoria:~$ ./perf kvm report --stdio
  Warning:
  5903 unprocessable samples recorded.
  Do you have a KVM guest running and not using 'perf kvm'?
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 285  of event 'cycles'
  # Event count (approx.): 88715406
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ......
  #

  # (For a higher level overview, try: perf report --sort comm,dso)
  #

After applying patch:

  ravi@ravi-bangoria:~$ ./perf kvm record -a
  [ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]

  ravi@ravi-bangoria:~$ ./perf kvm report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 17  of event 'cycles'
  # Event count (approx.): 700746
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
      34.19%  :5758    [unknown]         [g] 0xffffffff818682ab
      22.79%  :5758    [unknown]         [g] 0xffffffff812dc7f8
      22.79%  :5758    [unknown]         [g] 0xffffffff818650d0
      14.83%  :5758    [unknown]         [g] 0xffffffff8161a1b6
       2.49%  :5758    [unknown]         [g] 0xffffffff818692bf
       0.48%  :5758    [unknown]         [g] 0xffffffff81869253
       0.05%  :5758    [unknown]         [g] 0xffffffff81869250

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v3.19+
Fixes: 54245fdc35 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-15 16:31:58 -03:00
Namhyung Kim
34b7b0f95d perf tools: Fallback to srcdir/Documentation/tips.txt
Some people don't install perf, but just use compiled version in the
source.  Fallback to lookup the source directory for those poor guys. :)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-4-git-send-email-namhyung@kernel.org
[ Make perf_tip() return NULL for ENOENT, making the fallback to really take place ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:08 -03:00
Namhyung Kim
090cff3eae perf ui/tui: Print helpline message as is
When a tip message contains a percent sign, it was treated printf format
specifier so broken string was printed like below.

  Tip: Limit to show entries above 577nly: perf report --percent-limit 5
                                   ^^^

As ui_browser__show receives format string, pass additional "%s" so that
the help (tip) message can be printed as is.

  Tip: Limit to show entries above 5% only: perf report --percent-limit 5

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452509594-13616-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:08 -03:00
Namhyung Kim
84cfac7f05 perf tools: Set and pass DOCDIR to builtin-report.c
It'll be used to locate perf document directory to find tips.txt in case
it's not installed on the system.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:07 -03:00
Namhyung Kim
dd8232bc9d perf tools: Add file_only config option to strlist
If strlist_config.dirname is present, the strlist__new() tries to load
stirngs from dirname/list file first but if it failes it falls back to
add 'list' as string.  But sometimes it's not desired so adds new
file_only field to prevent it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-2-git-send-email-namhyung@kernel.org
[ Add documentation for strlist_config::file_only, in the struct definition */
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:07 -03:00
Namhyung Kim
09f1985404 perf tools: Add more usage tips
Thanks to Andi Kleen for providing useful tips.

Suggested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452508510-28316-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:07 -03:00
Namhyung Kim
6156681b73 perf record: Add --buildid-all option
The --buildid-all option is to record build-id of all DSOs in the file.
It might be very costly to postprocess samples to find which DSO hits.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1452519429-31779-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 12:42:07 -03:00
Wang Nan
b0fb978e97 perf tools: Fix mmap2 event allocation in synthesize code
perf_event__synthesize_mmap_events() issues mmap2 events, but the memory
of that event is allocated using:

 mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);

If path of mmap source file is long (near PATH_MAX), random crash would
happen. Should use sizeof(mmap_event->mmap2).

Fix two memory allocations.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452593524-138970-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 11:24:43 -03:00
Jiri Olsa
8a59f3ccbc perf stat: Fix recort_usage typo
Markus reported gcc 6 complains when compiling perf stat command:

  builtin-stat.c:1591:27: error: ‘recort_usage’ defined but not used
  [-Werror=unused-const-variable]
   static const char * const recort_usage[] = {
                             ^~~~~~~~~~~~

I fixed the typo and realized we already export record_usage, so I also
prefixed it with stat (and included report_usage).

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452591329-27620-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-12 11:13:52 -03:00
Wang Nan
b0500c169b perf test: Reset err after using it hold errcode in hist testcases
All hists test cases forget to reset err after using it to hold an
error code. If error occure in setup_fake_machine() it incorrectly
return TEST_OK.

This patch fixes it.

Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-11 19:22:22 -03:00
Wang Nan
71b3ee7e65 perf test: Fix false TEST_OK result for 'perf test hist'
Commit 71d6de64fe ("perf test: Fix hist testcases when kptr_restrict is on")
solves a double free problem when 'perf test hist' calling
setup_fake_machine(). However, the result is still incorrect. For example:

  $ ./perf test -v 'filtering hist entries'
  25: Test filtering hist entries                              :
  --- start ---
  test child forked, pid 4186
  Cannot create kernel maps
  test child finished with 0
  ---- end ----
  Test filtering hist entries: Ok

In this case the body of this test is not get executed at all, but the
result is 'Ok'.

Actually, in setup_fake_machine() there's no need to create real kernel
maps. What we want are the fake maps. This patch removes the
machine__create_kernel_maps() in setup_fake_machine(), so it won't be
affected by kptr_restrict setting.

Test result:

  $ cat /proc/sys/kernel/kptr_restrict
  1
  $ ~/perf test -v hist
  15: Test matching and linking multiple hists                 :
  --- start ---
  test child forked, pid 24031
  test child finished with 0
  ---- end ----
  Test matching and linking multiple hists: Ok
  [SNIP]

Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-11 19:22:22 -03:00
Wang Nan
935e6bd310 tools: Move Makefile.arch from perf/config to tools/scripts
After this patch other directories can use this architecture detector
without directly including it from perf's directory. Libbpf would
utilize it to get proper $(ARCH) so it can receive correct uapi include
directory.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452520124-2073-8-git-send-email-wangnan0@huawei.com
[ Add missing srctree definition in tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
2016-01-11 19:22:20 -03:00
Wang Nan
3167eea27b perf tools: Fix phony build target for build-test
make_kernelsrc and make_kernelsrc_tools are skiped if a previous
build-test is done, because 'make build-test' creates two files with
same names. To avoid this, they should be included in .PHONY list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-11 19:03:24 -03:00
Wang Nan
11dc0c57ba perf tools: Add -lutil in python lib list for broken python-config
On some system the perf-config is broken, causes link failure like this:

   /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
   /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3816: undefined reference to `forkpty'
   /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
   /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3756: undefined reference to `openpty'
   collect2: error: ld returned 1 exit status
  make[1]: *** [/home/wangnan/kernel-hydrogen/tools/perf/out/perf] Error 1
  make: *** [all] Error 2

  $ python-config --libs
  -lpthread -ldl -lpthread -lutil -lm -lpython2.7

In this case a '-lutil' should be appended to -lpython2.7.

(I know we have --start-group and --end-group. I can see them in command
line of collect2 by strace. However it doesn't work. Seems I have a
broken environment?)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-11 19:03:16 -03:00
Jiri Olsa
3238ba01c2 perf tools: Add missing sources to perf's MANIFEST
Adding missing bitmap.[ch] sources to the MANIFEST file. Fixes building
'make perf-*-src-pkg' generated tarballs.

Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 915b0882c3 ("tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/")
Link: http://lkml.kernel.org/r/1452509693-13452-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-11 12:09:25 -03:00
Namhyung Kim
775d8a1b0d perf evlist: Add --trace-fields option to show trace fields
To use dynamic sort keys, it might be good to add an option to see the
list of field names.

  $ perf evlist -i perf.data.sched
  sched:sched_switch
  sched:sched_stat_wait
  sched:sched_stat_sleep
  sched:sched_stat_iowait
  sched:sched_stat_runtime
  sched:sched_process_fork
  sched:sched_wakeup
  sched:sched_wakeup_new
  sched:sched_migrate_task
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

  $ perf evlist -i perf.data.sched --trace-fields
  sched:sched_switch: trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
  sched:sched_stat_wait: trace_fields: comm,pid,delay
  sched:sched_stat_sleep: trace_fields: comm,pid,delay
  sched:sched_stat_iowait: trace_fields: comm,pid,delay
  sched:sched_stat_runtime: trace_fields: comm,pid,runtime,vruntime
  sched:sched_process_fork: trace_fields: parent_comm,parent_pid,child_comm,child_pid
  sched:sched_wakeup: trace_fields: comm,pid,prio,success,target_cpu
  sched:sched_wakeup_new: trace_fields: comm,pid,prio,success,target_cpu
  sched:sched_migrate_task: trace_fields: comm,pid,prio,orig_cpu,dest_cpu

Committer notes:

For another file, in verbose mode:

  # perf evlist -v --trace-fields
  sched:sched_switch: type: 2, size: 112, config: 0x10b, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452125549-1511-5-git-send-email-namhyung@kernel.org
[ Replaced 'trace_fields=' with 'trace_fields: ' to make the output consistent in -v mode ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:23:02 -03:00
Jiri Olsa
5c0cf22477 perf record: Store data mmaps for dwarf unwind
Currently we don't synthesize data mmap by default. It depends on -d
option, that enables data address sampling.

But we've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Making data mmaps to be synthesized for dwarf unwind as well.

Reported-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160107133022.GA32115@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:17:45 -03:00
Jiri Olsa
0ba98149f8 perf libdw: Check for mmaps also in MAP__VARIABLE tree
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:57 -03:00
Jiri Olsa
0ddf5246f7 perf unwind: Check for mmaps also in MAP__VARIABLE tree
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.

Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:34 -03:00
Jiri Olsa
f22ed827a8 perf unwind: Use find_map function in access_dso_mem
The find_map helper is already there, so let's use it.

Also we're going to introduce wider search in following patch, so it'll
be easier to make this change on single place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Noel Grandin <noelgrandin@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:16:12 -03:00
Jiri Olsa
d2190a8091 perf evlist: Remove perf_evlist__(enable|disable)_event functions
Replacing them with perf_evsel__(enable|disable).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:15:43 -03:00
Adrian Hunter
23df7f7984 perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
'perf record' uses perf_evsel__open() to open events and passes the
evsel->cpus and evsel->threads.  Many tests and some tools instead use
perf_evlist__open() which passes instead evlist->cpus and
evlist->threads.

Make perf_evlist__open() follow the 'perf record' behaviour so that a
consistent approach is taken.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 14:15:11 -03:00
Namhyung Kim
14cbfbeb76 perf report: Show random usage tip on the help line
Currently perf report only shows a help message "For a higher level
overview, try: perf report --sort comm,dso" unconditionally (even if
the sort keys were used).  Add more help tips and show randomly.

Load tips from ${prefix}/share/doc/perf-tip/tips.txt file.

  $ perf report | tail
      0.10%  swapper  [kernel.vmlinux]   [k] irq_exit
      0.09%  swapper  [kernel.vmlinux]   [k] flush_smp_call_function_queue
      0.08%  swapper  [kernel.vmlinux]   [k] native_write_msr_safe
      0.03%  swapper  [kernel.vmlinux]   [k] group_sched_in
      0.01%  perf     [kernel.vmlinux]   [k] native_write_msr_safe

  #
  # (Tip: Search options using a keyword: perf report -h <keyword>)
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452166913-27046-1-git-send-email-namhyung@kernel.org
[ Renamed it to perf_tip() and the parameter dirname to dirpath to fix the build on older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 13:15:46 -03:00
Namhyung Kim
fc284be9d8 perf hists: Export a couple of hist functions
These are necessary for multi threaded sample processing:

 - hists__get__get_rotate_entries_in()
 - hists__collapse_insert_entry()
 - __hists__init()

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:59:48 -03:00
Jiri Olsa
21e6d84286 perf diff: Use perf_hpp__register_sort_field interface
Using perf_hpp__register_sort_field interface instead of directly adding
the entry.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:59:30 -03:00
Jiri Olsa
b97511c5bc perf tools: Add overhead/overhead_children keys defaults via string
We currently set 'overhead' and 'overhead_children' as default sort keys
within perf_hpp__init function by directly adding into the sort list.

This patch adds 'overhead' and 'overhead_children' in text form into
sort_keys and let them be added by standard sort dimension interface.

We need to eliminate dirrect sort_list additions to be able to add
support for hists specific sort keys.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:58:58 -03:00
Jiri Olsa
bb4ced29f5 perf tools: Remove list entry from struct sort_entry
It's no longer needed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:58:04 -03:00
Jiri Olsa
685c84154c perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
Besides lockdep we use all the 'tools/lib' code in perf, so include it
completely in tags.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:57:46 -03:00
Jiri Olsa
9cdbc40962 perf script: Align event name properly
Adding code to align event names, so we get aligned output in case of
multiple events with different names.

Before:
  $ perf script
  :13757 13757 163918.230829: cpu/mem-snp-none/P: ffff88085f20d010
  :13757 13757 163918.230832: cpu/mem-loads,ldlat=30/P:     7f5a5f719f00
  :13757 13757 163918.230835: cpu/mem-loads,ldlat=30/P:     7f5a5f719f00
  :13758 13758 163918.230838: cpu/mem-snp-none/P: ffff88085f4ad810
  :13758 13758 163918.154093: cpu/mem-stores/P: ffff88085bb53f28
  :13757 13757 163918.155264: cpu/mem-snp-hitm/P:           601080
  ...

After:
  $ perf script
  :13757 13757 163918.228831:       cpu/mem-snp-none/P: ffffffff81a841c0
  :13757 13757 163918.228834: cpu/mem-loads,ldlat=30/P:     7f5a5f719f08
  :13757 13757 163918.228837: cpu/mem-loads,ldlat=30/P:     7f5a5f719f08
  :13758 13758 163918.228837:       cpu/mem-snp-none/P: ffff88085f4ad800
  :13758 13758 163918.154093:         cpu/mem-stores/P: ffff88085bb53f28
  :13757 13757 163918.155264:       cpu/mem-snp-hitm/P:           601080
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:57:26 -03:00
Wang Nan
2d7c03e6b0 perf tools: Add missing headers in perf's MANIFEST
These lost headers are found in arm64 cross buildings, failing to build
perf using tarballs generated using:

 $ make perf-targz-src-pkg

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452263041-225488-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:48:40 -03:00
Jiri Olsa
cbd08b7335 perf tools: Do not show trace command if it's not compiled in
The trace command still appears in help message when you run simple
'perf' command.

It's because the generate-cmdlist.sh does not care about the
HAVE_LIBAUDIT_SUPPORT dependency of trace command and puts it into
generated common_cmds array.

Wrapping trace command under HAVE_LIBAUDIT_SUPPORT dependency, which
will exclude it from common_cmds array if HAVE_LIBAUDIT_SUPPORT is not
set.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:46:17 -03:00
Namhyung Kim
1e9abf8b03 perf report: Change default to use event group view
The event group view feature is to see related events together.  To use
the group view, events should be recorded as a group with a dedicated
syntax of surrounding events by braces (-e '{ evt1, evt2, ... }').

Also 'perf report' also requires the --group option to enable it.
However it's almost always beneficial to use the group view to see the
group events as it's more expressive.  And I think it's more natural to
see events together if they are recorded as a group.

Thus this patch changes the default value to enable it.  If users don't
want to see like it and keep the original behavior, they can set the
report.group config variable to false and/or use --no-group option in
the 'perf report' command line.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1448807057-3506-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:41:37 -03:00
Namhyung Kim
42b276a235 perf top: Decay periods in callchains
It missed to decay periods in callchains when decaying hist entries.
This resulted in more than 100 percent overhead in callchains in the
fractal style output.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1451963160-17196-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:37:51 -03:00
Arnaldo Carvalho de Melo
915b0882c3 tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
So that lib/find_bit.c doesn't requires anything inside tools/perf/

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lkml.kernel.org/n/tip-7lxe7jgohaac5faodndhdmvk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:35:46 -03:00
Arnaldo Carvalho de Melo
64af4e0da4 tools lib: Sync tools/lib/find_bit.c with the kernel
Need to move the bitmap.[ch] things from tools/perf/ to tools/lib, will
be done in the next patches.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lkml.kernel.org/n/tip-5fys65wkd7gu8j7a7xgukc5t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:35:41 -03:00
Arnaldo Carvalho de Melo
552eb975b8 tools lib: Move find_next_bit.c to tools/lib/
The commit that introduced it should've moved it to the same place, plus
the 'tools/' prefix, but instead moved it to a bogus tools/lib/util/
directory, being the only file there.

Move it to tools/lib/find_bit.c, picking the name for the file where
these routines live since:

 8f6f19dd51 ("lib: move find_last_bit to lib/find_next_bit.c")

Next step is to make tools/lib/find_bit.c to differ from lib/find_bit.c
just in removing what is not used by tools/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lkml.kernel.org/n/tip-p391cex5mqvahp4pwrton87n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-08 12:35:35 -03:00
Arnaldo Carvalho de Melo
a831e67913 perf tests: Give a bit more information on the CQM test failure path
Before:

  $ perf test -v cqm
  48: Test intel cqm nmi context read                          :
  --- start ---
  test child forked, pid 1681
  parse_events failed
  test child finished with -2
  ---- end ----
  Test intel cqm nmi context read: Skip
  $

After:

  $ perf test -v cqm
  48: Test intel cqm nmi context read                          :
  --- start ---
  test child forked, pid 1681
  parse_events failed, is "intel_cqm/llc_occupancy/" available?
  test child finished with -2
  ---- end ----
  Test intel cqm nmi context read: Skip
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eidpiv5x4nkbsx37xwikbnir@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 16:59:27 -03:00
Arnaldo Carvalho de Melo
239849dde3 perf tests: No need to set attr.sample_freq for tracking !PERF_RECORD_SAMPLE
We were asking for a 4kHz sample_freq, making the test fail needlessly
when the system reduced /proc/sys/kernel/perf_event_max_sample_rate
below that.

Before:

  # perf test -vv dummy
  23: Test using a dummy software event to keep tracking       :
  --- start ---
  test child forked, pid 32421
  ------------------------------------------------------------
  perf_event_attr:
    type                             1
    size                             112
    config                           0x9
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|ID|PERIOD
  <SNIP>
  sys_perf_event_open failed, error -22
  Unable to open dummy and cycles event
  test child finished with -2
  ---- end ----
  Test using a dummy software event to keep tracking: Skip
  #
  [root@zoo ~]# cat /proc/sys/kernel/perf_event_max_sample_rate
  1000

After:

  [root@zoo ~]# perf test dummy
  23: Test using a dummy software event to keep tracking       : Ok

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-487iquegrs2379e5n0pi0tcp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 16:51:58 -03:00
Arnaldo Carvalho de Melo
372b212263 perf python: Add missing files to binding link list
Fixing this problem, introduced recently:

  $ perf test python
  16: Try 'import perf' in python, checking link problems      : FAILED!

In verbose mode we find out what is missing:

  $ perf test -v python
  16: Try 'import perf' in python, checking link problems      :
  --- start ---
  test child forked, pid 24894
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: /tmp/build/perf/python/perf.so: undefined symbol: find_next_bit
  test child finished with -1
  ---- end ----
  Try 'import perf' in python, checking link problems: FAILED!
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f77b57ad4f ("perf cpu_map: Add cpu_map__new_event function")
Link: http://lkml.kernel.org/n/tip-rajx0zkz6czdrnvvwf0jp76p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 16:47:11 -03:00
Arnaldo Carvalho de Melo
2e4f81ee90 perf test: No need for setting attr.sample_freq on the RECORD test
We're not looking at PERF_RECORD_SAMPLE entries and now by default we
use PERF_COUNT_SW_DUMMY, so just remove that setting.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-cly7cnotktv5rqao13pkorem@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 13:20:22 -03:00
Arnaldo Carvalho de Melo
69ef8f4755 perf test: Use "dummy" events in the PERF_RECORD_ test
As we're test just the !PERF_RECORD_SAMPLE records.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp8radcz3il4q9wbnseh337d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 13:17:00 -03:00
Arnaldo Carvalho de Melo
5bae025023 perf evlist: Introduce perf_evlist__new_dummy constructor
For case where all we need is an evlist with just an "dummy" evsel,
like in some 'perf test' entries.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q52le0pblm2k3ncvyilelr9z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 13:14:56 -03:00
Arnaldo Carvalho de Melo
4f4ba0e6af perf tests: No need to set attr.sample_freq in the perf time to TSC test
We were asking for a 4kHz sample_freq, making the test fail needlessly
when the system reduced /proc/sys/kernel/perf_event_max_sample_rate
below that.

In this test we only look at the PERF_SAMPLE_TIME fields in PERF_RECORD_
meta events, no need to set sample_freq.

Thanks to Namhyung for suggesting that max_sample_rate could be the
reason for the test failure, seeing the 'perf test -vv' output I sent.

Before:

  # echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
  # perf test TSC
  45: Test converting perf time to TSC   : FAILED!

After:

  # perf test TSC
  45: Test converting perf time to TSC   : Ok
  # cat /proc/sys/kernel/perf_event_max_sample_rate
  1000

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lcob05qhawkuvsyuu9g1fld5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-07 11:26:54 -03:00
Stephane Eranian
84530920de perf pmu: fix alias->snapshot missing initialization bug
This patch fixes a bug in __perf_pmu__new_alias() whereby the
alias->snapshot field was not initialized to false. This led to random
alias->snapshot value for an alias and was breaking some measurements
such as:

  $ perf stat -a -e uncore_imc/data_reads/ -I 1000 sleep 100

Because the event ended up being treated as snapshot mode, when it is
not.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1452106201-13073-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:16 -03:00
Jiri Olsa
b8a1962d17 perf script: Add stat-cpi.py script
Adding stat-cpi.py as an example of how to do stat scripting.

It computes the CPI metrics from cycles and instructions events.

The CPI is based performance metric showing the Cycles Per Instructions
ratio, which helps to identify cycles-hungry code.

Following stat record/report/script combinations could be used:

- get CPI for given workload

    $ perf stat -e cycles,instructions record ls

    SNIP

     Performance counter stats for 'ls':

             2,904,431      cycles
             3,346,878      instructions              #    1.15  insns per cycle

           0.001782686 seconds time elapsed

    $ perf script -s ./scripts/python/stat-cpi.py
           0.001783: cpu -1, thread -1 -> cpi 0.867803 (2904431/3346878)

    $ perf stat -e cycles,instructions record ls | perf script -s ./scripts/python/stat-cpi.py

    SNIP

           0.001730: cpu -1, thread -1 -> cpi 0.869026 (2928292/3369627)

- get CPI systemwide:

    $ perf stat -e cycles,instructions -a -I 1000 record sleep 3
    #           time             counts unit events
         1.000158618        594,274,711      cycles                     (100.00%)
         1.000158618        441,898,250      instructions
         2.000350973        567,649,705      cycles                     (100.00%)
         2.000350973        432,669,206      instructions
         3.000559210        561,940,430      cycles                     (100.00%)
         3.000559210        420,403,465      instructions
         3.000670798            780,105      cycles                     (100.00%)
         3.000670798            326,516      instructions

    $ perf script -s ./scripts/python/stat-cpi.py
           1.000159: cpu -1, thread -1 -> cpi 1.344823 (594274711/441898250)
           2.000351: cpu -1, thread -1 -> cpi 1.311972 (567649705/432669206)
           3.000559: cpu -1, thread -1 -> cpi 1.336669 (561940430/420403465)
           3.000671: cpu -1, thread -1 -> cpi 2.389178 (780105/326516)

    $ perf stat -e cycles,instructions -a -I 1000 record sleep 3 | perf script -s ./scripts/python/stat-cpi.py
           1.000202: cpu -1, thread -1 -> cpi 1.035091 (940778881/908885530)
           2.000392: cpu -1, thread -1 -> cpi 1.442600 (627493992/434974455)
           3.000545: cpu -1, thread -1 -> cpi 1.353612 (741463930/547766890)
           3.000622: cpu -1, thread -1 -> cpi 2.642110 (784083/296764)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452077397-31958-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:16 -03:00
Jiri Olsa
36e33c53f4 perf script: Display stat events by default
If no script is specified for stat data, display stat events in raw
form.

  $ perf stat record ls

  SNIP

   Performance counter stats for 'ls':

            0.851585      task-clock (msec)         #    0.717 CPUs utilized
                   0      context-switches          #    0.000 K/sec
                   0      cpu-migrations            #    0.000 K/sec
                 114      page-faults               #    0.134 M/sec
           2,620,918      cycles                    #    3.078 GHz
     <not supported>      stalled-cycles-frontend
     <not supported>      stalled-cycles-backend
           2,714,111      instructions              #    1.04  insns per cycle
             542,434      branches                  #  636.970 M/sec
              15,946      branch-misses             #    2.94% of all branches

         0.001186954 seconds time elapsed

  $ perf script
  CPU   THREAD             VAL             ENA             RUN            TIME EVENT
   -1    26185          851585          851585          851585         1186954 task-clock
   -1    26185               0          851585          851585         1186954 context-switches
   -1    26185               0          851585          851585         1186954 cpu-migrations
   -1    26185             114          851585          851585         1186954 page-faults
   -1    26185         2620918          853340          853340         1186954 cycles
   -1    26185               0               0               0         1186954 stalled-cycles-frontend
   -1    26185               0               0               0         1186954 stalled-cycles-backend
   -1    26185         2714111          853340          853340         1186954 instructions
   -1    26185          542434          853340          853340         1186954 branches
   -1    26185           15946          853340          853340         1186954 branch-misses

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452077397-31958-3-git-send-email-jolsa@kernel.org
[ Rename 'time' parameter to 'tstamp' to fix build on older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:16 -03:00
Jiri Olsa
15d2b9956b perf cpumap: Fix cpu conversion in cpu_map__from_entries
We can't convert u16 cpu_map_entries::cpu[x] value directly to int,
because it could hold -1, which would be converted as 65535.

Adding special treatment for -1, which is not real cpu number, to be
converted to (int -1).

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452077397-31958-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:16 -03:00
Jiri Olsa
aef9026356 perf script: Add python support for stat events
Add support to get stat events data in perf python scripts.

The python script shall implement the following new interface to process
stat data:

  def stat__<event_name>_[<modifier>](cpu, thread, time, val, ena, run):

    - is called for every stat event for given counter,
      if user monitors 'cycles,instructions:u" following
      callbacks should be defined:

      def stat__cycles(cpu, thread, time, val, ena, run):
      def stat__instructions_u(cpu, thread, time, val, ena, run):

  def stat__interval(time):

    - is called for every interval with its time,
      in non interval mode it's called after last
      stat event with total measured time in ns

The rest of the current interface stays untouched..

Please check example CPI metrics script in following patch
with command line examples in changelogs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-8-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:16 -03:00
Jiri Olsa
e099eba8c8 perf script: Add stat default handlers
Implement struct scripting_ops::(process_stat|process_stat_interval)
handlers - calling scripting handlers from stat events handlers.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-6-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:15 -03:00
Jiri Olsa
8058a30ce1 perf script: Add process_stat/process_stat_interval scripting interface
Python and perl scripting code will define those callbacks and get stat
data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-5-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:15 -03:00
Jiri Olsa
91a2c3d54f perf script: Process stat config event
Adding processing of stat config event and initialize stat_config
object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:15 -03:00
Jiri Olsa
cfc8874a48 perf script: Process cpu/threads maps
Adding processing of cpu/threads maps. Configuring session's evlist with
these maps.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:15 -03:00
Jiri Olsa
6db1a5c190 perf stat record: Keep sample_type 0 for pipe session
For pipe sessions we need to keep sample_type zero, because script's
perf_evsel__check_attr is triggered by sample_type != 0, and the check
would fail on stat session.

I was tempted to keep it zero unconditionally, but the pipe session is
sufficient. In perf.data session we are guarded by HEADER_STAT feature.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:14 -03:00
Namhyung Kim
4c96bee032 perf report: Add documentation for dynamic sort keys
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1451991518-25673-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:14 -03:00
Namhyung Kim
9735be24ec perf tools: Add all matching dynamic sort keys for field name
When a perf.data file has multiple events, it's likely to be similar
(tracepoint) events.  In that case, they might have same field name so
add all of them to sort keys instead of bailing out.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1451991518-25673-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:14 -03:00
Jiri Olsa
58683600df perf build: Use FEATURE-DUMP in bpf subproject
Using FEATURE-DUMP in bpf subproject for features detection in case bpf
is built via perf. Keeping the current features detection otherwise.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama <pi3orama@163.com>
Link: http://lkml.kernel.org/r/1450893514-9158-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:14 -03:00
Jiri Olsa
76ee2ff342 tools build feature: Move dwarf post unwind choice output into perf
We decide what dwarf unwind to choose way after the Makefile.feature
makefile is included. The $(dwarf-post-unwind) is not even set at that
time. For the same reason it was never included in FEATURE-DUMP file.

Moving it into perf VF=1 verbose display.

  $ make VF=1
    BUILD:   Doing 'make -j4' parallel build
  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...
  ...                 LIBUNWIND_DIR:
  ...                     LIBDW_DIR:
  ...     DWARF post unwind library: libunwind
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama <pi3orama@163.com>
Link: http://lkml.kernel.org/r/1450893514-9158-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:13 -03:00
Namhyung Kim
d49dadea78 perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events
When an evlist contains tracepoint events only, use 'trace' sort key as
default.  If --raw-trace option was given, use 'trace_fields' instead.
This will make users more convenient to see trace result.

Suggested-and-Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-14-git-send-email-namhyung@kernel.org
[ Check evlist in get_default_sort_order() fixing a segfault in 'perf test hists' reported by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:13 -03:00
Namhyung Kim
2e422fd1e4 perf tools: Add 'trace_fields' dynamic sort key
The 'trace_fields' sort key is similar as 'trace' sort key, but it shows
each fields separately.  Each event will get different columns as their
fields.

  $ perf report -s trace_fields --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 20K of event 'kmem:kmalloc'
  # Event count (approx.): 20533
  #
  # Overhead  Command           call_site                 ptr  bytes_req  bytes_alloc            gfp_flags
  # ........  .......  ..................  ..................  .........  ...........  ...................
  #
      99.89%  perf       ffffffffa01d4396  0xffff8803ffb79720         96           96    GFP_NOFS|GFP_ZERO
       0.06%  sleep      ffffffff8114e1cd  0xffff8803d228a000       4096         4096           GFP_KERNEL
       0.03%  perf       ffffffff811d6ae6  0xffff8803f7678f00        240          256  GFP_KERNEL|GFP_ZERO
       0.00%  perf       ffffffff812263c1  0xffff880406172380        128          128           GFP_KERNEL
       0.00%  perf       ffffffff812264b9  0xffff8803ffac1600        504          512           GFP_KERNEL
       0.00%  perf       ffffffff81226634  0xffff880401dc5280         28           32           GFP_KERNEL
       0.00%  sleep      ffffffff81226da9  0xffff8803ffac3a00        392          512           GFP_KERNEL

  # Samples: 20K of event 'kmem:kfree'
  # Event count (approx.): 20597
  #
  # Overhead           call_site                 ptr
  # ........  ..................  ..................
  #
      99.58%    ffffffffa01d85ad  0xffff8803ffb79720
       0.07%    ffffffff81443f5c  0xffff8803f7669400
       0.02%    ffffffff811d5753  0xffff8803f7678f00
       0.01%    ffffffff81443f5c  0xffff8803f766be00
       0.01%    ffffffff8114e359  0xffff8803d228a000
       0.01%    ffffffff81443f5c  0xffff8800d156dc00
       0.01%    ffffffff81443f5c  0xffff8803f7669400
       0.01%    ffffffff8114e359  0xffff8803d228a000
       0.01%    ffffffff8114e359  0xffff8803d228a000
       0.01%    ffffffff8114e359  0xffff8803d228a000

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-13-git-send-email-namhyung@kernel.org
[ Combined with "perf tools: Fix segfault when using -s trace_fields" ]
Link: http://lkml.kernel.org/r/1451991518-25673-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:13 -03:00
Namhyung Kim
361459f163 perf tools: Skip dynamic fields not defined for current event
When there are multiple events, each dynamic sort key is defined just
for one event.  In this case other events will always show "N/A" for
those fields.  But they are meaningless and consume precious screen
width.

Let's skip those undefined dynamic fields.

  $ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1

  $ perf report -s 'comm,kmalloc.*' --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 20K of event 'kmem:kmalloc'
  # Event count (approx.): 20533
  #
  # Overhead  Command           call_site                 ptr  bytes_req  bytes_alloc            gfp_flags
  # ........  .......  ..................  ..................  .........  ...........  ...................
  #
      99.89%  perf       ffffffffa01d4396  0xffff8803ffb79720         96           96    GFP_NOFS|GFP_ZERO
       0.06%  sleep      ffffffff8114e1cd  0xffff8803d228a000       4096         4096           GFP_KERNEL
       0.03%  perf       ffffffff811d6ae6  0xffff8803f7678f00        240          256  GFP_KERNEL|GFP_ZERO
       0.00%  perf       ffffffff812263c1  0xffff880406172380        128          128           GFP_KERNEL
       0.00%  perf       ffffffff812264b9  0xffff8803ffac1600        504          512           GFP_KERNEL
       0.00%  perf       ffffffff81226634  0xffff880401dc5280         28           32           GFP_KERNEL
       0.00%  sleep      ffffffff81226da9  0xffff8803ffac3a00        392          512           GFP_KERNEL

  # Samples: 20K of event 'kmem:kfree'
  # Event count (approx.): 20597
  #
  # Overhead  Command
  # ........  ..............
  #
      99.63%  perf
       0.14%  sleep
       0.11%  irq/36-iwlwifi
       0.11%  kworker/u16:0
       0.01%  Xorg
       0.00%  firefox

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-12-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-01-06 20:11:12 -03:00