2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-18 18:23:53 +08:00
Commit Graph

3340 Commits

Author SHA1 Message Date
Adrian Hunter
dddcf6abbf perf evlist: Add perf_evlist__id2evsel_strict()
perf_evlist__id2evsel_strict() is the same as perf_evlist__id2evsel()
except that it ensures that the id must match.

This will be used by perf inject to find a specific evsel that is to be
deleted, hence the need to match exactly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-22-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 17:11:00 -03:00
Adrian Hunter
44cbe7295c perf scripting python: Allow for max_stack greater than PERF_MAX_STACK_DEPTH
Use the scripting_max_stack value to allow for values greater than
PERF_MAX_STACK_DEPTH.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 17:09:12 -03:00
Adrian Hunter
03cd1fed2b perf script: Add a setting for maximum stack depth
Add a setting for maximum stack depth in preparation for allowing for
synthesized callchains.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 17:08:48 -03:00
Adrian Hunter
96b40f3c05 perf hists: Allow for max_stack greater than PERF_MAX_STACK_DEPTH
Use the max_stack value instead of PERF_MAX_STACK_DEPTH so that
arbitrary-sized callchains can be supported.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 17:06:16 -03:00
Adrian Hunter
f14445ee72 perf intel-pt: Support generating branch stack
Add support for generating branch stack context for PT samples.  The
decoder reports a configurable number of branches as branch context for
each sample. Internally it keeps track of them by using a simple sliding
window.  We also flush the last branch buffer on each sample to avoid
overlapping intervals.

This is useful for:

- Reporting accurate basic block edge frequencies through the perf
  report branch view
- Using with --branch-history to get the wider context of samples
- Other users of LBRs

Also the Documentation is updated.

Examples:

	Record with Intel PT:

		perf record -e intel_pt//u ls

	Branch stacks are used by default if synthesized so:

		perf report --itrace=ile

	is the same as:

		perf report --itrace=ile -b

	Branch history can be requested also:

		perf report --itrace=igle --branch-history

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:59:14 -03:00
Adrian Hunter
385e33063f perf intel-pt: Move branch filter logic
intel_pt_synth_branch_sample() skips synthesizing if the branch does not
match the branch filter.  That logic was sitting in the middle of the
function but is more efficiently placed at the start of the function, so
move it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:58:27 -03:00
Adrian Hunter
601897b54c perf auxtrace: Add option to synthesize branch stacks on samples
Add AUX area tracing option 'l' to synthesize branch stacks on samples
just like sample type PERF_SAMPLE_BRANCH_STACK.  This is taken into use
by Intel PT in a subsequent patch.

Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:53:44 -03:00
Adrian Hunter
a38f48e300 perf session: Warn when AUX data has been lost
By default 'perf record' will postprocess the perf.data file to
determine build-ids.  When that happens, the number of lost perf events
is displayed.

Make that also happen for AUX events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:51:33 -03:00
Adrian Hunter
116f349c5b perf intel-pt: Make logging slightly more efficient
Logging is only used for debugging. Use macros to save calling into the
functions only to return immediately when logging is not enabled.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:45:26 -03:00
Adrian Hunter
9992c2d50a perf intel-pt: Fix potential loop forever
TSC packets contain only 7 bytes of TSC.  The 8th byte is assumed to
change so infrequently that its value can be inferred.  However the
logic must cater for a 7 byte wraparound, which it does by adding 1 to
the top byte.

The existing code was doing that with a while loop even though the
addition should only need to be done once.  That logic won't work (will
loop forever) if TSC wraps around at the 8th byte.  Theoretically that
would take at least 10 years, unless something else went wrong.

And what else could go wrong.  Well, if the chunks of trace data are
processed out of order, it will make it look like the 7-byte TSC has
gone backwards (i.e. wrapped).  If that happens 256 times then stuck in
the while loop it will be.

Fix that by getting rid of the unnecessary while loop.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 16:44:31 -03:00
Adrian Hunter
e1791347b5 perf auxtrace: Fix 'instructions' period of zero
Instruction tracing options (i.e. --itrace) include an option for
sampling instructions at an arbitrary period. e.g.

	--itrace=i10us

means make an 'instructions' sample for every 10us of trace.

Currently the logic does not distinguish between a period of
zero and no period being specified at all, so it gets treated
as the default period which is 100000.  That doesn't really
make sense.

Fix it so that zero period is accepted and treated as meaning
"as often as possible".

In the case of Intel PT that is the same as a period of 1 and
a unit of 'instructions' (i.e. --itrace=i1i).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-2-git-send-email-adrian.hunter@intel.com
[ Add a few lines describing this in the Documentation/intel-pt.txt file ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 15:50:56 -03:00
Arnaldo Carvalho de Melo
ab9c2bdc89 perf tools: Use __map__is_kernel() when synthesizing kernel module mmap records
Equivalent and removes one more case of using dso->kernel.

  # perf record -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.768 MB perf.data (30 samples) ]

Before:

  [root@zoo ~]# perf script --show-task --show-mmap | head -3
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko
  #

  # perf script --show-task --show-mmap | head -3
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko
   swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b65xe578dwq22mzmmj5y94wr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 15:50:54 -03:00
Ingo Molnar
6afc0c269c Merge branch 'linus' into perf/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28 08:06:57 +02:00
Adrian Hunter
b5cabbcbd1 perf tools: Fix copying of /proc/kcore
A copy of /proc/kcore containing the kernel text can be made to the
buildid cache. e.g.

	perf buildid-cache -v -k /proc/kcore

To workaround objdump limitations, a copy is also made when annotating
against /proc/kcore.

The copying process stops working from libelf about v1.62 onwards (the
problem was found with v1.63).

The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails
because additional validation has been added to gelf_getphdr().

The use of gelf_getphdr() is a misguided attempt to get default
initialization of the Gelf_Phdr structure.  That should not be
necessary because every member of the Gelf_Phdr structure is
subsequently assigned.  So just remove the call to gelf_getphdr().

Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be
removed also.

Committer notes:

Note to stable@kernel.org, from Adrian in the cover letter for this
patchkit:

The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think
it is important enough for stable.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-25 10:45:50 -03:00
Arnaldo Carvalho de Melo
266fa2b222 perf probe: Use existing routine to look for a kernel module by dso->short_name
We have map_groups__find_by_name() to look at the list of modules that
are in place for a given machine, so use it instead of traversing the
machine dso list, which also includes DSOs for userspace.

When merging the user and kernel DSO lists a bug was introduced where
'perf probe' stopped being able to add probes to modules using its short
name:

  # perf probe -m usbnet --add usbnet_start_xmit
  usbnet_start_xmit is out of .text, skip it.
    Error: Failed to add events.
  #

With this fix it works again:

  # perf probe -m usbnet --add usbnet_start_xmit
  Added new event:
    probe:usbnet_start_xmit (on usbnet_start_xmit in usbnet)

  You can now use it in all perf tools, such as:

  	perf record -e probe:usbnet_start_xmit -aR sleep 1
  #

Reported-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 3d39ac5386 ("perf machine: No need to have two DSOs lists")
Link: http://lkml.kernel.org/r/20150924015008.GE1897@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-25 10:41:31 -03:00
Ingo Molnar
968d712a25 perf/core improvements and fixes:
User visible:
 
 - Fix a segfault in 'perf probe' when removing uprobe events (Masami Hiramatsu)
 
 - Synthesize COMM event for workloads started from the command line in 'perf
   record' so that we can have the pid->comm mapping before we get the real
   PERF_RECORD_COMM switching from perf to the workload (Namhyung Kim)
 
 - Fix build tools/vm/ due to removal of tools/lib/api/fs/debugfs.h
   (Arnaldo Carvalho de Melo)
 
 Developer stuff:
 
 - Fix the make tarball targets by including the recently added err.h header in
   the perf MANIFEST file (Jiri Olsa)
 
 - Don't assume that the event parser returns a non empty evlist (Wang Nan)
 
 - Add way to disambiguate feature detection state files, needed to use
   tools/build feature detection for multiple components in a single O= output
   dir, which will be the case with tools/perf/ and tools/lib/bpf/
   (Arnaldo Carvalho de Melo)
 
 - Fixup FEATURE_{TESTS,DISPLAY} inversion in tools/lib/bpf/ (Arnaldo Carvalho de Melo)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWAgaZAAoJENZQFvNTUqpAsZ0P/AuZ92BE6yHPd8VwEYnREAIn
 PGO3rmW7+dy/rE7pDvKYkHkSAu86VVGCB8Kjbn+7mGssBBNnnC21BwdJQRi3c/If
 Upyl4w7jUB6UooXoAAt4tQqSIZV2fq0esN6ss7D2zpblB6ZBZ/TAC9sNWdDq9Yav
 fU+Wks5AMSQDNmj4l9/q3K5v2U5gUBginHyHKvnrOF6fMAN4ZZRAOlB5viisyt+P
 dRDol95SCnhTaOHoM3Eko7uNlHyWUW7HhXN9N5ZaoXeBn/QI2gwAzBz4snFpJPDm
 1Ua9t6Kx4sCNOcuHMVL4Jy2c/kyRSKWyWnb9PBv//m0HKAJyGp39/XIRkRD1ngQ2
 NXESfP4ljhLvUeU6zL5/q6qyNHvpXLxsi1y/tftGGpPXmLbjHiuPq57gdMyvRyOQ
 UbCGQl8aj/08jEPloCeoXk5cGin5iJA0wZg9JkT2kbSlklTGWYVaphy0Nlgn2Ojn
 S/vOVqpSBNlq2afT4lBicZ91QFl3AIfCWUGutbnaDdvZ/inTRfjBJoLJyvXikSSC
 p+FQDOXxbiUH0POc6A0VfYZWJzvXax9XA8H62EA47CQS4GD+CygmUUOD5Gwro0iX
 v9uXWFootRzvgD5ElG6gg7hgZgO1hqOp1xd+STlyGsYuyKIzu7Nk9GqoXf4vYLIV
 JWoEW0N+kypOx4quwne8
 =6DJ/
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Fix a segfault in 'perf probe' when removing uprobe events. (Masami Hiramatsu)

  - Synthesize COMM event for workloads started from the command line in 'perf
    record' so that we can have the pid->comm mapping before we get the real
    PERF_RECORD_COMM switching from perf to the workload. (Namhyung Kim)

  - Fix build tools/vm/ due to removal of tools/lib/api/fs/debugfs.h.
    (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - Fix the make tarball targets by including the recently added err.h header in
    the perf MANIFEST file. (Jiri Olsa)

  - Don't assume that the event parser returns a non empty evlist. (Wang Nan)

  - Add way to disambiguate feature detection state files, needed to use
    tools/build feature detection for multiple components in a single O= output
    dir, which will be the case with tools/perf/ and tools/lib/bpf/.
    (Arnaldo Carvalho de Melo)

  - Fixup FEATURE_{TESTS,DISPLAY} inversion in tools/lib/bpf/. (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:42:58 +02:00
Ingo Molnar
b5727270ec Merge branch 'perf/urgent' into perf/core to pick up fixes before pulling new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:42:11 +02:00
Namhyung Kim
e803cf97a4 perf record: Synthesize COMM event for a command line workload
When perf creates a new child to profile, the events are enabled on
exec().  And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec().  But there's an window
between the enabling and the event generation.

It used to be overcome since samples are only in kernel (so we always
have the map) and the comm is overridden by a later COMM event.
However it won't work if events are processed and displayed before the
COMM event overrides like in 'perf script'.  This leads to those early
samples (like native_write_msr_safe) not having a comm but pid (like
':15328').

So it needs to synthesize COMM event for the child explicitly before
enabling so that it can have a correct comm.  But at this time, the
comm will be "perf" since it's not exec-ed yet.

Committer note:

Before this patch:

  # perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
  # perf script --show-task-events
    :4429  4429 27909.079372:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
    :4429  4429 27909.079375:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
    :4429  4429 27909.079376:         10 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
    :4429  4429 27909.079377:        223 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
    :4429  4429 27909.079378:       6571 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
   usleep  4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
   usleep  4429 27909.079381:     185403 cycles:  ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
   usleep  4429 27909.079444:    2241110 cycles:      7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
   usleep  4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)

After:

  # perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
  # perf script --show-task
     perf     0     0.000000: PERF_RECORD_COMM: perf:8446/8446
     perf  8446 30154.038944:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
     perf  8446 30154.038948:          1 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
     perf  8446 30154.038949:          9 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
     perf  8446 30154.038950:        230 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
     perf  8446 30154.038951:       6772 cycles:  ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
   usleep  8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446
   usleep  8446 30154.038954:     196923 cycles:  ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1
   usleep  8446 30154.039021:    2292130 cycles:      7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so)
   usleep  8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446)
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-22 22:43:12 -03:00
Wang Nan
854f736364 perf tools: Don't assume that the parser returns non empty evsel list
Don't blindly retrieve and use a last element in the lists returned by
parse_events__scanner(), as it may have collected no entries, i.e.
return an empty list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1441523623-152703-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-21 18:01:17 -03:00
Mark Rutland
381c02f6d8 perf record: Avoid infinite loop at buildid processing with no samples
If a session contains no events, we can get stuck in an infinite loop in
__perf_session__process_events, with a non-zero file_size and data_offset, but
a zero data_size.

In this case, we can mmap the entirety of the file (consisting of the file and
attribute headers), and fetch_mmaped_event will correctly refuse to read any
(unmapped and non-existent) event headers. This causes
__perf_session__process_events to unmap the file and retry with the exact same
parameters, getting stuck in an infinite loop.

This has been observed to result in an exit-time hang when counting
rare/unschedulable events with perf record, and can be triggered artificially
with the script below:

  ----
  #!/bin/sh
  printf "REPRO: launching perf\n";
  ./perf record -e software/config=9/ sleep 1 &
  PERF_PID=$!;
  sleep 0.002;
  kill -2 $PERF_PID;
  printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID";
  wait $PERF_PID;
  printf "REPRO: perf exited\n";
  ----

To avoid this, have __perf_session__process_events bail out early when
the file has no data (i.e. it has no events).

Commiter note:

I only managed to reproduce this when setting
/proc/sys/kernel/kptr_restrict to '1' and changing the code to
purposefully not process any samples and no synthesized samples, i.e.
kptr_restrict prevents 'record' from synthesizing the kernel mmaps for
vmlinux + modules and since it is a workload started from perf, we don't
synthesize mmap/comm records for existing threads.

Adrian Hunter managed to reproduce it in his environment tho.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1442423929-12253-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-18 12:31:40 -03:00
Ingo Molnar
02386c356a Merge branch 'perf/urgent' into perf/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-18 09:24:01 +02:00
Peter Senna Tschudin
bf6445631c perf tools: Bool functions shouldn't return -1
Returning a negative value for a boolean function seem to have the
undesired effect of returning true. Replace -1 by false in a
bool-returning function.

The diff of the .s file before and after the change (for x86_64):

  3907c3907
  < 	movl	$1, %ebx
  ---
  > 	xorl	%ebx, %ebx

while if -1 is replaced by true, the diff is empty.

This issue was found by the following Coccinelle semantic patch:

  <smpl>
  @@
  identifier f;
  constant C;
  typedef bool;
  @@
  bool f (...){
  <+...
  * return -C;
  ...+>
  }
  </smpl>

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Milos Vyletel <milos@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1442484533-19742-1-git-send-email-peter.senna@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-17 15:31:52 -03:00
Arnaldo Carvalho de Melo
179f36dde3 Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum"
This reverts commit f785f23576.

We have a test to check if elf_getphdrnum() is present, so, if it fails,
we'll get:

  [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output
  cc1: warnings being treated as errors
  test-libelf-getphdrnum.c: In function ‘main’:
  test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’
  [acme@rhel5 linux]$

And this block will not be compiled:

  #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
  static int elf_getphdrnum(Elf *elf, size_t *dst)
  ...
  #endif

So, if elf_getphdrnum() is being defined somewhere, there is a problem
with the test that is not detecting that function, go fix it.

Reported-by: Vinson Lee <vlee@twopensource.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-17 13:10:05 -03:00
Stephane Eranian
02d8dabc50 perf stat: Fix per-pkg event reporting bug
Per-pkg events need to be captured once per processor socket. The code
in check_per_pkg() ensures only one value per processor package is used.
However there is a problem with this function in case the first CPU of
the package does not measure anything for the per-pkg event, but other
CPUs do.

Consider the following:

  $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop &
  $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100
    1.00000 <not counted> Bytes intel_cqm/llc_occupancy/  FOO

The reason for this is that CPU0 in the cgroup has nothing running on it.
Yet check_per_plg() will mark socket0 as processed and no other event
value will be considered for the socket.

This patch fixes the problem by having check_per_pkg() only consider
events which actually ran.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-16 18:01:03 -03:00
Ingo Molnar
d71b0ad8d3 Merge branch 'perf/urgent' into perf/core, to resolve a conflict
Conflicts:
	tools/perf/ui/browsers/hists.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-16 09:19:56 +02:00
Adrian Hunter
8c0498b689 perf evlist: Fix create_syswide_maps() not propagating maps
Fix it by making it call perf_evlist__set_maps() instead of setting the
maps itself.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 11:03:22 -03:00
Adrian Hunter
44c42d71c6 perf evlist: Fix add() not propagating maps
If evsels are added after maps are created, then they won't have any
maps propagated to them.  Fix that.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-12-git-send-email-adrian.hunter@intel.com
[ Moved the moving of propagate_maps() to the patch before, so that this
  one does _just_ the one lile fix calling in add()]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 11:01:25 -03:00
Adrian Hunter
adc0c3e87b perf evlist: Factor out a function to propagate maps for a single evsel
Subsequent fixes will need a function that just propagates maps for a
single evsel so factor it out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-11-git-send-email-adrian.hunter@intel.com
[ Moved them to before perf_evlist__add() to avoid having to move it in the next patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:54:04 -03:00
Adrian Hunter
74bfd2b25d perf evlist: Make create_maps() use set_maps()
Since there is a function to set maps, perf_evlist__create_maps() should
use it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:45:47 -03:00
Adrian Hunter
934e0f2053 perf evlist: Make set_maps() more resilient
Make perf_evlist__set_maps() more resilient by allowing for the
possibility that one or another of the maps isn't being changed and
therefore should not be "put".

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:44:22 -03:00
Adrian Hunter
fce4d296b4 perf evsel: Add own_cpus member
perf_evlist__propagate_maps() cannot easily tell if an evsel has its own
cpu map.  To make that simpler, keep a copy of the PMU cpu map and
adjust the propagation logic accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:41:13 -03:00
Adrian Hunter
b278c364b3 perf evlist: Fix missing thread_map__put in propagate_maps()
perf_evlist__propagate_maps() incorrectly assumes evsel->threads is NULL
before reassigning it, but it won't be NULL when perf_evlist__set_maps()
is used to set different (or NULL) maps.  Thus thread_map__put must be
used, which works even if evsel->threads is NULL.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:24:30 -03:00
Adrian Hunter
f114d6eff7 perf evlist: Fix splice_list_tail() not setting evlist
Commit d49e469507 ("perf evsel: Add a backpointer to the evlist a
evsel is in") updated perf_evlist__add() but not
perf_evlist__splice_list_tail().

This illustrates that it is better if perf_evlist__splice_list_tail()
calls perf_evlist__add() instead of duplicating the logic, so do that.
This will also simplify a subsequent fix for propagating maps.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:23:17 -03:00
Adrian Hunter
ec9a77a7e3 perf evlist: Add has_user_cpus member
Subsequent patches will need to call perf_evlist__propagate_maps without
reference to a "target".  Add evlist->has_user_cpus to record whether
the user has specified which cpus to target (and therefore whether that
list of cpus should override the default settings for a selected event
i.e. the cpu maps should be propagated)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:20:50 -03:00
Adrian Hunter
d5bc056e73 perf evlist: Remove redundant validation from propagate_maps()
The validation checks that the values that were just assigned, got
assigned i.e. the error can't ever happen.  Subsequent patches will call
this code in places where errors are not being returned.  Changing those
code paths to return this non-existent error is counter-productive, so
just remove it.

That in turn results in perf_evlist__set_maps not needing to return an
error, but callers aren't checking it either, so remove that too.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:16:48 -03:00
Adrian Hunter
725e06b2e2 perf evlist: Simplify set_maps() logic
Don't need to check for NULL when "putting" evlist->maps and
evlist->threads because the "put" functions already do that.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:15:39 -03:00
Adrian Hunter
a69b09e234 perf evlist: Simplify propagate_maps() logic
If evsel->cpus is to be reassigned then the current value must be "put",
which works even if it is NULL.  Simplify the current logic by moving
the "put" next to the assignment.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1441699142-18905-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 10:08:22 -03:00
Wang Nan
63ab024a5b perf tools: regs_query_register_offset() infrastructure
regs_query_register_offset() is a helper function which converts
register name like "%rax" to offset of a register in 'struct pt_regs',
which is required by BPF prologue generator.

PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET indicates an architecture
supports converting name of a register to its offset in 'struct
pt_regs'.

HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET is introduced as the corresponding
CFLAGS of PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1441523623-152703-19-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Extracted from eBPF patches ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:33 -03:00
Jiri Olsa
196581717d perf tools: Enhance parsing events tracepoint error output
Enhancing parsing events tracepoint error output. Adding
more verbose output when the tracepoint is not found or
the tracing event path cannot be access.

  $ sudo perf record -e sched:sched_krava ls
  event syntax error: 'sched:sched_krava'
                       \___ unknown tracepoint

  Error:  File /sys/kernel/debug/tracing//tracing/events/sched/sched_krava not found.
  Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.

  Run 'perf list' for a list of valid events
  ...

  $ perf record -e sched:sched_krava ls
  event syntax error: 'sched:sched_krava'
                       \___ can't access trace events

  Error:  No permissions to read /sys/kernel/debug/tracing//tracing/events/sched/sched_krava
  Hint:   Try 'sudo mount -o remount,mode=755 /sys/kernel/debug'

  Run 'perf list' for a list of valid events
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1441615087-13886-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:33 -03:00
Jiri Olsa
8dd2a1317e perf evsel: Propagate error info from tp_format
Propagate error info from tp_format via ERR_PTR to get it all the way
down to the parse-event.c tracepoint adding routines. Following
functions now return pointer with encoded error:

  - tp_format
  - trace_event__tp_format
  - perf_evsel__newtp_idx
  - perf_evsel__newtp

This affects several other places in perf, that cannot use pointer check
anymore, but must utilize the err.h interface, when getting error
information from above functions list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
Link: http://lkml.kernel.org/r/1441615087-13886-5-git-send-email-jolsa@kernel.org
[ Add two missing ERR_PTR() and one IS_ERR() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:33 -03:00
Jiri Olsa
e2f9f8ea6a perf tools: Propagate error info for the tracepoint parsing
Pass 'struct parse_events_error *error' to the parse-event.c tracepoint
adding path. It will be filled with error data in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Raphael Beamonte <raphael.beamonte@gmail.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1441615087-13886-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:32 -03:00
Namhyung Kim
9bae1e8c3f perf probe: Export init/exit_probe_symbol_maps()
The init/exit_symbols_maps() functions are to setup and cleanup
necessary info for probe events.  But they need to be called from out of
the probe code now, so this patch exports them.

However the names are too generic, so change them to have 'probe'. :)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1441852026-28974-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:32 -03:00
Namhyung Kim
a43aac299c perf probe: Free perf_probe_event in cleanup_perf_probe_events()
The cleanup_perf_probe_events() frees all resources related to a perf
probe event.  However it only freed resources in trace probe events, not
perf probe events.  So call clear_perf_probe_event() too.

Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1441852026-28974-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-15 09:48:32 -03:00
Kan Liang
84734b06b6 perf hists browser: Zoom in/out for processor socket
Currently, users can zoom in/out for threads and dso in 'perf top' and
'perf report'.

This patch extends it for the processor sockets.

'S' is the short key to zoom into current Processor Socket.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441377946-44429-4-git-send-email-kan.liang@intel.com
[ - Made it elide the Socket column when zooming into it,
    just like with the other zoom ops;
  - Make it use browser->pstack, to unzoom level by level;
  - Rename 'socket' variables to 'socket_id' to make it build on
    older systems where it shadows a global glibc declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 13:02:08 -03:00
Kan Liang
21394d948a perf report: Introduce --socket-filter option
Introduce --socket-filter option for 'perf report' to only show entries
for a processor socket that match this filter.

  $ perf report --socket-filter 1 --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 752  of event 'cycles'
  # Event count (approx.): 350995599
  # Processor Socket: 1
  #
  # Overhead  Command    Shared Object     Symbol
  # ........  .........  ................  .................................
  #
      97.02%  test       test              [.] plusB_c
       0.97%  test       test              [.] plusA_c
       0.23%  swapper    [kernel.vmlinux]  [k] acpi_idle_do_entry
       0.09%  rcu_sched  [kernel.vmlinux]  [k] dyntick_save_progress_counter
       0.01%  swapper    [kernel.vmlinux]  [k] task_waking_fair
       0.00%  swapper    [kernel.vmlinux]  [k] run_timer_softirq

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441377946-44429-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:31 -03:00
Kan Liang
2e7ea3ab82 perf tools: Introduce new sort type "socket" for the processor socket
This patch enable perf report to sort by processor socket:

  $ perf report --stdio --sort socket,comm,dso,symbol
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 686  of event 'cycles'
  # Event count (approx.): 349215462
  #
  # Overhead SOCKET Command Shared Object    Symbol
  # ........ ...... ....... ................ ............................
  #
    97.05%    000   test    test             [.] plusB_c
     0.98%    000   test    test             [.] plusA_c
     0.93%    001   perf    [kernel.vmlinux] [k] smp_call_function_single
     0.19%    001   perf    [kernel.vmlinux] [k] page_fault
     0.19%    001   swapper [kernel.vmlinux] [k] pm_qos_request
     0.16%    000   test    [kernel.vmlinux] [k] add_mm_counter_fast

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1441377946-44429-2-git-send-email-kan.liang@intel.com
[ Fix col calc, un-allcapsify col header & read the topology when not using perf.data ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:30 -03:00
Kan Liang
0c4c4debb0 perf tools: Add processor socket info to hist_entry and addr_location
This information will come from perf.data files of from the current
system, cached when needed, such as when the 'socket' sort order gets
introduced.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1441377946-44429-1-git-send-email-kan.liang@intel.com
[ Don't blindly use env->cpu[al.cpu].socket_id & use machine->env, fixes by Jiri & Arnaldo ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:29 -03:00
Arnaldo Carvalho de Melo
4cde998d20 perf machine: Add pointer to sample's environment
The 'struct machine' represents the machine where the samples were/are
being collected, and we also have a 'struct perf_env' with extra details
about such machine, that we were collecting at 'perf.data' creation time
but we also needed when no perf.data file is being used, such as in
'perf top'.

So, get those structs closer together, as they provide a bigger picture
of the sample's environment.

In 'perf session', when the file argument is NULL, we can assume that
the tool is sampling the running machine, so point machine->env to
the global put in place in previous patches, while set it to the
perf_header.env one when reading from a file.

This paves the way for machine->env to be used in
perf_event__preprocess_sample to populate addr_location.socket.

Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2ajotl0khscutm68exictoy9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:29 -03:00
Arnaldo Carvalho de Melo
aa36ddd7af perf env: Introduce read_cpu_topology_map() method
Out of the code to write the cpu topology map in the perf.data file
header.

Now if one needs the CPU topology map for the running machine, one needs
to call perf_env__read_cpu_topology_map(perf_env) and the info will be
stored in perf_env.cpu.

For now we're using a global perf_env variable, that will have its
contents freed after we run a builtin.

v2: Check perf_env__read_cpu_topology_map() return in
    write_cpu_topology() (Kan Liang)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1441828225-667-5-git-send-email-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:28 -03:00
Arnaldo Carvalho de Melo
5d8cf721cb perf cpu_map: Use sysfs__read_int in get_{core,socket}_id()
We have the tools/lib/ sysfs__read_int() for that, avoid code
duplication.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fqg6vt5ku72pbf54ljg6tmoy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-14 12:50:27 -03:00