When no event is specified perf will use the "cycles" hardware event
with the highest precision available in the processor, and excluding
kernel events for non-root users, so make that clear in the event name
by setting the "u" event modifier, i.e. "cycles:upp".
E.g.:
The default for root:
# perf record usleep 1
# perf evlist -v
cycles:ppp: ..., precise_ip: 3, exclude_kernel: 0, ...
#
And for !root:
$ perf record usleep 1
$ perf evlist -v
cycles:uppp: ... , precise_ip: 3, exclude_kernel: 1, ...
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lf29zcdl422i9knrgde0uwy3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To allow probing the max attr.precise_ip setting for non-root users
we unconditionally set attr.exclude_kernel, which makes the detection
work but should be done only for !root, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 97365e8136 ("perf evsel: Set attr.exclude_kernel when probing max attr.precise_ip")
Link: http://lkml.kernel.org/n/tip-bl6bbxzxloonzvm4nvt7oqgj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- Fix max attr.precise_ip probing to make perf use the best cycles:p
available in the processor for non root users (Arnaldo Carvalho de Melo)
- Fix processing of MMAP events for 32-bit binaries on 64-bit systems
when unwind support is not fully integrated, fixing DSO and symbol
resolution (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZW+uQAAoJENZQFvNTUqpARHYP/jzthQY9jH6BtLwntItNc8EY
7zweMpPbdRxXqILfkeEpGiGsH13j8+ZdUlg7Q07yuS2hSJJLYn3WPLb5kfVjDmKP
IeSkJSsGi448RCWr9yQiOIeVbU07vCb3fdcDK6E485n5gU6jjlfvVF9odt1Yg+PY
jNs6XSZDbi5GGMA2CpqHYFjRCQbst7ucF6MLlEF3CK6qY9TtiM1UrWwEoJgiKoab
rnwP9pe2om1KKbBaKE8jSS42yw1KJkvrE6To3cD0HwrnWBufWDZmwrF4Ba5UsClK
2q3uMJMOMzFOZaWAw1NkW5JfSb0iBxOMnFZXgng+zKsubHlkAkY7j+GhgV6ndSXX
viuBR6fFhC37xwHSzgw2z0LIwj8VMmsppZWrgB+El9PUiRJ3qIkooXVPaSV+Eet4
4XfIg4m1L1hlzp5OskV4H6Rh5cqp2g0mmr0iDMSJZVfMC4sCNBSQOT7l7Sks0EWV
4MeLdFcwIXFYjecu/MZ3H+EI2OIi4/KmuPgwyQmVW09tI00qXIQGFHnq9Q4InneU
jAwBxn2qNerXM2DS0Zw55rUWakjYQH2wq6MXFPTmyZXlibiFpbBhQjE9SyafoURN
bC2ago2QXny12WZEwPp3PAYJb6VWhL3W/z00vyawdso1iQJU2QLU4N54+qPq/iXn
4ng+dI3ZeaZQXAT8X6KS
=VehJ
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-4.12-20170704' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Fix max attr.precise_ip probing to make perf use the best cycles:p
available in the processor for non root users (Arnaldo Carvalho de Melo)
- Fix processing of MMAP events for 32-bit binaries on 64-bit systems
when unwind support is not fully integrated, fixing DSO and symbol
resolution (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We currently fail the MMAP event processing if we don't have the MMAP
event's specific arch unwind support compiled in.
That's wrong and can lead to unresolved mmaps in report output for 32bit
binaries on 64bit server, like in this example on x86_64 server:
$ cat ex.c
int main(int argc, char **argv)
{
while (1) {}
}
$ gcc -o ex -m32 ex.c
$ perf record ./ex
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.371 MB perf.data (9322 samples) ]
Before:
$ perf report --stdio
SNIP
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
100.00% ex [unknown] [.] 0x00000000080483de
0.00% ex [unknown] [.] 0x00000000f76dba4f
0.00% ex [unknown] [.] 0x00000000f76e4c11
0.00% ex [unknown] [.] 0x00000000f76daa30
After:
$ perf report --stdio
SNIP
# Overhead Command Shared Object Symbol
# ........ ....... ............. ...............
#
100.00% ex ex [.] main
0.00% ex ld-2.24.so [.] _dl_start
0.00% ex ld-2.24.so [.] do_lookup_x
0.00% ex ld-2.24.so [.] _start
The fix is not to fail, just warn if there's not unwind support compiled
in.
Reported-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170704131131.27508-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should set attr.exclude_kernel when probing for attr.precise_ip
level, otherwise !CAP_SYS_ADMIN users will not default to skidless
samples in capable hardware.
The increase in the paranoid level in commit 0161028b7c ("perf/core:
Change the default paranoia level to 2") broke this, fix it by excluding
kernel samples when probing.
Before:
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (6 samples) ]
$ perf evlist -v
cycles:u: sample_freq: 4000, sample_type: IP|TID|TIME|PERIOD, exclude_kernel: 1
After:
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
$ perf evlist -v
cycles:ppp: sample_freq: 4000, sample_type: IP|TID|TIME|PERIOD, exclude_kernel: 1, precise_ip: 3
^^^^^^^^^^^^^
^^^^^^^^^^^^^
^^^^^^^^^^^^^
$
To further clarify: we always set .exclude_kernel when non !CAP_SYS_ADMIN
users profile, its just on the attr.precise_ip probing that we weren't doing
so, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 7f8d1ade1b ("perf tools: By default use the most precise "cycles" hw counter available")
Link: http://lkml.kernel.org/n/tip-t2qttwhbnua62o5gt75cueml@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Ingo Molnar:
"Most of the changes are for tooling, the main changes in this cycle were:
- Improve Intel-PT hardware tracing support, both on the kernel and
on the tooling side: PTWRITE instruction support, power events for
C-state tracing, etc. (Adrian Hunter)
- Add support to measure SMI cost to the x86 architecture, with
tooling support in 'perf stat' (Kan Liang)
- Support function filtering in 'perf ftrace', plus related
improvements (Namhyung Kim)
- Allow adding and removing fields to the default 'perf script'
columns, using + or - as field prefixes to do so (Andi Kleen)
- Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso'
(Mark Santaniello)
- Add perf tooling unwind support for PowerPC (Paolo Bonzini)
- ... and various other improvements as well"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits)
perf auxtrace: Add CPU filter support
perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC
perf intel-pt: Update documentation to include new ptwrite and power events
perf intel-pt: Add example script for power events and PTWRITE
perf intel-pt: Synthesize new power and "ptwrite" events
perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting
perf intel-pt: Factor out intel_pt_set_event_name()
perf intel-pt: Tidy messages into called function intel_pt_synth_event()
perf intel-pt: Tidy Intel PT evsel lookup into separate function
perf intel-pt: Join needlessly wrapped lines
perf intel-pt: Remove unused instructions_sample_period
perf intel-pt: Factor out common code synthesizing event samples
perf script: Add synthesized Intel PT power and ptwrite events
perf/x86/intel: Constify the 'lbr_desc[]' array and make a function static
perf script: Add 'synth' field for synthesized event payloads
perf auxtrace: Add itrace option to output power events
perf auxtrace: Add itrace option to output ptwrite events
tools include: Add byte-swapping macros to kernel.h
perf script: Add 'synth' event type for synthesized events
x86/insn: perf tools: Add new ptwrite instruction
...
Decoding auxtrace data can take a long time. To avoid decoding
unnecessarily, filter auxtrace data that is collected per-cpu before it is
decoded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-38-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
CBR (core-to-bus ratio) packets provide an indication of CPU frequency. A
more accurate measure can be made by counting the cycles (given by CYC
packets) in between other timing packets (either MTC or TSC). Using TSC
packets has at least 2 issues: 1) timing might have stopped (e.g. mwait) or
2) TSC packets within PSB+ might slip past CYC packets. For now, simply do
not use TSC packets for calculating CPU cycles to TSC. That leaves the case
where 2 MTC packets are used, otherwise falling back to the CBR value.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-37-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize new power and ptwrite events.
Power events report changes to C-state but I have also added support
for the existing CBR (core-to-bus ratio) packet and included that
when outputting power events.
The PTWRITE packet is associated with the new "ptwrite" instruction,
which is essentially just a way to stuff a 32 or 64 bit value into the
PT trace.
More details can be found in the patches that add documentation and in
the Intel SDM.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1498811805-2335-1-git-send-email-adrian.hunter@intel.com
[ Copy the description of such packet from the patchkit cover message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
intel_pt_synth_events() uses the same attr structure to create each event.
Move the code around a bit to simplify that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-33-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out common code in functions synthesizing event samples i.e.
intel_pt_synth_branch_sample(), intel_pt_synth_instruction_sample() and
intel_pt_synth_transaction_sample().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-27-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instruction trace decoders such as Intel PT may have additional information
recorded in the trace. For example, Intel PT has power information and a
there is a new instruction 'ptwrite' that can write a value into a PTWRITE
trace packet.
Such information may be associated with an IP and so can be treated as a
sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the
sample as raw_data (PERF_SAMPLE_RAW).
However a means of identifying the raw data format is needed. That will
be done by synthesizing an attribute for it.
So add an attribute type for custom synthesized events. Different
synthesized events will be identified by the attribute 'config'.
Committer notes:
Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U),
i.e. after perf_pmu_register() -> idr_alloc(end=0).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1498040239-32418-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add ptwrite to the op code map and the perf tools new instructions test.
To run the test:
$ tools/perf/perf test "x86 ins"
39: Test x86 instruction decoder - new instructions : Ok
Or to see the details:
$ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite
For information about ptwrite, refer the Intel SDM.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Finally can nuke this function, no more users.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eivvvzn8ie6w42gy3batxoy7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just warn the user and ignore those values.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tbf60nj3ierm6hrkhpothymx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To consolidate the error reporting facility.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b41iot1094katoffdf19w9zk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now everything uses pr_warning(), so ditch it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-hv8r0mgdhk73wtfq3zrhavgx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Convert sole user of warning() in this file to pr_warning(),
consolidating error reporting facilities.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3y7yf6v673ujl2rcs34tzv8n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
warning() is going away, consolidating error reporting.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5r3636cwl4z1varo90mervai@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michael reported the segfault when kernel.kptr_restrict=2 is set.
$ perf record ls
...
perf: Segmentation fault
Obtained 16 stack frames.
./perf(dump_stack+0x2d) [0x5068df]
./perf(sighandler_dump_stack+0x2d) [0x5069bf]
./perf() [0x43e47b]
/lib64/libc.so.6(+0x3594f) [0x7f762004794f]
/lib64/libc.so.6(strlen+0x26) [0x7f762009ef86]
/lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd]
./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f]
./perf(machine__create_kernel_maps+0x136) [0x50a7de]
./perf(perf_session__create_kernel_maps+0x2c) [0x510a81]
./perf(perf_session__new+0x13d) [0x510e23]
./perf() [0x43fd61]
./perf(cmd_record+0x704) [0x441823]
./perf() [0x4bc1a0]
./perf() [0x4bc40d]
./perf() [0x4bc55f]
./perf(main+0x2d5) [0x4bc939]
Segmentation fault (core dumped)
The reason is that with kernel.kptr_restrict=2, we don't get
the symbol from machine__get_running_kernel_start, which we
want to use in maps__set_kallsyms_ref_reloc_sym and we crash.
Check the symbol name value before calling
maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym
being set. It's safe because we check its existence before we use it.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170626095153.553-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In commit 613f050d68 ("perf probe: Fix to probe on gcc generated
functions in modules"), the offset from symbol is, incorrectly, added
to the trace point address. This leads to incorrect probe trace points
for inlined functions and when using relative line number on symbols.
Prior this patch:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626
After:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665
Committer testing:
Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are
the functions that while not being explicitely marked as inline, were inlined
by the compiler:
# pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head
__ew32
e1000_regdump
e1000e_dump_ps_pages
e1000_desc_unused
e1000e_systim_to_hwtstamp
e1000e_rx_hwtstamp
e1000e_update_rdt_wa
e1000e_update_tdt_wa
e1000_put_txbuf
e1000_consume_page
Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of
them:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in
e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for
that function:
$ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
<SNIP>
<1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram)
<13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq
<13e287> DW_AT_low_pc : 0x17a30
<3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e6f0> DW_AT_abstract_origin: <0x13ed2c>
<13e6f4> DW_AT_low_pc : 0x17be6
<SNIP>
<1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram)
<13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page
So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is
inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s
address, gives us the offset we should use in the probe definition:
0x17be6 - 0x17a30 = 438
but above we have 876, which is twice as much.
Lets see the second inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e86f> DW_AT_abstract_origin: <0x13ed2c>
<13e873> DW_AT_low_pc : 0x17d21
0x17d21 - 0x17a30 = 753
So we where adding it at twice the offset from the containing function as we
should.
And then after this patch:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753
p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353
#
Which matches the two first expansions and shows that because we were
doubling the offset it would spill over the next function:
readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq
674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps
This is the 3rd inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13ec78> DW_AT_abstract_origin: <0x13ed2c>
<13ec7c> DW_AT_low_pc : 0x17f79
0x17f79 - 0x17a30 = 1353
So:
0x17a30 + 2 * 1353 = 0x184c2
And:
0x184c2 - 0x18090 = 1074
Which explains the bogus third expansion for e1000_consume_page() to end up at:
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
All fixed now :-)
[1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 613f050d68 ("perf probe: Fix to probe on gcc generated functions in modules")
Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'transactions_sample_type' is needed to correctly inject transactions
samples but it was not being set. Set it from the event sample type.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'initial_skip' is checked inside the sample synthesis functions which means
it is actually being done twice for 'instructions' and 'transactions'
samples. Remove the redundant checks.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Future proof CBR packet decoding by passing through also the undefined
'reserved' byte in the packet payload.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This
patch only affects the decoder, so the tools still do not select or consume
the new information. That is added in subsequent patches.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel now supports the disabling of branch tracing, however the
decoder assumes branch tracing is always enabled. Pass through a parameter
to indicate whether branch tracing is enabled and use it to avoid cases
when the decoder is expecting branch packets. There are 2 such cases.
First, FUP packets which can bind to an IP even when there is no branch
tracing. Secondly, the decoder will try to use branch packets to find an IP
to start decoding or to recover from errors.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes a FUP packet is associated with a TSX transaction and a flag is
set to indicate that. Ensure that flag is cleared on any error condition
because at that point the decoder can no longer assume it is correct.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decoder will try to use branch packets to find an IP to start decoding
or to recover from errors. Currently the FUP packet is used only in the
case of an overflow, however there is no reason for that to be a special
case. So just use FUP always when scanning for an IP.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses IP compression based on the last IP. For decoding purposes,
'last IP' is not updated when a branch target has been suppressed, which is
indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
ensure never to set 'last_ip' when packet 'count' is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses IP compression based on the last IP. For decoding
purposes, 'last IP' is considered to be reset to zero whenever there is
a synchronization packet (PSB). The decoder wasn't doing that, and was
treating the zero value to mean that there was no last IP, whereas
compression can be done against the zero value. Fix by setting last_ip
to zero when a PSB is received and keep track of have_last_ip.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached. Improve that situation by using the pkt_state to
determine when to use the current or previous timestamp.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implementing a new --smi-cost mode in perf stat to measure SMI cost.
During the measurement, the /sys/device/cpu/freeze_on_smi will be set.
The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).
In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.
SMI cycles% = (aperf - unhalted core cycles) / aperf
Here is an example output.
Performance counter stats for 'sudo echo ':
SMI cycles% SMI#
0.1% 1
0.010858678 seconds time elapsed
Users who wants to get the actual value can apply additional
--no-metric-only.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Curious as to what this was for I looked at /usr/include/ and only some
python headers define this, and it ends up being to enable "extensions"
on some old OSes:
/* Enable extensions on AIX 3, Interix */
I guess we can remove this one safely.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While trying to reduce util.[ch] I noticed that fetch_kernel_version()
and fetch_ubuntu_kernel_version() do lots of operations only to check if
they are needed, i.e. it checks if the pointer where to return the
kernel version is NULL only after obtaining the kernel version from
/proc/version_signature or by parsing the results from uname().
Do it earlier not to confuse people reading this code in the future :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And make it static, nobody else uses it, if we ever need it in more
places we can carve a new source file for process related methods,
for now lets reduce util.{c,h} a tad more.
Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In annotate browser, we will add support to check fused instructions.
While this is x86-specific feature so we need the annotate browser to
know what the arch it runs on.
symbol__disassemble() has figured out the arch. This patch just lets the
arch return from symbol__disassemble and save the arch in annotate
browser.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These defines were probably dragged in from sampling support in earlier
patches. They can be put back when needed.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170616112339.3fb6986e4ff33e353008244b@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to use a specific
alignment, making tools/ look more like kernel source code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8jiem6ubg9rlpbs7c2p900no@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to not insert alignment
paddings in a struct, making tools/ look more like kernel source code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-byp46nr7hsxvvyc9oupfb40q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of defining __unused or redefining __maybe_unused.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4eleto5pih31jw1q4dypm9pf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to perform scanf like
argument validation.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yzqrhfjrn26lqqtwf55egg0h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to ask the compiler to perform printf like
vargargs validation.
v2: Fixed up build on arm, squashing a patch by Kim Phillips, thanks!
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dopkqmmuqs04cxzql0024nnu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To have a more compact way to specify that a function doesn't return,
instead of the open coded:
__attribute__((noreturn))
And use it instead of the tools/perf/ specific variation, NORETURN.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-l0y144qzixcy5t4c6i7pdiqj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With commit: 0a943cb10c (tools build: Add HOSTARCH Makefile variable)
when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
ARCH=x86, so the perf build process searchs header files from
tools/arch/x86_64/include, which doesn't exist.
The following build failure is seen:
In file included from util/event.c:2:0:
tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
compilation terminated.
Fix this issue by using SRCARCH instead of ARCH in perf, just like the
main kernel Makefile and tools/objtool's.
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Rui Teng <rui.teng@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0a943cb10c ("tools build: Add HOSTARCH Makefile variable")
Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 18e7a45af9 ("perf/x86: Reject non sampling events with
precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute
with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done
in the routine used to probe the max precise level when no events were
passed to 'perf record' or 'perf top', i.e.:
perf_evsel__new_cycles()
perf_event_attr__set_max_precise_ip()
The x86 code, in x86_pmu_hw_config(), which is called all the way from
sys_perf_event_open() did, starting with the aforementioned commit:
/* There's no sense in having PEBS for non sampling events: */
if (!is_sampling_event(event))
return -EINVAL;
Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
just the non precise cycles variant.
To make sure that this is the case, I tested it, before this patch,
with:
# perf probe -L x86_pmu_hw_config
<x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0>
0 int x86_pmu_hw_config(struct perf_event *event)
1 {
2 if (event->attr.precise_ip) {
<SNIP>
17 if (event->attr.precise_ip > precise)
18 return -EOPNOTSUPP;
/* There's no sense in having PEBS for non sampling events: */
21 if (!is_sampling_event(event))
22 return -EINVAL;
}
<SNIP>
# perf probe x86_pmu_hw_config:22
Added new events:
probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)
You can now use it in all perf tools, such as:
perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1
# perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
#
I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.
So fix it by just setting attr.sample_period
Now, after this patch:
# perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
[ perf record: Woken up 1 times to write data ]
0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_open_cloexec_flag (/home/acme/bin/perf)
0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
#
The probe one called from perf_event_attr__set_max_precise_ip() works
the first time, with attr.precise_ip = 3, wit hthe next ones being the
per cpu ones for the cycles:ppp event.
And here is the text from a report and alternative proposed patch by
Thomas-Mich Richter:
---
On s390 the counter and sampling facility do not support a precise IP
skid level and sometimes returns EOPNOTSUPP when structure member
precise_ip in struct perf_event_attr is not set to zero.
On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This
happens only when no events are specified on command line.
The functions called are
...
--> perf_evlist__add_default
--> perf_evsel__new_cycles
--> perf_event_attr__set_max_precise_ip
The last function determines the value of structure member precise_ip by
invoking the perf_event_open() system call and checking the return code.
The first successful open is the value for precise_ip.
However the value is determined without setting member sample_period and
indicates no sampling.
On s390 the counter facility and sampling facility are different. The
above procedure determines a precise_ip value of 3 using the counter
facility. Later it uses the sampling facility with a value of 3 and
fails with EOPNOTSUPP.
---
v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
of unnamed union members in the container struct initialization, so
move from:
struct perf_event_attr attr = {
...
.sample_period = 1,
};
to right after it as:
struct perf_event_attr attr = {
...
};
attr.sample_period = 1;
v3: We need to reset .sample_period to 0 to let the users of
perf_evsel__new_cycles() to properly setup attr.sample_period or
attr.sample_freq. Reported by Ingo Molnar.
Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 18e7a45af9 ("perf/x86: Reject non sampling events with precise_ip")
Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit e7ee404757 ("perf symbols: Fix symbols searching for module
in buildid-cache") added the function to check kernel modules reside in
the build-id cache. This was because there's no way to identify a DSO
which is actually a kernel module. So it searched linkname of the file
and find ".ko" suffix.
But this does not work for compressed kernel modules and now such DSOs
hCcave correct symtab_type now. So no need to check it anymore. This
patch essentially reverts the commit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The symsrc__init() overwrites dso->symtab_type as symsrc->type in
dso__load_sym(). But for compressed kernel modules in the build-id
cache, it should have original symtab type to be decompressed as needed.
This fixes perf annotate to show disassembly of the function properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On failure, it should free the 'name', so clean up the error path using
goto.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf decompresses kernel modules when loading the symbol table
but it missed to do it when reading raw data.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Convert open-coded decompress routine to use the function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move decompress_kmodule() to util/dso.c and split it into two functions
returning fd and (decompressed) file path. The existing user only wants
the fd version but the path version will be used soon.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'name' variable should be freed on the error path.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 6ebd2547dd ("perf annotate: Fix a bug following symbolic
link of a build-id file") changed to use dirname to follow the symlink.
But it only considers new-style build-id cache names so old names fail
on readlink() and force to use system path which might not available.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Fixes: 6ebd2547dd ("perf annotate: Fix a bug following symbolic link of a build-id file")
Link: http://lkml.kernel.org/r/20170608073109.30699-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Script generated by the '--gen-script' option contains an outdated
comment. It mentions a 'perf-trace-python' document while it has been
renamed to 'perf-script-python'. Fix it.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
Link: http://lkml.kernel.org/r/20170530111827.21732-2-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~
But with libdw and without this patch this sample is not properly
unwound:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~
Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decompress_kmodule() decompresses kernel modules in order to load
symbols from it. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
the full file path to extract the file extension to determine the
decompression method. But overwriting 'name' will fail the
decompression since it might point to a non-existing old file.
Instead, use dso->long_name for having the correct extension and use the
real filename to decompress.
In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
be the same. This allows resolving symbols in the old modules.
Before:
$ perf report -i perf.data.old | grep scsi_mod
0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6
0.00% as [scsi_mod] [k] 0x00000000000099e1
0.00% cc1 [scsi_mod] [k] 0x0000000000009830
0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f
After:
0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up
0.00% as [scsi_mod] [k] scsi_sg_alloc
0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd
0.00% cc1 [scsi_mod] [k] scsi_get_command
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like machine__findnew_module_dso(), it should set necessary info for
kernel modules to find symbol info from the file. Factor out
dso__set_module_info() to do it.
This is needed for dso__needs_decompress() to detect such DSOs.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf processes build-id event, it creates DSOs with the build-id.
But it didn't set the module short name (like '[module-name]') so when
processing a kernel mmap event of the module, it cannot found the DSO as
it only checks the short names.
That leads for perf to create a same DSO without the build-id info and
it'll lookup the system path even if the DSO is already in the build-id
cache. After kernel was updated, perf cannot find the DSO and cannot
show symbols in it anymore.
You can see this if you have an old data file (w/ old kernel version):
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
...
The second message didn't show the build-id. With this patch:
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
/lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
...
Now it shows the build-id but still cannot load the symbol table. This
is a different problem which will be fixed in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
[ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When filename contains special chars, perf annotate fails
with an error:
$ perf annotate --vmlinux ./vmlinux\(test\) --stdio native_safe_halt
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `objdump --start-address=0xffffffff8184e840
--stop-address=0xffffffff8184e848 -l -d --no-show-raw -S -C
./vmlinux(test) 2>/dev/null|grep -v ./vmlinux(test):|expand'
Fix it by surrounding filename in double quotes.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adam Stylinski <adam.stylinski@etegent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170505101417.2117-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So far, the inlined nodes where only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.
Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mostly in the documentation.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com
[ Fix spelling of "parameter" in one of the spell-checked lines ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Symbol versioning, as in glibc, results in symbols being defined as:
<real symbol>@[@]<version>
(Note that "@@" identifies a default symbol, if the symbol name is
repeated.)
perf is currently unable to deal with this, and is unable to create user
probes at such symbols:
--
$ nm /lib/powerpc64le-linux-gnu/libpthread.so.0 | grep pthread_create
0000000000008d30 t __pthread_create_2_1
0000000000008d30 T pthread_create@@GLIBC_2.17
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
probe-definition(0): pthread_create
symbol:pthread_create file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Open Debuginfo file: /usr/lib/debug/lib/powerpc64le-linux-gnu/libpthread-2.19.so
Try to find probe point from debuginfo.
Probe point 'pthread_create' not found.
Error: Failed to add events. Reason: No such file or directory (Code: -2)
--
One is not able to specify the fully versioned symbol, either, due to
syntactic conflicts with other uses of "@" by perf:
--
$ /usr/bin/sudo perf probe -v -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create@@GLIBC_2.17
probe-definition(0): pthread_create@@GLIBC_2.17
Semantic error :SRC@SRC is not allowed.
0 arguments
Error: Command Parse Error. Reason: Invalid argument (Code: -22)
--
This patch ignores versioning for default symbols, thus allowing probes to be
created for these symbols:
--
$ /usr/bin/sudo ./perf probe -x /lib/powerpc64le-linux-gnu/libpthread.so.0 pthread_create
Added new event:
probe_libpthread:pthread_create (on pthread_create in /lib/powerpc64le-linux-gnu/libpthread-2.19.so)
You can now use it in all perf tools, such as:
perf record -e probe_libpthread:pthread_create -aR sleep 1
$ /usr/bin/sudo ./perf record -e probe_libpthread:pthread_create -aR ./test 2
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.052 MB perf.data (2 samples) ]
$ /usr/bin/sudo ./perf script
test 2915 [000] 19124.260729: probe_libpthread:pthread_create: (3fff99248d38)
test 2916 [000] 19124.260962: probe_libpthread:pthread_create: (3fff99248d38)
$ /usr/bin/sudo ./perf probe --del=probe_libpthread:pthread_create
Removed event: probe_libpthread:pthread_create
--
Committer note:
Change the variable storing the result of strlen() to 'int', to fix the build
on debian:experimental-x-mipsel, fedora:24-x-ARC-uClibc, ubuntu:16.04-x-arm,
etc:
util/symbol.c: In function 'symbol__match_symbol_name':
util/symbol.c:422:11: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (len < versioning - name)
^
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/c2b18d9c-17f8-9285-4868-f58b6359ccac@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That is the case of _text on s390, and we have some functions that return an
address, using address zero to report problems, oops.
This would lead the symbol loading routines to not use "_text" as the reference
relocation symbol, or the first symbol for the kernel, but use instead
"_stext", that is at the same address on x86_64 and others, but not on s390:
[acme@localhost perf-4.11.0-rc6]$ head -15 /proc/kallsyms
0000000000000000 T _text
0000000000000418 t iplstart
0000000000000800 T start
000000000000080a t .base
000000000000082e t .sk8x8
0000000000000834 t .gotr
0000000000000842 t .cmd
0000000000000846 t .parm
000000000000084a t .lowcase
0000000000010000 T startup
0000000000010010 T startup_kdump
0000000000010214 t startup_kdump_relocated
0000000000011000 T startup_continue
00000000000112a0 T _ehead
0000000000100000 T _stext
[acme@localhost perf-4.11.0-rc6]$
Which in turn would make 'perf test vmlinux' to fail because it wouldn't find
the symbols before "_stext" in kallsyms.
Fix it by using the return value only for errors and storing the
address, when the symbol is successfully found, in a provided pointer
arg.
Before this patch:
After:
[acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 40693
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/3.10.0-654.el7.s390x/vmlinux for symbols
ERR : 0: _text not on kallsyms
ERR : 0x418: iplstart not on kallsyms
ERR : 0x800: start not on kallsyms
ERR : 0x80a: .base not on kallsyms
ERR : 0x82e: .sk8x8 not on kallsyms
ERR : 0x834: .gotr not on kallsyms
ERR : 0x842: .cmd not on kallsyms
ERR : 0x846: .parm not on kallsyms
ERR : 0x84a: .lowcase not on kallsyms
ERR : 0x10000: startup not on kallsyms
ERR : 0x10010: startup_kdump not on kallsyms
ERR : 0x10214: startup_kdump_relocated not on kallsyms
ERR : 0x11000: startup_continue not on kallsyms
ERR : 0x112a0: _ehead not on kallsyms
<SNIP warnings>
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
[acme@localhost perf-4.11.0-rc6]$
After:
[acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 47160
<SNIP warnings>
test child finished with 0
---- end ----
vmlinux symtab matches kallsyms: Ok
[acme@localhost perf-4.11.0-rc6]$
Reported-by: Michael Petlan <mpetlan@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9x9bwgd3btwdk1u51xie93fz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both had copies originating from git.git, move those to
tools/lib/string.c, getting both tools/lib/subcmd/ and tools/perf/ to
use it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uidwtticro1qhttzd2rkrkg1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its basically to do units handling, so move to a more appropriately
named object.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-90ob9vfepui24l8l2makhd9u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a perl specific hack, so move it from util.h to where perl
headers are used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4igctbinuom2sr6g4b03jqht@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just one more step into splitting util.[ch] to reduce the includes hell.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-navarr9mijkgwgbzu464dwam@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
More needs to be done to have the actual functions and variables in a
smaller .c file that can then be included in the python binding,
avoiding dragging more stuff into it.
Link: http://lkml.kernel.org/n/tip-uecxz7cqkssouj7tlxrkqpl4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recent commit broke command name strip in perf_event__get_comm_ids
function. It replaced left to right search for '\n' with rtrim, which
actually does right to left search. It occasionally caught earlier '\n'
and kept trash in the command name.
Keeping the ltrim, but moving back the left to right '\n' search
instead of the rtrim.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Taeung Song <treeze.taeung@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Fixes: bdd97ca63f ("perf tools: Refactor the code to strip command name with {l,r}trim()")
Link: http://lkml.kernel.org/r/20170420092430.29657-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/event.h header needs PERF_ALIGN(), but wasn't including
linux/kernel.h, where it is defined, instead it was getting it by
luck by including map.h, which it doesn't need at all.
Fix it by including the right header.
Link: http://lkml.kernel.org/n/tip-nf3t9blzm5ncoxsczi8oy9mx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>