The qspinlock code supports up to 4 levels of slowpath nesting using
four per-CPU mcs_spinlock structures. For 64-bit architectures, they
fit nicely in one 64-byte cacheline.
For para-virtualized (PV) qspinlocks it needs to store more information
in the per-CPU node structure than there is space for. It uses a trick
to use a second cacheline to hold the extra information that it needs.
So PV qspinlock needs to access two extra cachelines for its information
whereas the native qspinlock code only needs one extra cacheline.
Freshly added counter profiling of the qspinlock code, however, revealed
that it was very rare to use more than two levels of slowpath nesting.
So it doesn't make sense to penalize PV qspinlock code in order to have
four mcs_spinlock structures in the same cacheline to optimize for a case
in the native qspinlock code that rarely happens.
Extend the per-CPU node structure to have two more long words when PV
qspinlock locks are configured to hold the extra data that it needs.
As a result, the PV qspinlock code will enjoy the same benefit of using
just one extra cacheline like the native counterpart, for most cases.
[ mingo: Minor changelog edits. ]
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1539697507-28084-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Queued spinlock supports up to 4 levels of lock slowpath nesting -
user context, soft IRQ, hard IRQ and NMI. However, we are not sure how
often the nesting happens.
So add 3 more per-CPU stat counters to track the number of instances where
nesting index goes to 1, 2 and 3 respectively.
On a dual-socket 64-core 128-thread Zen server, the following were the
new stat counter values under different circumstances:
State slowpath index1 index2 index3
----- -------- ------ ------ -------
After bootup 1,012,150 82 0 0
After parallel build + perf-top 125,195,009 82 0 0
So the chance of having more than 2 levels of nesting is extremely low.
[ mingo: Minor changelog edits. ]
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1539697507-28084-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To prevent dynamic completion objects from being de-allocated while still
in use, add a recommendation to embed them in long lived data structures.
Also add a note for the on-stack case that emphasizes the dangers of
the limited scope, and recommends dynamic allocation if scope limitations
are not clearly understood.
[ mingo: Minor touch-ups of the text, expanded it a bit to make the
warnings Nicholas added more prominent. ]
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john.garry@huawei.com
Link: http://lkml.kernel.org/r/1539697539-24055-1-git-send-email-hofrat@osadl.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add an entry for the Broadcom SPI controller in the MAINTAINERS file.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
So the extra user build flags are propagated to libtraceevent.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: "Herton R. Krzesinski" <herton@redhat.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Link: http://lkml.kernel.org/r/20181016150614.21260-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the function name for an inline frame is invalid, we must not try
to demangle this symbol, otherwise we crash with:
#0 0x0000555555895c01 in bfd_demangle ()
#1 0x0000555555823262 in demangle_sym (dso=0x555555d92b90, elf_name=0x0, kmodule=0) at util/symbol-elf.c:215
#2 dso__demangle_sym (dso=dso@entry=0x555555d92b90, kmodule=<optimized out>, kmodule@entry=0, elf_name=elf_name@entry=0x0) at util/symbol-elf.c:400
#3 0x00005555557fef4b in new_inline_sym (funcname=0x0, base_sym=0x555555d92b90, dso=0x555555d92b90) at util/srcline.c:89
#4 inline_list__append_dso_a2l (dso=dso@entry=0x555555c7bb00, node=node@entry=0x555555e31810, sym=sym@entry=0x555555d92b90) at util/srcline.c:264
#5 0x00005555557ff27f in addr2line (dso_name=dso_name@entry=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf", addr=addr@entry=2888, file=file@entry=0x0,
line=line@entry=0x0, dso=dso@entry=0x555555c7bb00, unwind_inlines=unwind_inlines@entry=true, node=0x555555e31810, sym=0x555555d92b90) at util/srcline.c:313
#6 0x00005555557ffe7c in addr2inlines (sym=0x555555d92b90, dso=0x555555c7bb00, addr=2888, dso_name=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf")
at util/srcline.c:358
So instead handle the case where we get invalid function names for
inlined frames and use a fallback '??' function name instead.
While this crash was originally reported by Hadrien for rust code, I can
now also reproduce it with trivial C++ code. Indeed, it seems like
libbfd fails to interpret the debug information for the inline frame
symbol name:
$ addr2line -e /home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf -if b48
main
/usr/include/c++/8.2.1/complex:610
??
/usr/include/c++/8.2.1/complex:618
??
/usr/include/c++/8.2.1/complex:675
??
/usr/include/c++/8.2.1/complex:685
main
/home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39
I've reported this bug upstream and also attached a patch there which
should fix this issue:
https://sourceware.org/bugzilla/show_bug.cgi?id=23715
Reported-by: Hadrien Grasland <grasland@lal.in2p3.fr>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a64489c56c ("perf report: Find the inline stack for a given address")
[ The above 'Fixes:' cset is where originally the problem was
introduced, i.e. using a2l->funcname without checking if it is NULL,
but this current patch fixes the current codebase, i.e. multiple csets
were applied after a64489c56c before the problem was reported by Hadrien ]
Link: http://lkml.kernel.org/r/20180926135207.30263-3-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
According to rfc7496 section 4.3 or 4.4:
sprstat_policy: This parameter indicates for which PR-SCTP policy
the user wants the information. It is an error to use
SCTP_PR_SCTP_NONE in sprstat_policy. If SCTP_PR_SCTP_ALL is used,
the counters provided are aggregated over all supported policies.
We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL
instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is
an error to use SCTP_PR_SCTP_NONE in sprstat_policy. "
Fixes: 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
Fixes: d229d48d18 ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David writes:
"Sparc fixes
1) Revert the %pOF change, it causes regressions.
2) Wire up io_pgetevents().
3) Fix perf events on single-PCR sparc64 cpus.
4) Do proper perf event throttling like arm and x86."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
Revert "sparc: Convert to using %pOFn instead of device_node.name"
sparc64: Set %l4 properly on trap return after handling signals.
sparc64: Make proc_id signed.
sparc: Throttle perf events properly.
sparc: Fix single-pcr perf event counter management.
sparc: Wire up io_pgetevents system call.
sunvdc: Remove VLA usage
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlvFDJkUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXO0Gw/9HZWNjxZuiBeJfC0D+CYLn9Tkj4Ym
3bUzc44+f2F+B5irzWZS19btX3WzqRGjmplMnVoVCGkABPx1I2jUH88WiUSgAi0R
rqnZD7rHhtGVdKh97cqcF9imt10RE5myxpaArG9V9Cr8G8EEZXaYGMRZ7hGc05o9
i9ZotSQSirSAfEUkVNmiA0ek74TWHYOc8SgALNzMmzTFcy3K2qRT7vPEMAa406gW
s/8XFWkFhiYmSivk0Nzgc4yOKKBNAzRC63ssl9k8u0ztJvc2HSy/tR+b1bDoIK2k
TY0bc5UVl/X/YrRTx8M7Mzr9KNRSO3hugEzDyq8B55myxI++TFRgc4OWqa976lH8
J51kvf+TEh5jDvo+3nHMUJj1DQM6ZIJy8mGrEWahabLaxbiYvWzuh9FkNZDOiy4/
B0sUskoKoaih02oHOuOuSmVe8nhJ+HYA6ENk+sGfEKjColGNUGpHulQhXK1FYX6w
62drdcp599u8QCtYqzIEgV4nGrEKw3SyBaM5sb2dcN7l2najr0qQwDzzLVxVLWu6
lMV1ulqUZNU1SF/VlpHwn4yvdOV89ENoghJrbNzcrFfIuETl/D3y7udn37woi6GC
COPb/MimsKFrHa36dxAJHhbUjZRn6JYVaM+Nt39kn9b+mvlGDUGQlrQXTyuNqC8T
yxk/9NEmd83hDuM=
=fQ6g
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20181015' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Paul writes:
"SELinux fixes for v4.19
We've got one SELinux "fix" that I'd like to get into v4.19 if
possible. I'm using double quotes on "fix" as this is just an update
to the MAINTAINERS file and not a code change. From my perspective,
MAINTAINERS updates generally don't warrant inclusion during the -rcX
phase, but this is a change to the mailing list location so it seemed
prudent to get this in before v4.19 is released"
* tag 'selinux-pr-20181015' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
MAINTAINERS: update the SELinux mailing list location
hdr.cmd can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/infiniband/core/ucma.c:1686 ucma_write() warn: potential
spectre issue 'ucma_cmd_table' [r] (local cap)
Fix this by sanitizing hdr.cmd before using it to index
ucm_cmd_table.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ditch the deffered list, lock, and workqueue handling. Just mark the
set as being blocking, so we are invoked from a workqueue already.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This driver likes to fetch requests from all over the place, so make
queue_rq put requests on a list so that the logic stays the same. Tested
with QEMU.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Converted to blk_mq_init_sq_queue() and fixed a few spots where the
tag_set leaked on cleanup.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This driver is already pretty broken, in that it has two wait_events()
(one in stdma_lock()) in request_fn. Get rid of the first one by
freezing/quiescing the queue on format, and the second one by replacing
it with stdma_try_lock(). The rest is straightforward. Compile-tested
only and probably incorrect.
Cc: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Converted to blk_mq_init_sq_queue()
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move queue allocation next to disk allocation to fix a couple of issues:
- If add_disk() hasn't been called, we should clear disk->queue before
calling put_disk().
- If we fail to allocate a request queue, we still need to put all of
the disks, not just the ones that we allocated queues for.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
atafd.h and atafdreg.h are only used from ataflop.c, so merge them in
there.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Straightforward conversion, just use the existing amiflop_lock to
serialize access to the controller. Compile-tested only.
Cc: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Converted to blk_mq_init_sq_queue()
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The error handling in fd_probe_drives() doesn't clean up at all. Fix it
up in preparation for converting to blk-mq. While we're here, get rid of
the commented out amiga_floppy_remove().
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
amifd.h and amifdreg.h are only used from amiflop.c, and they're pretty
small, so move the contents to amiflop.c and get rid of the .h files.
This is preparation for adding a struct blk_mq_tag_set to struct
amiga_floppy_struct.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pretty simple conversion. grab_drive() could probably be replaced by
some freeze/quiesce incantation, but I left it alone, and just used
freeze/quiesce for eject. Compile-tested only.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Converted to blk_mq_init_sq_queue().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The driver doesn't have support for removing a device that has already
been configured, but with more careful ordering we can avoid the need
for that and make sure that we don't leak generic resources.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The only interesting thing here is that there may be two floppies (i.e.,
request queues) sharing the same controller, so we use the global struct
swim_priv->lock to check whether the controller is busy. Compile-tested
only.
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Converted to blk_mq_init_sq_queue()
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we fail to allocate the request queue for a disk, we still need to
free that disk, not just the previous ones. Additionally, we need to
cleanup the previous request queues.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.
Using two instructions of course opens a window we previously did not
have. Consider the scenario:
CPU0 CPU1 CPU2
1) lock
trylock -> (0,0,1)
2) lock
trylock /* fail */
3) unlock -> (0,0,0)
4) lock
trylock -> (0,0,1)
5) tas-pending -> (0,1,1)
load-val <- (0,1,0) from 3
6) clear-pending-set-locked -> (0,0,1)
FAIL: _2_ owners
where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).
To avoid this we need 2 things:
- the load must come after the tas-pending (obviously, otherwise it
can trivially observe prior state).
- the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
example, such that we cannot observe other state prior to setting
pending.
On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.
Note that observing later state is not a problem:
- if we fail to observe a later unlock, we'll simply spin-wait for
that store to become visible.
- if we observe a later xchg_tail(), there is no difference from that
xchg_tail() having taken place before the tas-pending.
Suggested-by: Will Deacon <will.deacon@arm.com>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently the GEN_*_RMWcc() macros include a return statement, which
pretty much mandates we directly wrap them in a (inline) function.
Macros with return statements are tricky and, as per the above, limit
use, so remove the return statement and make them
statement-expressions. This allows them to be used more widely.
Also, shuffle the arguments a bit. Place the @cc argument as 3rd, this
makes it consistent between UNARY and BINARY, but more importantly, it
makes the @arg0 argument last.
Since the @arg0 argument is now last, we can do CPP trickery and make
it an optional argument, simplifying the users; 17 out of 18
occurences do not need this argument.
Finally, change to asm symbolic names, instead of the numeric ordering
of operands, which allows us to get rid of __BINARY_RMWcc_ARG and get
cleaner code overall.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: JBeulich@suse.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@linux.intel.com
Link: https://lkml.kernel.org/r/20181003130957.108960094@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While working my way through the code again; I felt the comments could
use help.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL)
such that we loose the indent. This also result in a more natural code
flow IMO.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hdr.cmd can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/infiniband/core/ucm.c:1127 ib_ucm_write() warn: potential
spectre issue 'ucm_cmd_table' [r] (local cap)
Fix this by sanitizing hdr.cmd before using it to index
ucm_cmd_table.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The size of the resulting cpu map can be smaller than a multiple of
sizeof(u64), resulting in SIGBUS on cpus like Sparc as the next event
will not be aligned properly.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Fixes: 6c872901af ("perf cpu_map: Add cpu_map event synthesize function")
Link: http://lkml.kernel.org/r/20181011.224655.716771175766946817.davem@davemloft.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Memory events depends on PEBS support and access to LDLAT MSR, but we
display them in /sys/devices/cpu/events even if the CPU does not
provide those, like for KVM guests.
That brings the false assumption that those events should be
available, while they fail event to open.
Separating the mem-* events attributes and merging them with
cpu_events only if there's PEBS support detected.
We could also check if LDLAT MSR is available, but the PEBS check
seems to cover the need now.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180906135748.GC9577@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If there's no tracefs (RHEL7) support the tracing_path_mount
returns debugfs path which results in following fail:
# perf probe sys_write
kprobe_events file does not exist - please rebuild kernel with CONFIG_KPROBE_EVENTS.
Error: Failed to add events.
In tracing_path_debugfs_mount function we need to return the
'tracing' path instead of just the mount to make it work:
# perf probe sys_write
Added new event:
probe:sys_write (on sys_write)
You can now use it in all perf tools, such as:
perf record -e probe:sys_write -aR sleep 1
Adding the 'return tracing_path;' also to tracing_path_tracefs_mount
function just for consistency with tracing_path_debugfs_mount.
Upstream keeps working, because it has the tracefs support.
Link: http://lkml.kernel.org/n/tip-yiwkzexq9fk1ey1xg3gnjlw4@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: 23773ca18b ("perf tools: Make perf aware of tracefs")
Link: http://lkml.kernel.org/r/20181016114818.3595-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a build is run from something like a cron job, the user's $PATH is
rather minimal, of note, not including /usr/sbin in my own case. Because
of that, an automated rpm package build ultimately fails to find
libperf-jvmti.so, because somewhere within the build, this happens...
/bin/sh: alternatives: command not found
/bin/sh: alternatives: command not found
Makefile.config:849: No openjdk development package found, please install
JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel
...and while the build continues, libperf-jvmti.so isn't built, and
things fall down when rpm tries to find all the %files specified. Exact
same system builds everything just fine when the job is launched from a
login shell instead of a cron job, since alternatives is in $PATH, so
openjdk is actually found.
The test required to get into this section of code actually specifies
the full path, as does a block just above it, so let's do that here too.
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: William Cohen <wcohen@redhat.com>
Fixes: d4dfdf00d4 ("perf jvmti: Plug compilation into perf build")
Link: http://lkml.kernel.org/r/20180906221812.11167-1-jarod@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Straight forward conversion, using an internal list to enable the
driver to pull requests at will.
Dynamically allocate the tag set to avoid having to pull in the
block headers for blktrans.h, since various mtd drivers use
block conflicting names for defines and functions.
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
John reported crash when recording on an event under PMU with cpumask defined:
root@localhost:~# ./perf_debug_ record -e armv8_pmuv3_0/br_mis_pred/ sleep 1
perf: Segmentation fault
Obtained 9 stack frames.
./perf_debug_() [0x4c5ef8]
[0xffff82ba267c]
./perf_debug_() [0x4bc5a8]
./perf_debug_() [0x419550]
./perf_debug_() [0x41a928]
./perf_debug_() [0x472f58]
./perf_debug_() [0x473210]
./perf_debug_() [0x4070f4]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe0) [0xffff8294c8a0]
Segmentation fault (core dumped)
We synthesize an update event that needs to touch the evsel id array, which is
not defined at that time. Fixing this by forcing the id allocation for events
with their own cpus.
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Fixes: bfd8f72c27 ("perf record: Synthesize unit/scale/... in event update")
Link: http://lkml.kernel.org/r/20181003212052.GA32371@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 7a68d9fb85 ("USB: usbdevfs: sanitize flags more") checks the
transfer flags for URBs submitted from userspace via usbfs. However,
the check for whether the USBDEVFS_URB_SHORT_NOT_OK flag should be
allowed for a control transfer was added in the wrong place, before
the code has properly determined the direction of the control
transfer. (Control transfers are special because for them, the
direction is set by the bRequestType byte of the Setup packet rather
than direction bit of the endpoint address.)
This patch moves code which sets up the allow_short flag for control
transfers down after is_in has been set to the correct value.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+24a30223a4b609bb802e@syzkaller.appspotmail.com
Fixes: 7a68d9fb85 ("USB: usbdevfs: sanitize flags more")
CC: Oliver Neukum <oneukum@suse.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When there is a mismatch in the CTR_EL0 field, we trap
access to CTR from EL0 on all CPUs to expose the safe
value. However, we could skip trapping on a CPU which
matches the safe value.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
CTR_EL0.IDC reports the data cache clean requirements for instruction
to data coherence. However, if the field is 0, we need to check the
CLIDR_EL1 fields to detect the status of the feature. Currently we
don't do this and generate a warning with tainting the kernel, when
there is a mismatch in the field among the CPUs. Also the userspace
doesn't have a reliable way to check the CLIDR_EL1 register to check
the status.
This patch fixes the problem by checking the CLIDR_EL1 fields, when
(CTR_EL0.IDC == 0) and updates the kernel's copy of the CTR_EL0 for
the CPU with the actual status of the feature. This would allow the
sanity check infrastructure to do the proper checking of the fields
and also allow the CTR_EL0 emulation code to supply the real status
of the feature.
Now, if a CPU has raw CTR_EL0.IDC == 0 and effective IDC == 1 (with
overall system wide IDC == 1), we need to expose the real value to
the user. So, we trap CTR_EL0 access on the CPU which reports incorrect
CTR_EL0.IDC.
Fixes: commit 6ae4b6e057 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The matches() routine for a capability must honor the "scope"
passed to it and return the proper results.
i.e, when passed with SCOPE_LOCAL_CPU, it should check the
status of the capability on the current CPU. This is used by
verify_local_cpu_capabilities() on a late secondary CPU to make
sure that it's compliant with the established system features.
However, ARM64_HAS_CACHE_{IDC/DIC} always checks the system wide
registers and this could mean that a late secondary CPU could return
"true" (since the CPU hasn't updated the system wide registers yet)
and thus lead the system in an inconsistent state, where
the system assumes it has IDC/DIC feature, while the new CPU
doesn't.
Fixes: commit 6ae4b6e057 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
If the policy limits change between invocations of cs_dbs_update(),
the requested frequency value stored in dbs_info may not be updated
and the function may use a stale value of it next time. Moreover, if
idle periods are takem into account by cs_dbs_update(), the requested
frequency value stored in dbs_info may be below the min policy limit,
which is incorrect.
To fix these problems, always update the requested frequency value
in dbs_info along with the local copy of it when the previous
requested frequency is beyond the policy limits and avoid decreasing
the requested frequency below the min policy limit when taking
idle periods into account.
Fixes: abb6627910 (cpufreq: conservative: Fix next frequency selection)
Fixes: 00bfe05889 (cpufreq: conservative: Decrease frequency faster for deferred updates)
Reported-by: Waldemar Rymarkiewicz <waldemarx.rymarkiewicz@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Waldemar Rymarkiewicz <waldemarx.rymarkiewicz@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
As noticed by Dave Anglin, the last commit introduced a small bug where
the potentially uninitialized r struct is used instead of the regs
pointer as input for unwind_frame_init(). Fix it.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: John David Anglin <dave.anglin@bell.net>
When an invalid mount option is passed to jffs2, jffs2_parse_options()
will fail and jffs2_sb_info will be freed, but then jffs2_sb_info will
be used (use-after-free) and freeed (double-free) in jffs2_kill_sb().
Fix it by removing the buggy invocation of kfree() when getting invalid
mount options.
Fixes: 92abc475d8 ("jffs2: implement mount option parsing and compression overriding")
Cc: stable@kernel.org
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Updated documentation to explain base_frequency attribute.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Expose base_frequency to user space via cpufreq sysfs when HWP is in
use.
This HWP base frequency is read from the ACPI _CPC object if present,
or from the HWP Capabilities MSR otherwise.
On the majority of the HWP platforms the _CPC object will point to
the HWP Capabilities MSR using the "Functional Fixed Hardware"
address space type. The address space type also can simply be
ACPI_TYPE_INTEGER, however, in which case the platform firmware
can set its value at the initialization time based on the system
constraints.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Continuous Performance Control package may contain an optional
guaranteed performance field.
Add support to read guaranteed performance from _CPC.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This cpupower update for Linux 4.20-rc1 consists of fixes for bugs and
compile warnings from Prarit Bhargava and Anders Roxell.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAlvE5l8ACgkQCwJExA0N
QxwFrQ//QJO6IKz63LpHE0NZ1o57SrORtEC3ZNl3gZl/T4avvlnyzR3bQBNmsGVT
m/XPkXN6QGOKYyAC6WHSNgWKeziFYL13Z/vy98l5+hCT5ZCENYJiN4zgTufkFM0U
hexYLGwFG40fOSMbyyyeAb/awRuO0oh4IFgUVthbnG9nPrhC7Ym/pkIscQ2Zg8jx
HhcQaW5znVDy6AP1Tz7QPWdLFBKnM103TpFodl/5tLm4zChqUt1ZXH9+7b3NlgCh
LURrN96BGonm+b45yrWm2EJ/Wc3FpIzUJ2x+XHQhrTYUmn5H6K4CG4vY7yknDxFG
syrOdi3+oclkXVtcG9MRNw1UV7YcIpGch//rLGXYrySI60lpBBKw2qy7oQxbqovX
OL8IqAAl7WC9rVL2CtZe/o0xUftr66FTtxOOZoe8Ur2zoqTBB9k++qbTeGUUFxH7
vp/LVOqiMtnaY95YsiaNbQtQaZ8E3AqT3tzVQ6XZzs3pEK7F1AdKEg09YhFYb6xg
B0L0BCkis+2B/4zzp/V3g0bmcBErDa4F4gKTmtC9wPwz8SfsFpXR+7Kvz8tIiiUS
Al85gDdfdVvf+/VkF5A0/TCSAL/x1Exn0UYLt7ZDa0QBO+qoG712rtfJ9pseRLXZ
/3jybXr6CdW0IbV3AOaMsmpmlUnzjze8+LDeunoe/7PLB8VX3+I=
=S5Ie
-----END PGP SIGNATURE-----
Merge tag 'linux-cpupower-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux into pm-tools
Pull cpupower utility changes for 4.20 from Shuah Khan:
"This cpupower update consists of fixes for bugs and compile warnings
from Prarit Bhargava and Anders Roxell."
* tag 'linux-cpupower-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: Fix coredump on VMWare
cpupower: Fix AMD Family 0x17 msr_pstate size
cpupower: remove stringop-truncation waring
If 'krealloc()' fails, 'pctl->functions' is set to NULL.
We should instead use a temp variable in order to be able to free the
previously allocated memeory, in case of OOM.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The reason of including <linux/bitops.h> here is just for BIT() and
GENMASK macros.
Since commit 8bd9cb51da ("locking/atomics, asm-generic: Move some
macros from <linux/bitops.h> to a new <linux/bits.h> file"),
<linux/bits.h> is enough for such compile-time macros.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>