Pull x86 cdso updates from Peter Anvin:
"Vdso cleanups and improvements largely from Andy Lutomirski. This
makes the vdso a lot less ''special''"
* 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso, build: Make LE access macros clearer, host-safe
x86/vdso, build: Fix cross-compilation from big-endian architectures
x86/vdso, build: When vdso2c fails, unlink the output
x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls
x86, mm: Improve _install_special_mapping and fix x86 vdso naming
mm, fs: Add vm_ops->name as an alternative to arch_vma_name
x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments
x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO
x86, vdso: Move the 32-bit vdso special pages after the text
x86, vdso: Reimplement vdso.so preparation in build-time C
x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c
x86, vdso: Clean up 32-bit vs 64-bit vdso params
x86, mm: Ensure correct alignment of the fixmap
Pull x86-64 espfix changes from Peter Anvin:
"This is the espfix64 code, which fixes the IRET information leak as
well as the associated functionality problem. With this code applied,
16-bit stack segments finally work as intended even on a 64-bit
kernel.
Consequently, this patchset also removes the runtime option that we
added as an interim measure.
To help the people working on Linux kernels for very small systems,
this patchset also makes these compile-time configurable features"
* 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
x86, espfix: Make it possible to disable 16-bit support
x86, espfix: Make espfix64 a Kconfig option, fix UML
x86, espfix: Fix broken header guard
x86, espfix: Move espfix definitions into a separate header file
x86-32, espfix: Remove filter for espfix32 due to race
x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
Pull x86 x32 ABI fix from Peter Anvin:
"A single fix for the x32 ABI: the io_setup() and io_submit() system
call need to use the compat stubs"
* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, x32: Use compat shims for io_{setup,submit}
Merge misc updates from Andrew Morton:
- a few fixes for 3.16. Cc'ed to stable so they'll get there somehow.
- various misc fixes and cleanups
- most of the ocfs2 queue. Review is slow...
- most of MM. The MM queue is pretty huge this time, but not much in
the way of feature work.
- some tweaks under kernel/
- printk maintenance work
- updates to lib/
- checkpatch updates
- tweaks to init/
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits)
fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul
init/main.c: remove an ifdef
kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND
init/main.c: add initcall_blacklist kernel parameter
init/main.c: don't use pr_debug()
fs/binfmt_flat.c: make old_reloc() static
fs/binfmt_elf.c: fix bool assignements
fs/efs: convert printk(KERN_DEBUG to pr_debug
fs/efs: add pr_fmt / use __func__
fs/efs: convert printk to pr_foo()
scripts/checkpatch.pl: device_initcall is not the only __initcall substitute
checkpatch: check stable email address
checkpatch: warn on unnecessary void function return statements
checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar);
checkpatch: add warning for kmalloc/kzalloc with multiply
checkpatch: warn on #defines ending in semicolon
checkpatch: make --strict a default for files in drivers/net and net/
checkpatch: always warn on missing blank line after variable declaration block
checkpatch: fix wildcard DT compatible string checking
...
autofs_dev_ioctl_init is only called by __init init_autofs4_fs
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. Remove CLONE_KERNEL, it has no users and it is dangerous.
The (old) comment says "List of flags we want to share for kernel
threads" but this is not true, we do not want to share ->sighand by
default. This flag can only be used if the caller is sure that both
parent/child will never play with signals (say, allow_signal/etc).
2. Change rest_init() to clone kernel_init() without CLONE_SIGHAND.
In this case CLONE_SIGHAND does not really hurt, and it looks like
optimization because copy_sighand() can avoid kmem_cache_alloc().
But in fact this only adds the minor pessimization. kernel_init()
is going to exec the init process, and de_thread() will need to
unshare ->sighand and do kmem_cache_alloc(sighand_cachep) anyway,
but it needs to do more work and take tasklist_lock and siglock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a module is built into the kernel the module_init() function
becomes an initcall. Sometimes debugging through dynamic debug can
help, however, debugging built in kernel modules is typically done by
changing the .config, recompiling, and booting the new kernel in an
effort to determine exactly which module caused a problem.
This patchset can be useful stand-alone or combined with initcall_debug.
There are cases where some initcalls can hang the machine before the
console can be flushed, which can make initcall_debug output inaccurate.
Having the ability to skip initcalls can help further debugging of these
scenarios.
Usage: initcall_blacklist=<list of comma separated initcalls>
ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
the log contains:
blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted
ex) added "initcall_blacklist=foo_bar,sgi_uv_sysfs_init" as a kernel parameter
and the log contains:
blacklisting initcall foo_bar
blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted
[akpm@linux-foundation.org: tweak printk text]
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Rob Landley <rob@landley.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pertially revert commit ea676e846a ("init/main.c: convert to
pr_foo()").
Unbeknownst to me, pr_debug() is different from the other pr_foo()
levels: pr_debug() is a no-op when DEBUG is not defined.
Happily, init/main.c does have a #define DEBUG so we didn't break
initcall_debug. But the functioning of initcall_debug should not be
dependent upon the presence of that #define DEBUG.
Reported-by: Russell King <rmk@arm.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
old_reloc() is only used in this file, make it static.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All KERN_DEBUG callsites being under #ifdef DEBUG we can safely convert
everything to pr_debug without changing current behaviour.
Remove #ifdef DEBUG around pr_debugs only (suggested by Joe Perches)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Also uniformize function arguments.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert all except KERN_DEBUG
(pr_debug doesn't work the same as printk(KERN_DEBUG and requires
special check)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a link to init.h to find appropriate initcall function to
replace obsolete __initcall
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It should be stable@vger.kernel.org, not stable@kernel.org.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
void function lines that use a single tab then "return;" are generally
unnecessary.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the kstrto<foo> functions in preference to sscanf.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Protect against sizeof overflows by preferring kmalloc_array/kcalloc over
kmalloc/kzalloc with a sizeof multiply.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using a #define ending in a semicolon is poor style and can lead to
unexpected code paths being executed.
Warn on uses of these #define types:
#define foo[(...)] bar;
#define foo[(...)] \
bar;
Based on a patch from Borislav Petkov.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Networking files are generally more strictly conformant to linux-kernel
style so make checkpatch more verbose by default for patches to files or
when checking files in these directories.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the test system wide, modify the message too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We attempt to search for compatible strings which use a variable token in
the documented name such as <chip> or <soc>. While this was attempted to
be handled, it's utterly broken.
The desired forms of matching are:
vendor,<chip>-*
vendor,name<part#>-*
For <chip>, lower case characters and numbers are permitted. For <part#>,
only numeric values are allowed.
With this change, the number of missing compatible strings reported in
arch/arm/boot/dts is reduced from 1071 to 960.
Reported-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Florian Vaussard <florian.vaussard@epfl.ch>
Cc: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert printk to current pr_foo() logging functions.
Also add pr_fmt based on KBUILD_MODNAME to avoid repeating prefix. Prefix
is now "atomic64_test: "
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Coalesce formats
- "WARNING:" prefix unchanged to keep bug format.
- printk(KERN_DEFAULT not converted.
- define pr_fmt without prefix to avoid any default prefix update
(suggested by Joe Perches).
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix some checkpatch warnings:
WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Pablo Neira <pablo@netfilter.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use cpu_to_le32 instead of __constant_cpu_to_le32.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
index has been removed from __radix_tree_delete_node in 449dd6984d
("mm: keep page cache radix tree nodes in check")
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change CONFIG_DEBUG_PI_LIST to be user-selectable, and add a title and
description. Remove the dependency on DEBUG_RT_MUTEXES since they were
changed to use rbtrees, and there are other users of plists now.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I use btree from 3.14-rc2 in my own module. When the btree module is
removed, a warning arises:
kmem_cache_destroy btree_node: Slab cache still has objects
CPU: 13 PID: 9150 Comm: rmmod Tainted: GF O 3.14.0-rc2 #1
Hardware name: Inspur NF5270M3/NF5270M3, BIOS CHEETAH_2.1.3 09/10/2013
Call Trace:
dump_stack+0x49/0x5d
kmem_cache_destroy+0xcf/0xe0
btree_module_exit+0x10/0x12 [btree]
SyS_delete_module+0x198/0x1f0
system_call_fastpath+0x16/0x1b
The cause is that it doesn't release the last btree node, when height = 1
and fill = 1.
[akpm@linux-foundation.org: remove unneeded test of NULL]
Signed-off-by: Minfei Huang <huangminfei@ucloud.cn>
Cc: Joern Engel <joern@logfs.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixing 2 coccinelle warnings:
lib/vsprintf.c:2350:2-9: WARNING: Assignment of bool to 0/1
lib/vsprintf.c:2389:3-10: WARNING: Assignment of bool to 0/1
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This restores the old behavior that existed before 2013-02-22, when
changes were made by 64dbfb444c ("decompressors: drop dependency on
CONFIG_EXPERT") and 5dc49c75a2 ("decompressors: make the default
XZ_DEC_* config match the selected architecture").
Disabling the filters only makes sense on embedded systems.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Acked-by: Kyle McMartin <kyle@infradead.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace pr_debug() in lib/plist.c test function plist_test() with
printk(KERN_DEBUG ...).
Without DEBUG defined, pr_debug() is complied out, but the entire
plist_test() function is already inside CONFIG_DEBUG_PI_LIST, so printk
should just be used directly.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For strncpy() and friends the source string may or may not have an actual
NUL character at the end. The documentation is confusing in this because
it specifically mentions that you are passing a "NUL-terminated" string.
Wikipedia says that "C-string" is an alternative name we can use instead.
http://en.wikipedia.org/wiki/Null-terminated_string
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... instead of naked numbers.
Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.
Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the log ring buffer becomes full, we silently overwrite old messages
with new data. console_unlock will detect this case and fast-forward the
console_* pointers to skip over the corrupted data, but nothing will be
reported to the user.
This patch hijacks the first valid log message after detecting that we
dropped messages and prefixes it with a note detailing how many messages
were dropped. For long (~1000 char) messages, this will result in some
truncation of the real message, but given that we're dropping things
anyway, that doesn't seem to be the end of the world.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The pr_debug() and related debug print macros all differ from the normal
pr_XXX() macros, in that the normal ones print unconditionally, while
the debug macros are compiled out unless DEBUG is defined or
CONFIG_DYNAMIC_DEBUG is set. This isn't obvious, and the only way to
find this out is either to review the actual printk.h code or to read
CodingStyle, and the message there doesn't highlight the fact.
Change Documentation/CodingStyle to clearly indicate that pr_debug() and
related debug printing macros behave differently than all other pr_XXX()
macros, and attempt to clarify when and where the different debug
printing methods might be used.
Add short comment to printk.h above the pr_XXX() macros indicating that
while these macros print unconditionally, pr_debug() does not.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Joe Perches <joe@perches.com>
Cc: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Bohac pointed out that there are rare but potential deadlock
possibilities when calling printk while holding the timekeeping
seqlock.
This is due to printk() triggering console sem wakeup, which can
cause scheduling code to trigger hrtimers which may try to read
the time.
Specifically, as Jiri pointed out, that path is:
printk
vprintk_emit
console_unlock
up(&console_sem)
__up
wake_up_process
try_to_wake_up
ttwu_do_activate
ttwu_activate
activate_task
enqueue_task
enqueue_task_fair
hrtick_update
hrtick_start_fair
hrtick_start_fair
get_time
ktime_get
--> endless loop on
read_seqcount_retry(&timekeeper_seq, ...)
This patch tries to avoid this issue by using printk_deferred (previously
named printk_sched) which should defer printing via a irq_work_queue.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reported-by: Jiri Bohac <jbohac@suse.cz>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two of the three prink_deferred uses are really printk_once style
uses, so add a printk_deferred_once macro to simplify those call
sites.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by needed subsystems.
This only changes the function name. No logic changes.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
An earlier change in -mm (printk: remove separate printk_sched
buffers...), removed the printk_sched irqsave/restore lines since it was
safe for current users. Since we may be expanding usage of
printk_sched(), disable preepmtion for this function to make it more
generally safe to call.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is
held in the scheduler. This leads to a possible deadlock if the wake up
uses the same rq as the one with the rq lock held already.
What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.
There's a couple of issues with this approach.
1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.
2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
bit of space wasted for something that is seldom used.
In order to remove this, the printk_sched() can use the printk buffer
instead, and delay the console_trylock()/console_unlock() to the queued
work.
Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the up(&console_sem)
may do a wake up of any pending waiters. This must be avoided while
holding the logbuf_lock.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()). However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().
We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock()
especially since later in the patch series we will want to spin on
console_sem in console_trylock_for_printk().
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>