2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-27 06:34:11 +08:00
linux-next/drivers/tty/sysrq.c

1141 lines
26 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
/*
* Linux Magic System Request Key Hacks
*
* (c) 1997 Martin Mares <mj@atrey.karlin.mff.cuni.cz>
* based on ideas by Pavel Machek <pavel@atrey.karlin.mff.cuni.cz>
*
* (c) 2000 Crutcher Dunnavant <crutcher+kernel@datastacks.com>
* overhauled to use key registration
* based upon discusions in irc://irc.openprojects.net/#kernelnewbies
*
* Copyright (c) 2010 Dmitry Torokhov
* Input handler conversion
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/sched/signal.h>
#include <linux/sched/rt.h>
#include <linux/sched/debug.h>
#include <linux/sched/task.h>
#include <linux/interrupt.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/kdev_t.h>
#include <linux/major.h>
#include <linux/reboot.h>
#include <linux/sysrq.h>
#include <linux/kbd_kern.h>
#include <linux/proc_fs.h>
debug lockups: Improve lockup detection When debugging a recent lockup bug i found various deficiencies in how our current lockup detection helpers work: - SysRq-L is not very efficient as it uses a workqueue, hence it cannot punch through hard lockups and cannot see through most soft lockups either. - The SysRq-L code depends on the NMI watchdog - which is off by default. - We dont print backtraces from the RCU code's built-in 'RCU state machine is stuck' debug code. This debug code tends to be one of the first (and only) mechanisms that show that a lockup has occured. This patch changes the code so taht we: - Trigger the NMI backtrace code from SysRq-L instead of using a workqueue (which cannot punch through hard lockups) - Trigger print-all-CPU-backtraces from the RCU lockup detection code Also decouple the backtrace printing code from the NMI watchdog: - Dont use variable size cpumasks (it might not be initialized and they are a bit more fragile anyway) - Trigger an NMI immediately via an IPI, instead of waiting for the NMI tick to occur. This is a lot faster and can produce more relevant backtraces. It will also work if the NMI watchdog is disabled. - Dont print the 'dazed and confused' message when we print a backtrace from the NMI - Do a show_regs() plus a dump_stack() to get maximum info out of the dump. Worst-case we get two stacktraces - which is not a big deal. Sometimes, if register content is corrupted, the precise stack walker in show_regs() wont give us a full backtrace - in this case dump_stack() will do it. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02 17:28:21 +08:00
#include <linux/nmi.h>
#include <linux/quotaops.h>
perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 18:02:48 +08:00
#include <linux/perf_event.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/suspend.h>
#include <linux/writeback.h>
#include <linux/swap.h>
#include <linux/spinlock.h>
#include <linux/vt_kern.h>
#include <linux/workqueue.h>
#include <linux/hrtimer.h>
#include <linux/oom.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
#include <linux/input.h>
#include <linux/uaccess.h>
#include <linux/moduleparam.h>
#include <linux/jiffies.h>
#include <linux/syscalls.h>
#include <linux/of.h>
#include <linux/rcupdate.h>
#include <asm/ptrace.h>
#include <asm/irq_regs.h>
/* Whether we react on sysrq keys or just ignore them */
static int __read_mostly sysrq_enabled = CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE;
static bool __read_mostly sysrq_always_enabled;
static bool sysrq_on(void)
{
return sysrq_enabled || sysrq_always_enabled;
}
/*
* A value of 1 means 'all', other nonzero values are an op mask:
*/
static bool sysrq_on_mask(int mask)
{
return sysrq_always_enabled ||
sysrq_enabled == 1 ||
(sysrq_enabled & mask);
}
static int __init sysrq_always_enabled_setup(char *str)
{
sysrq_always_enabled = true;
pr_info("sysrq always enabled.\n");
return 1;
}
__setup("sysrq_always_enabled", sysrq_always_enabled_setup);
static void sysrq_handle_loglevel(int key)
{
int i;
i = key - '0';
console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
pr_info("Loglevel set to %d\n", i);
console_loglevel = i;
}
static struct sysrq_key_op sysrq_loglevel_op = {
.handler = sysrq_handle_loglevel,
.help_msg = "loglevel(0-9)",
.action_msg = "Changing Loglevel",
.enable_mask = SYSRQ_ENABLE_LOG,
};
#ifdef CONFIG_VT
static void sysrq_handle_SAK(int key)
{
struct work_struct *SAK_work = &vc_cons[fg_console].SAK_work;
schedule_work(SAK_work);
}
static struct sysrq_key_op sysrq_SAK_op = {
.handler = sysrq_handle_SAK,
.help_msg = "sak(k)",
.action_msg = "SAK",
.enable_mask = SYSRQ_ENABLE_KEYBOARD,
};
#else
#define sysrq_SAK_op (*(struct sysrq_key_op *)NULL)
#endif
#ifdef CONFIG_VT
static void sysrq_handle_unraw(int key)
{
vt_reset_unicode(fg_console);
}
static struct sysrq_key_op sysrq_unraw_op = {
.handler = sysrq_handle_unraw,
.help_msg = "unraw(r)",
.action_msg = "Keyboard mode set to system default",
.enable_mask = SYSRQ_ENABLE_KEYBOARD,
};
#else
#define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
#endif /* CONFIG_VT */
static void sysrq_handle_crash(int key)
{
/* release the RCU read lock before crashing */
rcu_read_unlock();
panic("sysrq triggered crash\n");
}
sysrq, kdump: make sysrq-c consistent commit d6580a9f15238b87e618310c862231ae3f352d2d ("kexec: sysrq: simplify sysrq-c handler") changed the behavior of sysrq-c to unconditional dereference of NULL pointer. So in cases with CONFIG_KEXEC, where crash_kexec() was directly called from sysrq-c before, now it can be said that a step of "real oops" was inserted before starting kdump. However, in contrast to oops via SysRq-c from keyboard which results in panic due to in_interrupt(), oops via "echo c > /proc/sysrq-trigger" will not become panic unless panic_on_oops=1. It means that even if dump is properly configured to be taken on panic, the sysrq-c from proc interface might not start crashdump while the sysrq-c from keyboard can start crashdump. This confuses traditional users of kdump, i.e. people who expect sysrq-c to do common behavior in both of the keyboard and proc interface. This patch brings the keyboard and proc interface behavior of sysrq-c in line, by forcing panic_on_oops=1 before oops in sysrq-c handler. And some updates in documentation are included, to clarify that there is no longer dependency with CONFIG_KEXEC, and that now the system can just crash by sysrq-c if no dump mechanism is configured. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: Brayan Arraes <brayan@yack.com.br> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-30 06:04:14 +08:00
static struct sysrq_key_op sysrq_crash_op = {
.handler = sysrq_handle_crash,
.help_msg = "crash(c)",
.action_msg = "Trigger a crash",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static void sysrq_handle_reboot(int key)
{
[PATCH] sysrq: disable lockdep on reboot SysRq : Emergency Sync Emergency Sync complete SysRq : Emergency Remount R/O Emergency Remount complete SysRq : Resetting BUG: warning at kernel/lockdep.c:1816/trace_hardirqs_on() (Not tainted) Call Trace: [<ffffffff8026d56d>] show_trace+0xae/0x319 [<ffffffff8026d7ed>] dump_stack+0x15/0x17 [<ffffffff802a68d1>] trace_hardirqs_on+0xbc/0x13d [<ffffffff803a8eec>] sysrq_handle_reboot+0x9/0x11 [<ffffffff803a8f8d>] __handle_sysrq+0x99/0x130 [<ffffffff803a903b>] handle_sysrq+0x17/0x19 [<ffffffff803a36ee>] kbd_event+0x32e/0x57d [<ffffffff80401e35>] input_event+0x42d/0x45b [<ffffffff804063eb>] atkbd_interrupt+0x44d/0x53d [<ffffffff803fe5c5>] serio_interrupt+0x49/0x86 [<ffffffff803ff2a4>] i8042_interrupt+0x202/0x21a [<ffffffff80210cf0>] handle_IRQ_event+0x2c/0x64 [<ffffffff802bfd8b>] __do_IRQ+0xaf/0x114 [<ffffffff8026ea24>] do_IRQ+0xf8/0x107 [<ffffffff8025f886>] ret_from_intr+0x0/0xf DWARF2 unwinder stuck at ret_from_intr+0x0/0xf Leftover inexact backtrace: <IRQ> <EOI> [<ffffffff80258e36>] mwait_idle+0x3f/0x54 [<ffffffff8024a33a>] cpu_idle+0xa2/0xc5 [<ffffffff8026c34e>] rest_init+0x2b/0x2d [<ffffffff809708bc>] start_kernel+0x24a/0x24c [<ffffffff8097028b>] _sinittext+0x28b/0x292 Since we're shutting down anyway, don't bother being smart, just turn the thing off. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 14:28:02 +08:00
lockdep_off();
local_irq_enable();
emergency_restart();
}
static struct sysrq_key_op sysrq_reboot_op = {
.handler = sysrq_handle_reboot,
.help_msg = "reboot(b)",
.action_msg = "Resetting",
.enable_mask = SYSRQ_ENABLE_BOOT,
};
static void sysrq_handle_sync(int key)
{
emergency_sync();
}
static struct sysrq_key_op sysrq_sync_op = {
.handler = sysrq_handle_sync,
.help_msg = "sync(s)",
.action_msg = "Emergency Sync",
.enable_mask = SYSRQ_ENABLE_SYNC,
};
static void sysrq_handle_show_timers(int key)
{
sysrq_timer_list_show();
}
static struct sysrq_key_op sysrq_show_timers_op = {
.handler = sysrq_handle_show_timers,
.help_msg = "show-all-timers(q)",
.action_msg = "Show clockevent devices & pending hrtimers (no others)",
};
static void sysrq_handle_mountro(int key)
{
emergency_remount();
}
static struct sysrq_key_op sysrq_mountro_op = {
.handler = sysrq_handle_mountro,
.help_msg = "unmount(u)",
.action_msg = "Emergency Remount R/O",
.enable_mask = SYSRQ_ENABLE_REMOUNT,
};
#ifdef CONFIG_LOCKDEP
static void sysrq_handle_showlocks(int key)
{
debug_show_all_locks();
}
static struct sysrq_key_op sysrq_showlocks_op = {
.handler = sysrq_handle_showlocks,
.help_msg = "show-all-locks(d)",
.action_msg = "Show Locks Held",
};
#else
#define sysrq_showlocks_op (*(struct sysrq_key_op *)NULL)
#endif
#ifdef CONFIG_SMP
static DEFINE_RAW_SPINLOCK(show_lock);
static void showacpu(void *dummy)
{
unsigned long flags;
/* Idle CPUs have no interesting backtrace. */
if (idle_cpu(smp_processor_id()))
return;
raw_spin_lock_irqsave(&show_lock, flags);
pr_info("CPU%d:\n", smp_processor_id());
show_stack(NULL, NULL);
raw_spin_unlock_irqrestore(&show_lock, flags);
}
static void sysrq_showregs_othercpus(struct work_struct *dummy)
{
smp_call_function(showacpu, NULL, 0);
}
static DECLARE_WORK(sysrq_showallcpus, sysrq_showregs_othercpus);
static void sysrq_handle_showallcpus(int key)
{
/*
* Fall back to the workqueue based printing if the
* backtrace printing did not succeed or the
* architecture has no support for it:
*/
if (!trigger_all_cpu_backtrace()) {
sysrq : fix Show Regs call trace on ARM When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [<ffffff8008088e80>] dump_backtrace+0x0/0x1d0 [<ffffff8008089074>] show_stack+0x24/0x30 [<ffffff8008447970>] dump_stack+0x90/0xb0 [<ffffff8008463950>] check_preemption_disabled+0x100/0x108 [<ffffff8008463998>] __this_cpu_preempt_check+0x18/0x20 [<ffffff80084c9194>] sysrq_handle_showregs+0x1c/0x40 [<ffffff80084c9c7c>] __handle_sysrq+0x12c/0x1a0 [<ffffff80084ca140>] write_sysrq_trigger+0x60/0x70 [<ffffff8008251e00>] proc_reg_write+0x90/0xd0 [<ffffff80081f1788>] __vfs_write+0x48/0x90 [<ffffff80081f241c>] vfs_write+0xa4/0x190 [<ffffff80081f3354>] SyS_write+0x54/0xb0 [<ffffff80080833f0>] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu <jibin.xu@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-11 11:11:42 +08:00
struct pt_regs *regs = NULL;
sysrq : fix Show Regs call trace on ARM When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [<ffffff8008088e80>] dump_backtrace+0x0/0x1d0 [<ffffff8008089074>] show_stack+0x24/0x30 [<ffffff8008447970>] dump_stack+0x90/0xb0 [<ffffff8008463950>] check_preemption_disabled+0x100/0x108 [<ffffff8008463998>] __this_cpu_preempt_check+0x18/0x20 [<ffffff80084c9194>] sysrq_handle_showregs+0x1c/0x40 [<ffffff80084c9c7c>] __handle_sysrq+0x12c/0x1a0 [<ffffff80084ca140>] write_sysrq_trigger+0x60/0x70 [<ffffff8008251e00>] proc_reg_write+0x90/0xd0 [<ffffff80081f1788>] __vfs_write+0x48/0x90 [<ffffff80081f241c>] vfs_write+0xa4/0x190 [<ffffff80081f3354>] SyS_write+0x54/0xb0 [<ffffff80080833f0>] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu <jibin.xu@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-11 11:11:42 +08:00
if (in_irq())
regs = get_irq_regs();
if (regs) {
pr_info("CPU%d:\n", smp_processor_id());
show_regs(regs);
}
schedule_work(&sysrq_showallcpus);
}
}
static struct sysrq_key_op sysrq_showallcpus_op = {
.handler = sysrq_handle_showallcpus,
.help_msg = "show-backtrace-all-active-cpus(l)",
.action_msg = "Show backtrace of all active CPUs",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
#endif
static void sysrq_handle_showregs(int key)
{
sysrq : fix Show Regs call trace on ARM When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [<ffffff8008088e80>] dump_backtrace+0x0/0x1d0 [<ffffff8008089074>] show_stack+0x24/0x30 [<ffffff8008447970>] dump_stack+0x90/0xb0 [<ffffff8008463950>] check_preemption_disabled+0x100/0x108 [<ffffff8008463998>] __this_cpu_preempt_check+0x18/0x20 [<ffffff80084c9194>] sysrq_handle_showregs+0x1c/0x40 [<ffffff80084c9c7c>] __handle_sysrq+0x12c/0x1a0 [<ffffff80084ca140>] write_sysrq_trigger+0x60/0x70 [<ffffff8008251e00>] proc_reg_write+0x90/0xd0 [<ffffff80081f1788>] __vfs_write+0x48/0x90 [<ffffff80081f241c>] vfs_write+0xa4/0x190 [<ffffff80081f3354>] SyS_write+0x54/0xb0 [<ffffff80080833f0>] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu <jibin.xu@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-11 11:11:42 +08:00
struct pt_regs *regs = NULL;
if (in_irq())
regs = get_irq_regs();
IRQ: Maintain regs pointer globally rather than passing to IRQ handlers Maintain a per-CPU global "struct pt_regs *" variable which can be used instead of passing regs around manually through all ~1800 interrupt handlers in the Linux kernel. The regs pointer is used in few places, but it potentially costs both stack space and code to pass it around. On the FRV arch, removing the regs parameter from all the genirq function results in a 20% speed up of the IRQ exit path (ie: from leaving timer_interrupt() to leaving do_IRQ()). Where appropriate, an arch may override the generic storage facility and do something different with the variable. On FRV, for instance, the address is maintained in GR28 at all times inside the kernel as part of general exception handling. Having looked over the code, it appears that the parameter may be handed down through up to twenty or so layers of functions. Consider a USB character device attached to a USB hub, attached to a USB controller that posts its interrupts through a cascaded auxiliary interrupt controller. A character device driver may want to pass regs to the sysrq handler through the input layer which adds another few layers of parameter passing. I've build this code with allyesconfig for x86_64 and i386. I've runtested the main part of the code on FRV and i386, though I can't test most of the drivers. I've also done partial conversion for powerpc and MIPS - these at least compile with minimal configurations. This will affect all archs. Mostly the changes should be relatively easy. Take do_IRQ(), store the regs pointer at the beginning, saving the old one: struct pt_regs *old_regs = set_irq_regs(regs); And put the old one back at the end: set_irq_regs(old_regs); Don't pass regs through to generic_handle_irq() or __do_IRQ(). In timer_interrupt(), this sort of change will be necessary: - update_process_times(user_mode(regs)); - profile_tick(CPU_PROFILING, regs); + update_process_times(user_mode(get_irq_regs())); + profile_tick(CPU_PROFILING); I'd like to move update_process_times()'s use of get_irq_regs() into itself, except that i386, alone of the archs, uses something other than user_mode(). Some notes on the interrupt handling in the drivers: (*) input_dev() is now gone entirely. The regs pointer is no longer stored in the input_dev struct. (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does something different depending on whether it's been supplied with a regs pointer or not. (*) Various IRQ handler function pointers have been moved to type irq_handler_t. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-05 21:55:46 +08:00
if (regs)
show_regs(regs);
perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 18:02:48 +08:00
perf_event_print_debug();
}
static struct sysrq_key_op sysrq_showregs_op = {
.handler = sysrq_handle_showregs,
.help_msg = "show-registers(p)",
.action_msg = "Show Regs",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static void sysrq_handle_showstate(int key)
{
show_state();
workqueue: dump workqueues on sysrq-t Workqueues are used extensively throughout the kernel but sometimes it's difficult to debug stalls involving work items because visibility into its inner workings is fairly limited. Although sysrq-t task dump annotates each active worker task with the information on the work item being executed, it is challenging to find out which work items are pending or delayed on which queues and how pools are being managed. This patch implements show_workqueue_state() which dumps all busy workqueues and pools and is called from the sysrq-t handler. At the end of sysrq-t dump, something like the following is printed. Showing busy workqueues and worker pools: ... workqueue filler_wq: flags=0x0 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256 in-flight: 491:filler_workfn, 507:filler_workfn pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 in-flight: 501:filler_workfn pending: filler_workfn ... workqueue test_wq: flags=0x8 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1 in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500) delayed: test_workfn1 BAR(492), test_workfn2 ... pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137 pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469 pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16 pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62 The above shows that test_wq is executing test_workfn() on pid 510 which is the rescuer and also that there are two tasks 69 and 500 waiting for the work item to finish in flush_work(). As test_wq has max_active of 1, there are two work items for test_workfn1() and test_workfn2() which are delayed till the current work item is finished. In addition, pid 492 is flushing test_workfn1(). The work item for test_workfn() is being executed on pwq of pool 2 which is the normal priority per-cpu pool for CPU 1. The pool has three workers, two of which are executing filler_workfn() for filler_wq and the last one is assuming the manager role trying to create more workers. This extra workqueue state dump will hopefully help chasing down hangs involving workqueues. v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting. v2: As suggested by Andrew, minor formatting change in pr_cont_work(), printk()'s replaced with pr_info()'s, and cpumask printing now uses cpulist_pr_cont(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@redhat.com>
2015-03-09 21:22:28 +08:00
show_workqueue_state();
}
static struct sysrq_key_op sysrq_showstate_op = {
.handler = sysrq_handle_showstate,
.help_msg = "show-task-states(t)",
.action_msg = "Show State",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static void sysrq_handle_showstate_blocked(int key)
{
show_state_filter(TASK_UNINTERRUPTIBLE);
}
static struct sysrq_key_op sysrq_showstate_blocked_op = {
.handler = sysrq_handle_showstate_blocked,
.help_msg = "show-blocked-tasks(w)",
.action_msg = "Show Blocked State",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
#ifdef CONFIG_TRACING
#include <linux/ftrace.h>
static void sysrq_ftrace_dump(int key)
{
ftrace_dump(DUMP_ALL);
}
static struct sysrq_key_op sysrq_ftrace_dump_op = {
.handler = sysrq_ftrace_dump,
.help_msg = "dump-ftrace-buffer(z)",
.action_msg = "Dump ftrace buffer",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
#else
#define sysrq_ftrace_dump_op (*(struct sysrq_key_op *)NULL)
#endif
static void sysrq_handle_showmem(int key)
{
show_mem(0, NULL);
}
static struct sysrq_key_op sysrq_showmem_op = {
.handler = sysrq_handle_showmem,
.help_msg = "show-memory-usage(m)",
.action_msg = "Show Memory",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
/*
* Signal sysrq helper function. Sends a signal to all user processes.
*/
static void send_sig_all(int sig)
{
struct task_struct *p;
read_lock(&tasklist_lock);
for_each_process(p) {
if (p->flags & PF_KTHREAD)
continue;
if (is_global_init(p))
continue;
do_send_sig_info(sig, SEND_SIG_PRIV, p, PIDTYPE_MAX);
}
read_unlock(&tasklist_lock);
}
static void sysrq_handle_term(int key)
{
send_sig_all(SIGTERM);
console_loglevel = CONSOLE_LOGLEVEL_DEBUG;
}
static struct sysrq_key_op sysrq_term_op = {
.handler = sysrq_handle_term,
.help_msg = "terminate-all-tasks(e)",
.action_msg = "Terminate All Tasks",
.enable_mask = SYSRQ_ENABLE_SIGNAL,
};
2006-11-22 22:55:48 +08:00
static void moom_callback(struct work_struct *ignored)
{
const gfp_t gfp_mask = GFP_KERNEL;
struct oom_control oc = {
.zonelist = node_zonelist(first_memory_node, gfp_mask),
.nodemask = NULL,
.memcg = NULL,
.gfp_mask = gfp_mask,
.order = -1,
};
mutex_lock(&oom_lock);
if (!out_of_memory(&oc))
oom: improve oom disable handling Tetsuo has reported that sysrq triggered OOM killer will print a misleading information when no tasks are selected: sysrq: SysRq : Manual OOM execution Out of memory: Kill process 4468 ((agetty)) score 0 or sacrifice child Killed process 4468 ((agetty)) total-vm:43704kB, anon-rss:1760kB, file-rss:0kB, shmem-rss:0kB sysrq: SysRq : Manual OOM execution Out of memory: Kill process 4469 (systemd-cgroups) score 0 or sacrifice child Killed process 4469 (systemd-cgroups) total-vm:10704kB, anon-rss:120kB, file-rss:0kB, shmem-rss:0kB sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled sysrq: SysRq : Manual OOM execution sysrq: OOM request ignored because killer is disabled The real reason is that there are no eligible tasks for the OOM killer to select but since commit 7c5f64f84483 ("mm: oom: deduplicate victim selection code for memcg and global oom") the semantic of out_of_memory has changed without updating moom_callback. This patch updates moom_callback to tell that no task was eligible which is the case for both oom killer disabled and no eligible tasks. In order to help distinguish first case from the second add printk to both oom_killer_{enable,disable}. This information is useful on its own because it might help debugging potential memory allocation failures. Fixes: 7c5f64f84483 ("mm: oom: deduplicate victim selection code for memcg and global oom") Link: http://lkml.kernel.org/r/20170404134705.6361-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-04 05:54:57 +08:00
pr_info("OOM request ignored. No task eligible\n");
mutex_unlock(&oom_lock);
}
2006-11-22 22:55:48 +08:00
static DECLARE_WORK(moom_work, moom_callback);
static void sysrq_handle_moom(int key)
{
schedule_work(&moom_work);
}
static struct sysrq_key_op sysrq_moom_op = {
.handler = sysrq_handle_moom,
.help_msg = "memory-full-oom-kill(f)",
.action_msg = "Manual OOM execution",
.enable_mask = SYSRQ_ENABLE_SIGNAL,
};
#ifdef CONFIG_BLOCK
static void sysrq_handle_thaw(int key)
{
emergency_thaw_all();
}
static struct sysrq_key_op sysrq_thaw_op = {
.handler = sysrq_handle_thaw,
.help_msg = "thaw-filesystems(j)",
.action_msg = "Emergency Thaw of all frozen filesystems",
.enable_mask = SYSRQ_ENABLE_SIGNAL,
};
#endif
static void sysrq_handle_kill(int key)
{
send_sig_all(SIGKILL);
console_loglevel = CONSOLE_LOGLEVEL_DEBUG;
}
static struct sysrq_key_op sysrq_kill_op = {
.handler = sysrq_handle_kill,
.help_msg = "kill-all-tasks(i)",
.action_msg = "Kill All Tasks",
.enable_mask = SYSRQ_ENABLE_SIGNAL,
};
static void sysrq_handle_unrt(int key)
{
normalize_rt_tasks();
}
static struct sysrq_key_op sysrq_unrt_op = {
.handler = sysrq_handle_unrt,
.help_msg = "nice-all-RT-tasks(n)",
.action_msg = "Nice All RT Tasks",
.enable_mask = SYSRQ_ENABLE_RTNICE,
};
/* Key Operations table and lock */
static DEFINE_SPINLOCK(sysrq_key_table_lock);
static struct sysrq_key_op *sysrq_key_table[36] = {
&sysrq_loglevel_op, /* 0 */
&sysrq_loglevel_op, /* 1 */
&sysrq_loglevel_op, /* 2 */
&sysrq_loglevel_op, /* 3 */
&sysrq_loglevel_op, /* 4 */
&sysrq_loglevel_op, /* 5 */
&sysrq_loglevel_op, /* 6 */
&sysrq_loglevel_op, /* 7 */
&sysrq_loglevel_op, /* 8 */
&sysrq_loglevel_op, /* 9 */
/*
* a: Don't use for system provided sysrqs, it is handled specially on
* sparc and will never arrive.
*/
NULL, /* a */
&sysrq_reboot_op, /* b */
&sysrq_crash_op, /* c */
&sysrq_showlocks_op, /* d */
&sysrq_term_op, /* e */
&sysrq_moom_op, /* f */
/* g: May be registered for the kernel debugger */
NULL, /* g */
NULL, /* h - reserved for help */
&sysrq_kill_op, /* i */
#ifdef CONFIG_BLOCK
&sysrq_thaw_op, /* j */
#else
NULL, /* j */
#endif
&sysrq_SAK_op, /* k */
#ifdef CONFIG_SMP
&sysrq_showallcpus_op, /* l */
#else
NULL, /* l */
#endif
&sysrq_showmem_op, /* m */
&sysrq_unrt_op, /* n */
/* o: This will often be registered as 'Off' at init time */
NULL, /* o */
&sysrq_showregs_op, /* p */
&sysrq_show_timers_op, /* q */
&sysrq_unraw_op, /* r */
&sysrq_sync_op, /* s */
&sysrq_showstate_op, /* t */
&sysrq_mountro_op, /* u */
/* v: May be registered for frame buffer console restore */
NULL, /* v */
&sysrq_showstate_blocked_op, /* w */
/* x: May be registered on mips for TLB dump */
/* x: May be registered on ppc/powerpc for xmon */
/* x: May be registered on sparc64 for global PMU dump */
NULL, /* x */
/* y: May be registered on sparc64 for global register dump */
NULL, /* y */
&sysrq_ftrace_dump_op, /* z */
};
/* key2index calculation, -1 on invalid index */
static int sysrq_key_table_key2index(int key)
{
int retval;
if ((key >= '0') && (key <= '9'))
retval = key - '0';
else if ((key >= 'a') && (key <= 'z'))
retval = key + 10 - 'a';
else
retval = -1;
return retval;
}
/*
* get and put functions for the table, exposed to modules.
*/
struct sysrq_key_op *__sysrq_get_key_op(int key)
{
struct sysrq_key_op *op_p = NULL;
int i;
i = sysrq_key_table_key2index(key);
if (i != -1)
op_p = sysrq_key_table[i];
return op_p;
}
static void __sysrq_put_key_op(int key, struct sysrq_key_op *op_p)
{
int i = sysrq_key_table_key2index(key);
if (i != -1)
sysrq_key_table[i] = op_p;
}
void __handle_sysrq(int key, bool check_mask)
{
struct sysrq_key_op *op_p;
int orig_log_level;
panic: avoid the extra noise dmesg When kernel panic happens, it will first print the panic call stack, then the ending msg like: [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception [ 35.749975] ------------[ cut here ]------------ The above message are very useful for debugging. But if system is configured to not reboot on panic, say the "panic_timeout" parameter equals 0, it will likely print out many noisy message like WARN() call stack for each and every CPU except the panic one, messages like below: WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190 Call Trace: <IRQ> try_to_wake_up default_wake_function autoremove_wake_function __wake_up_common __wake_up_common_lock __wake_up wake_up_klogd_work_func irq_work_run_list irq_work_tick update_process_times tick_sched_timer __hrtimer_run_queues hrtimer_interrupt smp_apic_timer_interrupt apic_timer_interrupt For people working in console mode, the screen will first show the panic call stack, but immediately overridden by these noisy extra messages, which makes debugging much more difficult, as the original context gets lost on screen. Also these noisy messages will confuse some users, as I have seen many bug reporters posted the noisy message into bugzilla, instead of the real panic call stack and context. Adding a flag "suppress_printk" which gets set in panic() to avoid those noisy messages, without changing current kernel behavior that both panic blinking and sysrq magic key can work as is, suggested by Petr Mladek. To verify this, make sure kernel is not configured to reboot on panic and in console # echo c > /proc/sysrq-trigger to see if console only prints out the panic call stack. Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-15 06:45:34 +08:00
int orig_suppress_printk;
int i;
panic: avoid the extra noise dmesg When kernel panic happens, it will first print the panic call stack, then the ending msg like: [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception [ 35.749975] ------------[ cut here ]------------ The above message are very useful for debugging. But if system is configured to not reboot on panic, say the "panic_timeout" parameter equals 0, it will likely print out many noisy message like WARN() call stack for each and every CPU except the panic one, messages like below: WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190 Call Trace: <IRQ> try_to_wake_up default_wake_function autoremove_wake_function __wake_up_common __wake_up_common_lock __wake_up wake_up_klogd_work_func irq_work_run_list irq_work_tick update_process_times tick_sched_timer __hrtimer_run_queues hrtimer_interrupt smp_apic_timer_interrupt apic_timer_interrupt For people working in console mode, the screen will first show the panic call stack, but immediately overridden by these noisy extra messages, which makes debugging much more difficult, as the original context gets lost on screen. Also these noisy messages will confuse some users, as I have seen many bug reporters posted the noisy message into bugzilla, instead of the real panic call stack and context. Adding a flag "suppress_printk" which gets set in panic() to avoid those noisy messages, without changing current kernel behavior that both panic blinking and sysrq magic key can work as is, suggested by Petr Mladek. To verify this, make sure kernel is not configured to reboot on panic and in console # echo c > /proc/sysrq-trigger to see if console only prints out the panic call stack. Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-15 06:45:34 +08:00
orig_suppress_printk = suppress_printk;
suppress_printk = 0;
rcu_sysrq_start();
rcu_read_lock();
/*
* Raise the apparent loglevel to maximum so that the sysrq header
* is shown to provide the user with positive feedback. We do not
* simply emit this at KERN_EMERG as that would change message
* routing in the consumers of /proc/kmsg.
*/
orig_log_level = console_loglevel;
console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
op_p = __sysrq_get_key_op(key);
if (op_p) {
/*
* Should we check for enabled operations (/proc/sysrq-trigger
* should not) and is the invoked operation enabled?
*/
if (!check_mask || sysrq_on_mask(op_p->enable_mask)) {
pr_info("%s\n", op_p->action_msg);
console_loglevel = orig_log_level;
op_p->handler(key);
} else {
pr_info("This sysrq operation is disabled.\n");
console_loglevel = orig_log_level;
}
} else {
pr_info("HELP : ");
/* Only print the help msg once per handler */
for (i = 0; i < ARRAY_SIZE(sysrq_key_table); i++) {
if (sysrq_key_table[i]) {
int j;
for (j = 0; sysrq_key_table[i] !=
sysrq_key_table[j]; j++)
;
if (j != i)
continue;
pr_cont("%s ", sysrq_key_table[i]->help_msg);
}
}
pr_cont("\n");
console_loglevel = orig_log_level;
}
rcu_read_unlock();
rcu_sysrq_end();
panic: avoid the extra noise dmesg When kernel panic happens, it will first print the panic call stack, then the ending msg like: [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception [ 35.749975] ------------[ cut here ]------------ The above message are very useful for debugging. But if system is configured to not reboot on panic, say the "panic_timeout" parameter equals 0, it will likely print out many noisy message like WARN() call stack for each and every CPU except the panic one, messages like below: WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190 Call Trace: <IRQ> try_to_wake_up default_wake_function autoremove_wake_function __wake_up_common __wake_up_common_lock __wake_up wake_up_klogd_work_func irq_work_run_list irq_work_tick update_process_times tick_sched_timer __hrtimer_run_queues hrtimer_interrupt smp_apic_timer_interrupt apic_timer_interrupt For people working in console mode, the screen will first show the panic call stack, but immediately overridden by these noisy extra messages, which makes debugging much more difficult, as the original context gets lost on screen. Also these noisy messages will confuse some users, as I have seen many bug reporters posted the noisy message into bugzilla, instead of the real panic call stack and context. Adding a flag "suppress_printk" which gets set in panic() to avoid those noisy messages, without changing current kernel behavior that both panic blinking and sysrq magic key can work as is, suggested by Petr Mladek. To verify this, make sure kernel is not configured to reboot on panic and in console # echo c > /proc/sysrq-trigger to see if console only prints out the panic call stack. Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-15 06:45:34 +08:00
suppress_printk = orig_suppress_printk;
}
void handle_sysrq(int key)
{
if (sysrq_on())
__handle_sysrq(key, true);
}
EXPORT_SYMBOL(handle_sysrq);
#ifdef CONFIG_INPUT
static int sysrq_reset_downtime_ms;
/* Simple translation table for the SysRq keys */
static const unsigned char sysrq_xlate[KEY_CNT] =
"\000\0331234567890-=\177\t" /* 0x00 - 0x0f */
"qwertyuiop[]\r\000as" /* 0x10 - 0x1f */
"dfghjkl;'`\000\\zxcv" /* 0x20 - 0x2f */
"bnm,./\000*\000 \000\201\202\203\204\205" /* 0x30 - 0x3f */
"\206\207\210\211\212\000\000789-456+1" /* 0x40 - 0x4f */
"230\177\000\000\213\214\000\000\000\000\000\000\000\000\000\000" /* 0x50 - 0x5f */
"\r\000/"; /* 0x60 - 0x6f */
struct sysrq_state {
struct input_handle handle;
struct work_struct reinject_work;
unsigned long key_down[BITS_TO_LONGS(KEY_CNT)];
unsigned int alt;
unsigned int alt_use;
bool active;
bool need_reinject;
bool reinjecting;
/* reset sequence handling */
bool reset_canceled;
bool reset_requested;
unsigned long reset_keybit[BITS_TO_LONGS(KEY_CNT)];
int reset_seq_len;
int reset_seq_cnt;
int reset_seq_version;
struct timer_list keyreset_timer;
};
#define SYSRQ_KEY_RESET_MAX 20 /* Should be plenty */
static unsigned short sysrq_reset_seq[SYSRQ_KEY_RESET_MAX];
static unsigned int sysrq_reset_seq_len;
static unsigned int sysrq_reset_seq_version = 1;
static void sysrq_parse_reset_sequence(struct sysrq_state *state)
{
int i;
unsigned short key;
state->reset_seq_cnt = 0;
for (i = 0; i < sysrq_reset_seq_len; i++) {
key = sysrq_reset_seq[i];
if (key == KEY_RESERVED || key > KEY_MAX)
break;
__set_bit(key, state->reset_keybit);
state->reset_seq_len++;
if (test_bit(key, state->key_down))
state->reset_seq_cnt++;
}
/* Disable reset until old keys are not released */
state->reset_canceled = state->reset_seq_cnt != 0;
state->reset_seq_version = sysrq_reset_seq_version;
}
static void sysrq_do_reset(struct timer_list *t)
{
struct sysrq_state *state = from_timer(state, t, keyreset_timer);
state->reset_requested = true;
orderly_reboot();
}
static void sysrq_handle_reset_request(struct sysrq_state *state)
{
if (state->reset_requested)
__handle_sysrq(sysrq_xlate[KEY_B], false);
if (sysrq_reset_downtime_ms)
mod_timer(&state->keyreset_timer,
jiffies + msecs_to_jiffies(sysrq_reset_downtime_ms));
else
sysrq_do_reset(&state->keyreset_timer);
}
static void sysrq_detect_reset_sequence(struct sysrq_state *state,
unsigned int code, int value)
{
if (!test_bit(code, state->reset_keybit)) {
/*
* Pressing any key _not_ in reset sequence cancels
* the reset sequence. Also cancelling the timer in
* case additional keys were pressed after a reset
* has been requested.
*/
if (value && state->reset_seq_cnt) {
state->reset_canceled = true;
del_timer(&state->keyreset_timer);
}
} else if (value == 0) {
/*
* Key release - all keys in the reset sequence need
* to be pressed and held for the reset timeout
* to hold.
*/
del_timer(&state->keyreset_timer);
if (--state->reset_seq_cnt == 0)
state->reset_canceled = false;
} else if (value == 1) {
/* key press, not autorepeat */
if (++state->reset_seq_cnt == state->reset_seq_len &&
!state->reset_canceled) {
sysrq_handle_reset_request(state);
}
}
}
#ifdef CONFIG_OF
static void sysrq_of_get_keyreset_config(void)
{
u32 key;
struct device_node *np;
struct property *prop;
const __be32 *p;
np = of_find_node_by_path("/chosen/linux,sysrq-reset-seq");
if (!np) {
pr_debug("No sysrq node found");
return;
}
/* Reset in case a __weak definition was present */
sysrq_reset_seq_len = 0;
of_property_for_each_u32(np, "keyset", prop, p, key) {
if (key == KEY_RESERVED || key > KEY_MAX ||
sysrq_reset_seq_len == SYSRQ_KEY_RESET_MAX)
break;
sysrq_reset_seq[sysrq_reset_seq_len++] = (unsigned short)key;
}
/* Get reset timeout if any. */
of_property_read_u32(np, "timeout-ms", &sysrq_reset_downtime_ms);
of_node_put(np);
}
#else
static void sysrq_of_get_keyreset_config(void)
{
}
#endif
static void sysrq_reinject_alt_sysrq(struct work_struct *work)
{
struct sysrq_state *sysrq =
container_of(work, struct sysrq_state, reinject_work);
struct input_handle *handle = &sysrq->handle;
unsigned int alt_code = sysrq->alt_use;
if (sysrq->need_reinject) {
/* we do not want the assignment to be reordered */
sysrq->reinjecting = true;
mb();
/* Simulate press and release of Alt + SysRq */
input_inject_event(handle, EV_KEY, alt_code, 1);
input_inject_event(handle, EV_KEY, KEY_SYSRQ, 1);
input_inject_event(handle, EV_SYN, SYN_REPORT, 1);
input_inject_event(handle, EV_KEY, KEY_SYSRQ, 0);
input_inject_event(handle, EV_KEY, alt_code, 0);
input_inject_event(handle, EV_SYN, SYN_REPORT, 1);
mb();
sysrq->reinjecting = false;
}
}
static bool sysrq_handle_keypress(struct sysrq_state *sysrq,
unsigned int code, int value)
{
bool was_active = sysrq->active;
bool suppress;
switch (code) {
case KEY_LEFTALT:
case KEY_RIGHTALT:
if (!value) {
/* One of ALTs is being released */
if (sysrq->active && code == sysrq->alt_use)
sysrq->active = false;
sysrq->alt = KEY_RESERVED;
} else if (value != 2) {
sysrq->alt = code;
sysrq->need_reinject = false;
}
break;
case KEY_SYSRQ:
if (value == 1 && sysrq->alt != KEY_RESERVED) {
sysrq->active = true;
sysrq->alt_use = sysrq->alt;
/*
* If nothing else will be pressed we'll need
* to re-inject Alt-SysRq keysroke.
*/
sysrq->need_reinject = true;
}
/*
* Pretend that sysrq was never pressed at all. This
* is needed to properly handle KGDB which will try
* to release all keys after exiting debugger. If we
* do not clear key bit it KGDB will end up sending
* release events for Alt and SysRq, potentially
* triggering print screen function.
*/
if (sysrq->active)
clear_bit(KEY_SYSRQ, sysrq->handle.dev->key);
break;
default:
if (sysrq->active && value && value != 2) {
sysrq->need_reinject = false;
__handle_sysrq(sysrq_xlate[code], true);
}
break;
}
suppress = sysrq->active;
if (!sysrq->active) {
/*
* See if reset sequence has changed since the last time.
*/
if (sysrq->reset_seq_version != sysrq_reset_seq_version)
sysrq_parse_reset_sequence(sysrq);
/*
* If we are not suppressing key presses keep track of
* keyboard state so we can release keys that have been
* pressed before entering SysRq mode.
*/
if (value)
set_bit(code, sysrq->key_down);
else
clear_bit(code, sysrq->key_down);
if (was_active)
schedule_work(&sysrq->reinject_work);
/* Check for reset sequence */
sysrq_detect_reset_sequence(sysrq, code, value);
} else if (value == 0 && test_and_clear_bit(code, sysrq->key_down)) {
/*
* Pass on release events for keys that was pressed before
* entering SysRq mode.
*/
suppress = false;
}
return suppress;
}
static bool sysrq_filter(struct input_handle *handle,
unsigned int type, unsigned int code, int value)
{
struct sysrq_state *sysrq = handle->private;
bool suppress;
/*
* Do not filter anything if we are in the process of re-injecting
* Alt+SysRq combination.
*/
if (sysrq->reinjecting)
return false;
switch (type) {
case EV_SYN:
suppress = false;
break;
case EV_KEY:
suppress = sysrq_handle_keypress(sysrq, code, value);
break;
default:
suppress = sysrq->active;
break;
}
return suppress;
}
static int sysrq_connect(struct input_handler *handler,
struct input_dev *dev,
const struct input_device_id *id)
{
struct sysrq_state *sysrq;
int error;
sysrq = kzalloc(sizeof(struct sysrq_state), GFP_KERNEL);
if (!sysrq)
return -ENOMEM;
INIT_WORK(&sysrq->reinject_work, sysrq_reinject_alt_sysrq);
sysrq->handle.dev = dev;
sysrq->handle.handler = handler;
sysrq->handle.name = "sysrq";
sysrq->handle.private = sysrq;
timer_setup(&sysrq->keyreset_timer, sysrq_do_reset, 0);
error = input_register_handle(&sysrq->handle);
if (error) {
pr_err("Failed to register input sysrq handler, error %d\n",
error);
goto err_free;
}
error = input_open_device(&sysrq->handle);
if (error) {
pr_err("Failed to open input device, error %d\n", error);
goto err_unregister;
}
return 0;
err_unregister:
input_unregister_handle(&sysrq->handle);
err_free:
kfree(sysrq);
return error;
}
static void sysrq_disconnect(struct input_handle *handle)
{
struct sysrq_state *sysrq = handle->private;
input_close_device(handle);
cancel_work_sync(&sysrq->reinject_work);
del_timer_sync(&sysrq->keyreset_timer);
input_unregister_handle(handle);
kfree(sysrq);
}
/*
* We are matching on KEY_LEFTALT instead of KEY_SYSRQ because not all
* keyboards have SysRq key predefined and so user may add it to keymap
* later, but we expect all such keyboards to have left alt.
*/
static const struct input_device_id sysrq_ids[] = {
{
.flags = INPUT_DEVICE_ID_MATCH_EVBIT |
INPUT_DEVICE_ID_MATCH_KEYBIT,
.evbit = { [BIT_WORD(EV_KEY)] = BIT_MASK(EV_KEY) },
.keybit = { [BIT_WORD(KEY_LEFTALT)] = BIT_MASK(KEY_LEFTALT) },
},
{ },
};
static struct input_handler sysrq_handler = {
.filter = sysrq_filter,
.connect = sysrq_connect,
.disconnect = sysrq_disconnect,
.name = "sysrq",
.id_table = sysrq_ids,
};
static bool sysrq_handler_registered;
static inline void sysrq_register_handler(void)
{
int error;
sysrq_of_get_keyreset_config();
error = input_register_handler(&sysrq_handler);
if (error)
pr_err("Failed to register input handler, error %d", error);
else
sysrq_handler_registered = true;
}
static inline void sysrq_unregister_handler(void)
{
if (sysrq_handler_registered) {
input_unregister_handler(&sysrq_handler);
sysrq_handler_registered = false;
}
}
static int sysrq_reset_seq_param_set(const char *buffer,
const struct kernel_param *kp)
{
unsigned long val;
int error;
error = kstrtoul(buffer, 0, &val);
if (error < 0)
return error;
if (val > KEY_MAX)
return -EINVAL;
*((unsigned short *)kp->arg) = val;
sysrq_reset_seq_version++;
return 0;
}
static const struct kernel_param_ops param_ops_sysrq_reset_seq = {
.get = param_get_ushort,
.set = sysrq_reset_seq_param_set,
};
#define param_check_sysrq_reset_seq(name, p) \
__param_check(name, p, unsigned short)
/*
* not really modular, but the easiest way to keep compat with existing
* bootargs behaviour is to continue using module_param here.
*/
module_param_array_named(reset_seq, sysrq_reset_seq, sysrq_reset_seq,
&sysrq_reset_seq_len, 0644);
module_param_named(sysrq_downtime_ms, sysrq_reset_downtime_ms, int, 0644);
#else
static inline void sysrq_register_handler(void)
{
}
static inline void sysrq_unregister_handler(void)
{
}
#endif /* CONFIG_INPUT */
int sysrq_toggle_support(int enable_mask)
{
bool was_enabled = sysrq_on();
sysrq_enabled = enable_mask;
if (was_enabled != sysrq_on()) {
if (sysrq_on())
sysrq_register_handler();
else
sysrq_unregister_handler();
}
return 0;
}
static int __sysrq_swap_key_ops(int key, struct sysrq_key_op *insert_op_p,
struct sysrq_key_op *remove_op_p)
{
int retval;
spin_lock(&sysrq_key_table_lock);
if (__sysrq_get_key_op(key) == remove_op_p) {
__sysrq_put_key_op(key, insert_op_p);
retval = 0;
} else {
retval = -1;
}
spin_unlock(&sysrq_key_table_lock);
/*
* A concurrent __handle_sysrq either got the old op or the new op.
* Wait for it to go away before returning, so the code for an old
* op is not freed (eg. on module unload) while it is in use.
*/
synchronize_rcu();
return retval;
}
int register_sysrq_key(int key, struct sysrq_key_op *op_p)
{
return __sysrq_swap_key_ops(key, op_p, NULL);
}
EXPORT_SYMBOL(register_sysrq_key);
int unregister_sysrq_key(int key, struct sysrq_key_op *op_p)
{
return __sysrq_swap_key_ops(key, NULL, op_p);
}
EXPORT_SYMBOL(unregister_sysrq_key);
#ifdef CONFIG_PROC_FS
/*
* writing 'C' to /proc/sysrq-trigger is like sysrq-C
*/
static ssize_t write_sysrq_trigger(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
if (count) {
char c;
if (get_user(c, buf))
return -EFAULT;
__handle_sysrq(c, false);
}
return count;
}
static const struct file_operations proc_sysrq_trigger_operations = {
.write = write_sysrq_trigger,
llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-08-16 00:52:59 +08:00
.llseek = noop_llseek,
};
static void sysrq_init_procfs(void)
{
if (!proc_create("sysrq-trigger", S_IWUSR, NULL,
&proc_sysrq_trigger_operations))
pr_err("Failed to register proc interface\n");
}
#else
static inline void sysrq_init_procfs(void)
{
}
#endif /* CONFIG_PROC_FS */
static int __init sysrq_init(void)
{
sysrq_init_procfs();
if (sysrq_on())
sysrq_register_handler();
return 0;
}
device_initcall(sysrq_init);