ftrace: dynamic enabling/disabling of function calls

This patch adds a feature to dynamically replace the ftrace code
with jmps, to allow a kernel with ftrace configured to run
as fast as it can without it configured.

The way this works is that on bootup (if ftrace is enabled), a ftrace
function is registered to record the instruction pointer of all
places that call the function.

Later, if there's still any code to patch, a kthread is awoken
(rate limited to at most once a second) that performs a stop_machine,
and replaces all the code that was called with a jmp over the call
to ftrace. It only replaces what was found the previous time. Typically
the system reaches equilibrium quickly after bootup and there's no code
patching needed at all.

e.g.

  call ftrace /* 5 bytes */

is replaced with

  jmp 3f /* jmp is 2 bytes and we jump 3 forward */
3:

When we want to enable ftrace for function tracing, the IP recording
is removed, and stop_machine is called again to replace all the locations
that were recorded back to the call of ftrace. When it is disabled,
we replace the code back to the jmp.

Allocation is done by the kthread. If the ftrace recording function is
called, and we don't have any record slots available, then we simply
skip that call. Once a second a new page (if needed) is allocated for
recording new ftrace function calls. A large batch is allocated at
boot up to get most of the calls there.

Because we do this via stop_machine, we don't have to worry about another
CPU executing a ftrace call as we modify it. But we do need to worry
about NMIs, so all functions that might be called via nmi must be
annotated with notrace_nmi. When this code is configured in, the NMI code
will not call notrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-13 03:20:42 +08:00
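The call-to-jmp swap described in the commit message above can be sketched in user space on a plain byte buffer. This is only an illustration under stated assumptions: `replace_insn`, `patch_demo` and `jmp_over` are hypothetical names, the zero padding bytes after the jmp are arbitrary, and the check of the expected bytes before writing is an added safety step that the message itself does not describe. Nothing here models stop_machine.

```c
#include <assert.h>
#include <string.h>

#define CALL_INSN_SIZE 5	/* 0xe8 opcode + 32-bit relative offset */

/* "jmp 3f" is eb 03; the three bytes after it are skipped, never executed. */
static const unsigned char jmp_over[CALL_INSN_SIZE] = {
	0xeb, 0x03, 0x00, 0x00, 0x00
};

/*
 * Hypothetical helper: swap the instruction at site, but only if the
 * bytes currently there match what the caller expects to find.
 */
int replace_insn(unsigned char *site, const unsigned char *expect,
		 const unsigned char *repl)
{
	if (memcmp(site, expect, CALL_INSN_SIZE) != 0)
		return -1;
	memcpy(site, repl, CALL_INSN_SIZE);
	return 0;
}

/* Walk one call site through disable, a stale request, and re-enable. */
int patch_demo(void)
{
	const unsigned char call[CALL_INSN_SIZE] = { 0xe8, 0x10, 0x00, 0x00, 0x00 };
	unsigned char text[CALL_INSN_SIZE];

	memcpy(text, call, sizeof(text));
	if (replace_insn(text, call, jmp_over) != 0)	/* call -> jmp */
		return -1;
	if (replace_insn(text, call, jmp_over) == 0)	/* stale: must be refused */
		return -1;
	if (replace_insn(text, jmp_over, call) != 0)	/* jmp -> call */
		return -1;
	return memcmp(text, call, sizeof(text)) == 0 ? 0 : -1;
}
```

Because both encodings are exactly 5 bytes, the site can be switched in either direction without moving any surrounding code.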
/*
 * Code for replacing ftrace calls with jumps.
 *
 * Copyright (C) 2007-2008 Steven Rostedt <srostedt@redhat.com>
 *
 * Thanks goes to Ingo Molnar, for suggesting the idea.
 * Mathieu Desnoyers, for suggesting postponing the modifications.
 * Arjan van de Ven, for keeping me straight, and explaining to me
 * the dangers of modifying code on the run.
 */

2009-10-05 08:53:29 +08:00
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

2008-05-13 03:20:42 +08:00
#include <linux/spinlock.h>
#include <linux/hardirq.h>
2008-08-21 00:55:07 +08:00
#include <linux/uaccess.h>
2008-05-13 03:20:42 +08:00
#include <linux/ftrace.h>
#include <linux/percpu.h>
2008-11-11 18:57:02 +08:00
#include <linux/sched.h>
2008-05-13 03:20:42 +08:00
#include <linux/init.h>
#include <linux/list.h>
2010-11-17 05:35:16 +08:00
#include <linux/module.h>
2008-05-13 03:20:42 +08:00

2009-04-09 02:40:59 +08:00
#include <trace/syscall.h>

2009-02-18 06:57:30 +08:00
#include <asm/cacheflush.h>
2012-05-04 21:26:16 +08:00
#include <asm/kprobes.h>
2008-06-22 02:17:27 +08:00
#include <asm/ftrace.h>
ftrace: use only 5 byte nops for x86

Mathieu Desnoyers revealed a bug in the original code. The nop that is
used to replace the mcount caller can be a two part nop. This runs the
risk where a process can be preempted after executing the first nop, but
before the second part of the nop.

The ftrace code calls kstop_machine to keep multiple CPUs from executing
code that is being modified, but it does not protect against a task preempting
in the middle of a two part nop.

If the above preemption happens and the tracer is enabled, after the
kstop_machine runs, all those nops will be calls to the trace function.
If the process that was preempted between the two nops is executed
again, it will execute half of the call to the trace function, and this
might crash the system.

This patch instead uses what both the latest Intel and AMD specs suggest.
That is the P6_NOP5 sequence of "0x0f 0x1f 0x44 0x00 0x00".

Note, some older CPUs and QEMU might fault on this nop, so this nop
is executed with fault handling first. If it detects a fault, it will then
use the code "0x66 0x66 0x66 0x66 0x90". If that faults, it will then
default to a simple "jmp 1f; .byte 0x00 0x00 0x00; 1:". The jmp is
not optimal but will do if the first two can not be executed.

TODO: Examine the cpuid to determine the nop to use.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 06:05:05 +08:00
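The fallback order described in the commit message above (P6_NOP5 first, then the 0x66-prefixed nop, then the jmp) can be sketched as a small selection routine. This is a sketch under assumptions, not the kernel's implementation: `pick_nop`, `nop_probe_fn` and the three stub probes are hypothetical names, and the real code executes each candidate with fault handling rather than calling a callback.

```c
#include <assert.h>
#include <stddef.h>

#define MCOUNT_INSN_SIZE 5

/* Candidates in the order given by the commit message. */
static const unsigned char nop_candidates[][MCOUNT_INSN_SIZE] = {
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },	/* P6_NOP5 */
	{ 0x66, 0x66, 0x66, 0x66, 0x90 },	/* second choice */
	{ 0xeb, 0x03, 0x00, 0x00, 0x00 },	/* jmp 1f; .byte 0,0,0; 1: */
};

/* Hypothetical probe: returns 0 if the sequence runs without faulting. */
typedef int (*nop_probe_fn)(const unsigned char *insn);

/* Return the first candidate that passes the probe; the jmp is the default. */
const unsigned char *pick_nop(nop_probe_fn probe)
{
	size_t n = sizeof(nop_candidates) / sizeof(nop_candidates[0]);
	size_t i;

	for (i = 0; i + 1 < n; i++)
		if (probe(nop_candidates[i]) == 0)
			return nop_candidates[i];
	return nop_candidates[n - 1];	/* last resort, not probed */
}

/* Stub probes standing in for "execute with fault handling". */
int probe_ok(const unsigned char *insn) { (void)insn; return 0; }
int probe_no_p6(const unsigned char *insn) { return insn[0] == 0x0f ? -1 : 0; }
int probe_fault(const unsigned char *insn) { (void)insn; return -1; }
```

All three candidates are exactly 5 bytes, so whichever is chosen occupies the same space as the call it replaces.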
#include <asm/nops.h>
2008-05-13 03:20:42 +08:00

2008-11-11 14:03:45 +08:00
#ifdef CONFIG_DYNAMIC_FTRACE
2008-05-13 03:20:42 +08:00

2009-02-18 06:57:30 +08:00
int ftrace_arch_code_modify_prepare(void)
{
	set_kernel_text_rw();
2010-11-17 05:35:16 +08:00
	set_all_modules_text_rw();
2009-02-18 06:57:30 +08:00
	return 0;
}

int ftrace_arch_code_modify_post_process(void)
{
2010-11-17 05:35:16 +08:00
	set_all_modules_text_ro();
2009-02-18 06:57:30 +08:00
	set_kernel_text_ro();
	return 0;
}

2008-05-13 03:20:42 +08:00
union ftrace_code_union {
2008-06-22 02:17:27 +08:00
	char code[MCOUNT_INSN_SIZE];
2008-05-13 03:20:42 +08:00
	struct {
		char e8;
		int offset;
	} __attribute__((packed));
};

2008-10-23 21:33:08 +08:00
static int ftrace_calc_offset(long ip, long addr)
2008-05-13 03:20:43 +08:00
{
	return (int)(addr - ip);
}
2008-05-13 03:20:42 +08:00

2008-11-15 08:21:19 +08:00
static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
2008-05-13 03:20:43 +08:00
{
	static union ftrace_code_union calc;
2008-05-13 03:20:42 +08:00

2008-05-13 03:20:43 +08:00
	calc.e8 = 0xe8;
2008-06-22 02:17:27 +08:00
	calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
2008-05-13 03:20:43 +08:00

	/*
	 * No locking needed, this must be called via kstop_machine
	 * which in essence is like running on a uniprocessor machine.
	 */
	return calc.code;
2008-05-13 03:20:42 +08:00
}

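The offset arithmetic in ftrace_call_replace() can be checked outside the kernel. The sketch below is an illustration only: `encode_call` and `decode_call_target` are hypothetical names, and the bytes are written explicitly in little-endian order instead of through the packed union. It shows why the offset is taken from `ip + MCOUNT_INSN_SIZE`: a relative call is measured from the end of the 5-byte instruction, not from its start.

```c
#include <assert.h>
#include <stdint.h>

#define MCOUNT_INSN_SIZE 5

/* Build the 5 bytes of "call addr" as they would sit at address ip. */
void encode_call(uint64_t ip, uint64_t addr,
		 unsigned char out[MCOUNT_INSN_SIZE])
{
	/* rel32 is relative to the instruction that follows the call */
	uint32_t rel = (uint32_t)(addr - (ip + MCOUNT_INSN_SIZE));

	out[0] = 0xe8;
	out[1] = rel & 0xff;
	out[2] = (rel >> 8) & 0xff;
	out[3] = (rel >> 16) & 0xff;
	out[4] = (rel >> 24) & 0xff;
}

/* Recover the call target from an encoded call located at ip. */
uint64_t decode_call_target(uint64_t ip,
			    const unsigned char insn[MCOUNT_INSN_SIZE])
{
	uint32_t rel = (uint32_t)insn[1] |
		       ((uint32_t)insn[2] << 8) |
		       ((uint32_t)insn[3] << 16) |
		       ((uint32_t)insn[4] << 24);

	/* sign-extend so that backward calls work too */
	return ip + MCOUNT_INSN_SIZE + (uint64_t)(int64_t)(int32_t)rel;
}
```

A call at 0x1000 to a target at 0x2000 gets the offset 0x2000 - 0x1005 = 0xffb, and decoding from the same address returns 0x2000.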
2009-10-29 10:46:57 +08:00
static inline int
within(unsigned long addr, unsigned long start, unsigned long end)
{
	return addr >= start && addr < end;
}

2014-02-12 09:19:44 +08:00
static unsigned long text_ip_addr(unsigned long ip)
2008-10-31 04:08:32 +08:00
{
2009-10-29 10:46:57 +08:00
	/*
	 * On x86_64, kernel text mappings are mapped read-only with
	 * CONFIG_DEBUG_RODATA. So we use the kernel identity mapping instead
	 * of the kernel text mapping to modify the kernel text.
	 *
	 * For 32bit kernels, these mappings are same and we can use
	 * kernel identity mapping to modify code.
	 */
	if (within(ip, (unsigned long)_text, (unsigned long)_etext))
2012-11-17 05:57:32 +08:00
		ip = (unsigned long)__va(__pa_symbol(ip));
2009-10-29 10:46:57 +08:00

2014-02-12 09:19:44 +08:00
	return ip;
2008-10-31 04:08:32 +08:00
}

2011-04-19 06:19:51 +08:00
static const unsigned char *ftrace_nop_replace(void)
2008-11-11 14:03:45 +08:00
{
2011-04-19 06:19:51 +08:00
	return ideal_nops[NOP_ATOMIC5];
2008-11-11 14:03:45 +08:00
}

2008-11-15 08:21:19 +08:00
static int
2012-05-31 01:36:38 +08:00
ftrace_modify_code_direct(unsigned long ip, unsigned const char *old_code,
2011-05-13 01:33:40 +08:00
		   unsigned const char *new_code)
2008-05-13 03:20:42 +08:00
{
2008-08-21 00:55:07 +08:00
	unsigned char replaced[MCOUNT_INSN_SIZE];
2008-05-13 03:20:42 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Note: Due to modules and __init, code can
|
|
|
|
* disappear and change, we need to protect against faulting
|
2008-10-23 21:33:00 +08:00
|
|
|
* as well as code changing. We do this by using the
|
2008-10-23 21:33:01 +08:00
|
|
|
* probe_kernel_* functions.
|
2008-05-13 03:20:42 +08:00
|
|
|
*
|
|
|
|
* No real locking needed, this code is run through
|
2008-08-21 00:55:07 +08:00
|
|
|
* kstop_machine, or before SMP starts.
|
2008-05-13 03:20:42 +08:00
|
|
|
*/
|
2008-10-23 21:33:00 +08:00
|
|
|
|
|
|
|
/* read the text we want to modify */
|
2008-10-23 21:33:01 +08:00
|
|
|
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
|
2008-10-23 21:32:59 +08:00
|
|
|
return -EFAULT;
|
2008-08-21 00:55:07 +08:00
|
|
|
|
2008-10-23 21:33:00 +08:00
|
|
|
/* Make sure it is what we expect it to be */
|
2008-08-21 00:55:07 +08:00
|
|
|
if (memcmp(replaced, old_code, MCOUNT_INSN_SIZE) != 0)
|
2008-10-23 21:32:59 +08:00
|
|
|
return -EINVAL;
|
2008-05-13 03:20:42 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
ip = text_ip_addr(ip);
|
|
|
|
|
2008-10-23 21:33:00 +08:00
|
|
|
/* replace the text with the new text */
|
2014-02-12 09:19:44 +08:00
|
|
|
if (probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE))
|
2008-10-23 21:32:59 +08:00
|
|
|
return -EPERM;
|
2008-08-21 00:55:07 +08:00
|
|
|
|
|
|
|
sync_core();
|
2008-05-13 03:20:42 +08:00
|
|
|
|
2008-08-21 00:55:07 +08:00
|
|
|
return 0;
|
2008-05-13 03:20:42 +08:00
|
|
|
}
|
|
|
|
|
2008-11-15 08:21:19 +08:00
|
|
|
int ftrace_make_nop(struct module *mod,
|
|
|
|
struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
2011-05-13 01:33:40 +08:00
|
|
|
unsigned const char *new, *old;
|
2008-11-15 08:21:19 +08:00
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_call_replace(ip, addr);
|
|
|
|
new = ftrace_nop_replace();
|
|
|
|
|
2012-05-31 01:36:38 +08:00
|
|
|
/*
|
|
|
|
* On boot up, and when modules are loaded, the MCOUNT_ADDR
|
|
|
|
* is converted to a nop, and will never become MCOUNT_ADDR
|
|
|
|
* again. This code is either running before SMP (on boot up)
|
|
|
|
* or before the code will ever be executed (module load).
|
|
|
|
* We do not want to use the breakpoint version in this case,
|
|
|
|
* just modify the code directly.
|
|
|
|
*/
|
|
|
|
if (addr == MCOUNT_ADDR)
|
|
|
|
return ftrace_modify_code_direct(rec->ip, old, new);
|
|
|
|
|
|
|
|
/* Normal cases use add_brk_on_nop */
|
|
|
|
WARN_ONCE(1, "invalid use of ftrace_make_nop");
|
|
|
|
return -EINVAL;
|
2008-11-15 08:21:19 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
2011-05-13 01:33:40 +08:00
|
|
|
unsigned const char *new, *old;
|
2008-11-15 08:21:19 +08:00
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_nop_replace();
|
|
|
|
new = ftrace_call_replace(ip, addr);
|
|
|
|
|
2012-05-31 01:36:38 +08:00
|
|
|
/* Should only be called when module is loaded */
|
|
|
|
return ftrace_modify_code_direct(rec->ip, old, new);
|
2008-05-13 03:20:43 +08:00
|
|
|
}
|
|
|
|
|
2012-05-31 01:26:37 +08:00
|
|
|
/*
|
|
|
|
* The modifying_ftrace_code is used to tell the breakpoint
|
|
|
|
* handler to call ftrace_int3_handler(). If it fails to
|
|
|
|
* call this handler for a breakpoint added by ftrace, then
|
|
|
|
* the kernel may crash.
|
|
|
|
*
|
|
|
|
* As atomic_writes on x86 do not need a barrier, we do not
|
|
|
|
* need to add smp_mb()s for this to work. It is also considered
|
|
|
|
* that we can not read the modifying_ftrace_code before
|
|
|
|
* executing the breakpoint. That would be quite remarkable if
|
|
|
|
* it could do that. Here's the flow that is required:
|
|
|
|
*
|
|
|
|
* CPU-0 CPU-1
|
|
|
|
*
|
|
|
|
* atomic_inc(mfc);
|
|
|
|
* write int3s
|
|
|
|
* <trap-int3> // implicit (r)mb
|
|
|
|
* if (atomic_read(mfc))
|
|
|
|
* call ftrace_int3_handler()
|
|
|
|
*
|
|
|
|
* Then when we are finished:
|
|
|
|
*
|
|
|
|
* atomic_dec(mfc);
|
|
|
|
*
|
|
|
|
* If we hit a breakpoint that was not set by ftrace, it does not
|
|
|
|
* matter if ftrace_int3_handler() is called or not. It will
|
|
|
|
* simply be ignored. But it is crucial that a ftrace nop/caller
|
|
|
|
* breakpoint is handled. No other user should ever place a
|
|
|
|
* breakpoint on an ftrace nop/caller location. It must only
|
|
|
|
* be done by this code.
|
|
|
|
*/
|
|
|
|
atomic_t modifying_ftrace_code __read_mostly;
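The handshake described in the comment above can be modelled in user space. This is a minimal sketch, not the kernel code: it assumes C11 `atomic_int` as a stand-in for the kernel's `atomic_t`, and every `sketch_*` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for modifying_ftrace_code; the kernel uses atomic_t. */
static atomic_int sketch_mfc;

/* Writer side: raise the flag before any int3 byte is written. */
static void sketch_begin_modify(void)
{
	atomic_fetch_add(&sketch_mfc, 1);
}

/* Writer side: drop the flag once no int3 byte remains in the text. */
static void sketch_end_modify(void)
{
	atomic_fetch_sub(&sketch_mfc, 1);
}

/* Trap side: nonzero means the ftrace handler must be consulted. */
static int sketch_trap_must_ask_ftrace(void)
{
	return atomic_load(&sketch_mfc) != 0;
}
```

The ordering is the point of the sketch: the flag is raised before the first breakpoint is written and lowered only after the last one is gone, so a trap taken on an ftrace breakpoint always observes a nonzero counter.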
|
2011-08-16 21:57:10 +08:00
|
|
|
|
2012-05-31 01:36:38 +08:00
|
|
|
static int
|
|
|
|
ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
|
|
|
|
unsigned const char *new_code);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
/*
|
|
|
|
* Should never be called:
|
|
|
|
* As it is only called by __ftrace_replace_code() which is called by
|
|
|
|
* ftrace_replace_code() that x86 overrides, and by ftrace_update_code()
|
|
|
|
* which is called to turn mcount into nops or nops into function calls
|
|
|
|
* but not to convert a function from not using regs to one that uses
|
|
|
|
* regs, which ftrace_modify_call() is for.
|
|
|
|
*/
|
|
|
|
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
|
|
|
|
unsigned long addr)
|
|
|
|
{
|
|
|
|
WARN_ON(1);
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
static unsigned long ftrace_update_func;
|
|
|
|
|
|
|
|
static int update_ftrace_func(unsigned long ip, void *new)
|
2012-05-31 01:36:38 +08:00
|
|
|
{
|
2014-02-12 09:19:44 +08:00
|
|
|
unsigned char old[MCOUNT_INSN_SIZE];
|
2012-05-31 01:36:38 +08:00
|
|
|
int ret;
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
memcpy(old, (void *)ip, MCOUNT_INSN_SIZE);
|
|
|
|
|
|
|
|
ftrace_update_func = ip;
|
|
|
|
/* Make sure the breakpoints see the ftrace_update_func update */
|
|
|
|
smp_wmb();
|
2012-05-31 01:36:38 +08:00
|
|
|
|
|
|
|
/* See comment above by declaration of modifying_ftrace_code */
|
|
|
|
atomic_inc(&modifying_ftrace_code);
|
|
|
|
|
|
|
|
ret = ftrace_modify_code(ip, old, new);
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
atomic_dec(&modifying_ftrace_code);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_update_ftrace_func(ftrace_func_t func)
|
|
|
|
{
|
|
|
|
unsigned long ip = (unsigned long)(&ftrace_call);
|
|
|
|
unsigned char *new;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
new = ftrace_call_replace(ip, (unsigned long)func);
|
|
|
|
ret = update_ftrace_func(ip, new);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
/* Also update the regs callback function */
|
|
|
|
if (!ret) {
|
|
|
|
ip = (unsigned long)(&ftrace_regs_call);
|
|
|
|
new = ftrace_call_replace(ip, (unsigned long)func);
|
2014-02-12 09:19:44 +08:00
|
|
|
ret = update_ftrace_func(ip, new);
|
2012-05-01 04:20:23 +08:00
|
|
|
}
|
|
|
|
|
2012-05-31 01:36:38 +08:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-10-23 20:58:16 +08:00
|
|
|
static int is_ftrace_caller(unsigned long ip)
|
|
|
|
{
|
2014-02-12 09:19:44 +08:00
|
|
|
if (ip == ftrace_update_func)
|
2013-10-23 20:58:16 +08:00
|
|
|
return 1;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2011-08-16 21:57:10 +08:00
|
|
|
/*
|
|
|
|
* A breakpoint was added to the code address we are about to
|
|
|
|
* modify, and this is the handler that will just skip over it.
|
|
|
|
* We are either changing a nop into a trace call, or a trace
|
|
|
|
* call to a nop. While the change is taking place, we treat
|
|
|
|
* it just like it was a nop.
|
|
|
|
*/
|
|
|
|
int ftrace_int3_handler(struct pt_regs *regs)
|
|
|
|
{
|
2013-10-23 20:58:16 +08:00
|
|
|
unsigned long ip;
|
|
|
|
|
2011-08-16 21:57:10 +08:00
|
|
|
if (WARN_ON_ONCE(!regs))
|
|
|
|
return 0;
|
|
|
|
|
2013-10-23 20:58:16 +08:00
|
|
|
ip = regs->ip - 1;
|
|
|
|
if (!ftrace_location(ip) && !is_ftrace_caller(ip))
|
2011-08-16 21:57:10 +08:00
|
|
|
return 0;
|
|
|
|
|
|
|
|
regs->ip += MCOUNT_INSN_SIZE - 1;
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
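The address arithmetic in ftrace_int3_handler() can be checked with a small stand-alone example. This is a sketch under stated assumptions: the instruction at the patch site is 5 bytes long (as in the "call ftrace /* 5 bytes */" example of the original commit), the int3 opcode is 1 byte, and the names `SKETCH_INSN_SIZE` and `sketch_resume_ip` are hypothetical.

```c
#include <assert.h>

#define SKETCH_INSN_SIZE 5UL	/* assumed length of the call/nop */

/*
 * trap_ip is the saved instruction pointer after the 1-byte int3 has
 * executed, so the patch site starts at trap_ip - 1. Skipping the
 * remaining bytes resumes execution right after the 5-byte slot.
 */
static unsigned long sketch_resume_ip(unsigned long trap_ip)
{
	unsigned long site = trap_ip - 1;
	unsigned long resume = trap_ip + (SKETCH_INSN_SIZE - 1);

	assert(resume == site + SKETCH_INSN_SIZE);
	return resume;
}
```

For a patch site at 0x1000 the trap reports 0x1001, and execution resumes at 0x1005, which is the same place a 5-byte nop would have fallen through to.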
|
|
|
|
|
|
|
|
static int ftrace_write(unsigned long ip, const char *val, int size)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* On x86_64, kernel text mappings are mapped read-only with
|
|
|
|
* CONFIG_DEBUG_RODATA. So we use the kernel identity mapping instead
|
|
|
|
* of the kernel text mapping to modify the kernel text.
|
|
|
|
*
|
|
|
|
* For 32bit kernels, these mappings are the same and we can use
|
|
|
|
* kernel identity mapping to modify code.
|
|
|
|
*/
|
|
|
|
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
|
2012-11-17 05:57:32 +08:00
|
|
|
ip = (unsigned long)__va(__pa_symbol(ip));
|
2011-08-16 21:57:10 +08:00
|
|
|
|
|
|
|
return probe_kernel_write((void *)ip, val, size);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_break(unsigned long ip, const char *old)
|
|
|
|
{
|
|
|
|
unsigned char replaced[MCOUNT_INSN_SIZE];
|
|
|
|
unsigned char brk = BREAKPOINT_INSTRUCTION;
|
|
|
|
|
|
|
|
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
/* Make sure it is what we expect it to be */
|
|
|
|
if (memcmp(replaced, old, MCOUNT_INSN_SIZE) != 0)
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
if (ftrace_write(ip, &brk, 1))
|
|
|
|
return -EPERM;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_brk_on_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
|
|
|
unsigned const char *old;
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_call_replace(ip, addr);
|
|
|
|
|
|
|
|
return add_break(rec->ip, old);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static int add_brk_on_nop(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned const char *old;
|
|
|
|
|
|
|
|
old = ftrace_nop_replace();
|
|
|
|
|
|
|
|
return add_break(rec->ip, old);
|
|
|
|
}
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
/*
|
|
|
|
* If the record has the FTRACE_FL_REGS set, that means that it
|
|
|
|
* wants to convert to a callback that saves all regs. If FTRACE_FL_REGS
|
|
|
|
* is not set, then it wants to convert to the normal callback.
|
|
|
|
*/
|
|
|
|
static unsigned long get_ftrace_addr(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
if (rec->flags & FTRACE_FL_REGS)
|
|
|
|
return (unsigned long)FTRACE_REGS_ADDR;
|
|
|
|
else
|
|
|
|
return (unsigned long)FTRACE_ADDR;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The FTRACE_FL_REGS_EN is set when the record already points to
|
|
|
|
* a function that saves all the regs. Basically the '_EN' version
|
|
|
|
* represents the current state of the function.
|
|
|
|
*/
|
|
|
|
static unsigned long get_ftrace_old_addr(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
if (rec->flags & FTRACE_FL_REGS_EN)
|
|
|
|
return (unsigned long)FTRACE_REGS_ADDR;
|
|
|
|
else
|
|
|
|
return (unsigned long)FTRACE_ADDR;
|
|
|
|
}
|
|
|
|
|
2011-08-16 21:57:10 +08:00
|
|
|
static int add_breakpoints(struct dyn_ftrace *rec, int enable)
|
|
|
|
{
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = ftrace_test_record(rec, enable);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
ftrace_addr = get_ftrace_addr(rec);
|
2011-08-16 21:57:10 +08:00
|
|
|
|
|
|
|
switch (ret) {
|
|
|
|
case FTRACE_UPDATE_IGNORE:
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
case FTRACE_UPDATE_MAKE_CALL:
|
|
|
|
/* converting nop to call */
|
|
|
|
return add_brk_on_nop(rec);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
case FTRACE_UPDATE_MODIFY_CALL_REGS:
|
|
|
|
case FTRACE_UPDATE_MODIFY_CALL:
|
|
|
|
ftrace_addr = get_ftrace_old_addr(rec);
|
|
|
|
/* fall through */
|
2011-08-16 21:57:10 +08:00
|
|
|
case FTRACE_UPDATE_MAKE_NOP:
|
|
|
|
/* converting a call to a nop */
|
|
|
|
return add_brk_on_call(rec, ftrace_addr);
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* On error, we need to remove breakpoints. This needs to
|
|
|
|
* be done carefully. If the address does not currently have a
|
|
|
|
* breakpoint, we know we are done. Otherwise, we look at the
|
|
|
|
* remaining 4 bytes of the instruction. If it matches a nop
|
|
|
|
* we replace the breakpoint with the nop. Otherwise we replace
|
|
|
|
* it with the call instruction.
|
|
|
|
*/
|
|
|
|
static int remove_breakpoint(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned char ins[MCOUNT_INSN_SIZE];
|
|
|
|
unsigned char brk = BREAKPOINT_INSTRUCTION;
|
|
|
|
const unsigned char *nop;
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
/* If we fail the read, just give up */
|
|
|
|
if (probe_kernel_read(ins, (void *)ip, MCOUNT_INSN_SIZE))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
/* If this does not have a breakpoint, we are done */
|
|
|
|
if (ins[0] != brk)
|
|
|
|
return -1;
|
|
|
|
|
|
|
|
nop = ftrace_nop_replace();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the last 4 bytes of the instruction do not match
|
|
|
|
* a nop, then we assume that this is a call to ftrace_addr.
|
|
|
|
*/
|
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0) {
|
|
|
|
/*
|
|
|
|
* For extra paranoia, we check if the breakpoint is on
|
|
|
|
* a call that would actually jump to the ftrace_addr.
|
|
|
|
* If not, don't touch the breakpoint, we may just create
|
|
|
|
* a disaster.
|
|
|
|
*/
|
2012-05-01 04:20:23 +08:00
|
|
|
ftrace_addr = get_ftrace_addr(rec);
|
|
|
|
nop = ftrace_call_replace(ip, ftrace_addr);
|
|
|
|
|
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) == 0)
|
|
|
|
goto update;
|
|
|
|
|
|
|
|
/* Check both ftrace_addr and ftrace_old_addr */
|
|
|
|
ftrace_addr = get_ftrace_old_addr(rec);
|
2011-08-16 21:57:10 +08:00
|
|
|
nop = ftrace_call_replace(ip, ftrace_addr);
|
|
|
|
|
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0)
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
update:
|
ftrace/x86: Run a sync after fixup on failure
If a failure occurs while enabling a trace, it bails out and will remove
the tracepoints to be back to what the code originally was. But the fix
up had some bugs in it. By injecting a failure in the code, the fix up
ran to completion, but shortly afterward the system rebooted.
There were two bugs here.
The first was that there was no final sync run across the CPUs after the
fix up was done, and before the ftrace int3 handler flag was reset. That
means that other CPUs could still see the breakpoint and trigger on it
long after the flag was cleared, and the int3 handler would think it was
a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints
because the ftrace int3 handler flag would have prevented the int3 handler
from going further.
Here's a description of the issue:
CPU0 CPU1
---- ----
remove_breakpoint();
modifying_ftrace_code = 0;
[still sees breakpoint]
<takes trap>
[sees modifying_ftrace_code as zero]
[no breakpoint handler]
[goto failed case]
[trap exception - kernel breakpoint, no
handler]
BUG()
The second bug was that the removal of the breakpoints required the
"within()" logic updates instead of accessing the ip address directly.
As the kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and
the removal of the breakpoint is a modification of the kernel text.
The ftrace_write() includes the "within()" logic, where as, the
probe_kernel_write() does not. This prevented the breakpoint from being
removed at all.
Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz
Reported-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-21 23:43:12 +08:00
|
|
|
return ftrace_write(ip, nop, 1);
|
2011-08-16 21:57:10 +08:00
|
|
|
}
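The recovery decision in remove_breakpoint() depends only on the four bytes that follow the breakpoint. The following is a simplified user-space sketch of that decision, not the kernel routine: it compares against a single call candidate (the real code tries both the new and the old ftrace address), the byte values used with it are illustrative, and all `sketch_*` / `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_INSN_SIZE 5	/* assumed instruction length */
#define SKETCH_INT3 0xcc	/* x86 breakpoint opcode */

/*
 * Returns the first byte to write back over the breakpoint, or -1 if
 * the site holds no breakpoint or its tail matches neither candidate.
 */
static int sketch_restore_byte(const unsigned char *ins,
			       const unsigned char *nop,
			       const unsigned char *call)
{
	if (ins[0] != SKETCH_INT3)
		return -1;
	if (memcmp(&ins[1], &nop[1], SKETCH_INSN_SIZE - 1) == 0)
		return nop[0];
	if (memcmp(&ins[1], &call[1], SKETCH_INSN_SIZE - 1) == 0)
		return call[0];
	return -1;
}
```

Only one byte is ever written back, because the tail already belongs to whichever instruction was identified.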
|
|
|
|
|
|
|
|
static int add_update_code(unsigned long ip, unsigned const char *new)
|
|
|
|
{
|
|
|
|
/* skip breakpoint */
|
|
|
|
ip++;
|
|
|
|
new++;
|
|
|
|
if (ftrace_write(ip, new, MCOUNT_INSN_SIZE - 1))
|
|
|
|
return -EPERM;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_call_replace(ip, addr);
|
|
|
|
return add_update_code(ip, new);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update_nop(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_nop_replace();
|
|
|
|
return add_update_code(ip, new);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update(struct dyn_ftrace *rec, int enable)
|
|
|
|
{
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = ftrace_test_record(rec, enable);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
ftrace_addr = get_ftrace_addr(rec);
|
2011-08-16 21:57:10 +08:00
|
|
|
|
|
|
|
switch (ret) {
|
|
|
|
case FTRACE_UPDATE_IGNORE:
|
|
|
|
return 0;
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
case FTRACE_UPDATE_MODIFY_CALL_REGS:
|
|
|
|
case FTRACE_UPDATE_MODIFY_CALL:
|
2011-08-16 21:57:10 +08:00
|
|
|
case FTRACE_UPDATE_MAKE_CALL:
|
|
|
|
/* converting nop to call */
|
|
|
|
return add_update_call(rec, ftrace_addr);
|
|
|
|
|
|
|
|
case FTRACE_UPDATE_MAKE_NOP:
|
|
|
|
/* converting a call to a nop */
|
|
|
|
return add_update_nop(rec);
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int finish_update_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_call_replace(ip, addr);
|
|
|
|
|
|
|
|
if (ftrace_write(ip, new, 1))
|
|
|
|
return -EPERM;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int finish_update_nop(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_nop_replace();
|
|
|
|
|
|
|
|
if (ftrace_write(ip, new, 1))
|
|
|
|
return -EPERM;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int finish_update(struct dyn_ftrace *rec, int enable)
|
|
|
|
{
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = ftrace_update_record(rec, enable);
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
ftrace_addr = get_ftrace_addr(rec);
|
2011-08-16 21:57:10 +08:00
|
|
|
|
|
|
|
switch (ret) {
|
|
|
|
case FTRACE_UPDATE_IGNORE:
|
|
|
|
return 0;
|
|
|
|
|
2012-05-01 04:20:23 +08:00
|
|
|
case FTRACE_UPDATE_MODIFY_CALL_REGS:
|
|
|
|
case FTRACE_UPDATE_MODIFY_CALL:
|
2011-08-16 21:57:10 +08:00
|
|
|
case FTRACE_UPDATE_MAKE_CALL:
|
|
|
|
/* converting nop to call */
|
|
|
|
return finish_update_call(rec, ftrace_addr);
|
|
|
|
|
|
|
|
case FTRACE_UPDATE_MAKE_NOP:
|
|
|
|
/* converting a call to a nop */
|
|
|
|
return finish_update_nop(rec);
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void do_sync_core(void *data)
|
|
|
|
{
|
|
|
|
sync_core();
|
|
|
|
}
|
|
|
|
|
|
|
|
static void run_sync(void)
|
|
|
|
{
|
|
|
|
int enable_irqs = irqs_disabled();
|
|
|
|
|
|
|
|
/* We may be called with interrupts disabled (on bootup). */
|
|
|
|
if (enable_irqs)
|
|
|
|
local_irq_enable();
|
|
|
|
on_each_cpu(do_sync_core, NULL, 1);
|
|
|
|
if (enable_irqs)
|
|
|
|
local_irq_disable();
|
|
|
|
}
|
|
|
|
|
2012-04-27 21:13:18 +08:00
|
|
|
void ftrace_replace_code(int enable)
|
2011-08-16 21:57:10 +08:00
|
|
|
{
|
|
|
|
struct ftrace_rec_iter *iter;
|
|
|
|
struct dyn_ftrace *rec;
|
|
|
|
const char *report = "adding breakpoints";
|
|
|
|
int count = 0;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
for_ftrace_rec_iter(iter) {
|
|
|
|
rec = ftrace_rec_iter_record(iter);
|
|
|
|
|
|
|
|
ret = add_breakpoints(rec, enable);
|
|
|
|
if (ret)
|
|
|
|
goto remove_breakpoints;
|
|
|
|
count++;
|
|
|
|
}
|
|
|
|
|
|
|
|
run_sync();
|
|
|
|
|
|
|
|
report = "updating code";
|
|
|
|
|
|
|
|
for_ftrace_rec_iter(iter) {
|
|
|
|
rec = ftrace_rec_iter_record(iter);
|
|
|
|
|
|
|
|
ret = add_update(rec, enable);
|
|
|
|
if (ret)
|
|
|
|
goto remove_breakpoints;
|
|
|
|
}
|
|
|
|
|
|
|
|
run_sync();
|
|
|
|
|
|
|
|
report = "removing breakpoints";
|
|
|
|
|
|
|
|
for_ftrace_rec_iter(iter) {
|
|
|
|
rec = ftrace_rec_iter_record(iter);
|
|
|
|
|
|
|
|
ret = finish_update(rec, enable);
|
|
|
|
if (ret)
|
|
|
|
goto remove_breakpoints;
|
|
|
|
}
|
|
|
|
|
|
|
|
run_sync();
|
|
|
|
|
|
|
|
return;
|
|
|
|
|
|
|
|
remove_breakpoints:
|
|
|
|
ftrace_bug(ret, rec ? rec->ip : 0);
|
|
|
|
printk(KERN_WARNING "Failed on %s (%d):\n", report, count);
|
|
|
|
for_ftrace_rec_iter(iter) {
|
|
|
|
rec = ftrace_rec_iter_record(iter);
|
|
|
|
remove_breakpoint(rec);
|
|
|
|
}
|
2014-02-21 23:43:12 +08:00
|
|
|
run_sync();
|
2011-08-16 21:57:10 +08:00
|
|
|
}
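The three passes of ftrace_replace_code() (add breakpoints, update the tail, replace the breakpoint) can be illustrated on a plain byte buffer. This is a user-space sketch only: it leaves out the run_sync() calls that separate the passes in the kernel, it writes to ordinary memory rather than kernel text, and the `sketch_*` / `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_INSN_SIZE 5	/* assumed instruction length */
#define SKETCH_INT3 0xcc	/* x86 breakpoint opcode */

/* Pass 1: verify the old bytes, then cover the first byte with int3. */
static int sketch_add_break(unsigned char *site, const unsigned char *old)
{
	if (memcmp(site, old, SKETCH_INSN_SIZE) != 0)
		return -1;
	site[0] = SKETCH_INT3;
	return 0;
}

/* Pass 2: write every byte of the new instruction except the first. */
static void sketch_add_update(unsigned char *site,
			      const unsigned char *new_insn)
{
	memcpy(site + 1, new_insn + 1, SKETCH_INSN_SIZE - 1);
}

/* Pass 3: replace the int3 with the first byte of the new instruction. */
static void sketch_finish_update(unsigned char *site,
				 const unsigned char *new_insn)
{
	site[0] = new_insn[0];
}
```

At every intermediate step the first byte is either a complete old instruction's opcode, an int3, or a complete new instruction's opcode, so a CPU fetching from the site never decodes a mix of old and new bytes.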
|
|
|
|
|
2012-05-31 01:36:38 +08:00
|
|
|
static int
|
|
|
|
ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
|
|
|
|
unsigned const char *new_code)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = add_break(ip, old_code);
|
|
|
|
if (ret)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
run_sync();
|
|
|
|
|
|
|
|
ret = add_update_code(ip, new_code);
|
|
|
|
if (ret)
|
|
|
|
goto fail_update;
|
|
|
|
|
|
|
|
run_sync();
|
|
|
|
|
|
|
|
ret = ftrace_write(ip, new_code, 1);
|
|
|
|
if (ret) {
|
|
|
|
ret = -EPERM;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
out:
|
2014-02-25 00:12:20 +08:00
|
|
|
run_sync();
|
2012-05-31 01:36:38 +08:00
|
|
|
return ret;
|
|
|
|
|
|
|
|
fail_update:
|
2014-02-21 23:43:12 +08:00
|
|
|
ftrace_write(ip, old_code, 1);
|
2012-05-31 01:36:38 +08:00
|
|
|
goto out;
|
|
|
|
}
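The sequence in ftrace_modify_code() above (breakpoint, sync, update, sync, final byte, sync) can be sketched in userspace on a plain byte buffer. This is a minimal illustration only: `patch_insn()`, `sync_cpus()` and `sync_count` are hypothetical names, and the bodies of add_break() and add_update_code() are not shown in this listing, so the assumption here is that they write the int3 byte and the tail of the new instruction respectively.

```c
#include <string.h>

#define INSN_SIZE 5
#define INT3      0xcc

/* Hypothetical stand-in for run_sync(); it only counts the calls. */
int sync_count;

void sync_cpus(void)
{
    sync_count++;
}

/*
 * Replace old_code with new_code at text in three steps, so that a
 * concurrent reader sees either a breakpoint or a complete instruction.
 * Returns 0 on success, -1 if text does not hold old_code.
 */
int patch_insn(unsigned char *text, const unsigned char *old_code,
               const unsigned char *new_code)
{
    if (memcmp(text, old_code, INSN_SIZE) != 0)
        return -1;

    text[0] = INT3;                                  /* 1: breakpoint */
    sync_cpus();

    memcpy(text + 1, new_code + 1, INSN_SIZE - 1);   /* 2: tail bytes */
    sync_cpus();

    text[0] = new_code[0];                           /* 3: first byte */
    sync_cpus();

    return 0;
}
```

The first byte is written last so that the breakpoint keeps guarding the location while the remaining four bytes change.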
|
|
|
|
|
2011-08-16 21:57:10 +08:00
|
|
|
void arch_ftrace_update_code(int command)
|
|
|
|
{
|
2012-05-31 01:26:37 +08:00
|
|
|
/* See comment above by declaration of modifying_ftrace_code */
|
|
|
|
atomic_inc(&modifying_ftrace_code);
|
2011-08-16 21:57:10 +08:00
|
|
|
|
2012-04-27 21:13:18 +08:00
|
|
|
ftrace_modify_all_code(command);
|
2011-08-16 21:57:10 +08:00
|
|
|
|
2012-05-31 01:26:37 +08:00
|
|
|
atomic_dec(&modifying_ftrace_code);
|
2011-08-16 21:57:10 +08:00
|
|
|
}
|
|
|
|
|
2008-05-13 03:20:43 +08:00
|
|
|
int __init ftrace_dyn_arch_init(void *data)
|
2008-05-13 03:20:42 +08:00
|
|
|
{
|
ftrace: use only 5 byte nops for x86
Mathieu Desnoyers revealed a bug in the original code. The nop that is
used to replace the mcount caller can be a two part nop. This runs the
risk where a process can be preempted after executing the first nop, but
before the second part of the nop.
The ftrace code calls kstop_machine to keep multiple CPUs from executing
code that is being modified, but it does not protect against a task being
preempted in the middle of a two part nop.
If the above preemption happens and the tracer is enabled, after the
kstop_machine runs, all those nops will be calls to the trace function.
If the process that was preempted between the two nops is executed
again, it will execute half of the call to the trace function, and this
might crash the system.
This patch instead uses what both the latest Intel and AMD specs suggest.
That is the P6_NOP5 sequence of "0x0f 0x1f 0x44 0x00 0x00".
Note, some older CPUs and QEMU might fault on this nop, so this nop
is executed with fault handling first. If it detects a fault, it will then
use the code "0x66 0x66 0x66 0x66 0x90". If that faults, it will then
default to a simple "jmp 1f; .byte 0x00 0x00 0x00; 1:". The jmp is
not optimal but will do if the first two cannot be executed.
TODO: Examine the cpuid to determine the nop to use.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 06:05:05 +08:00
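The fallback order described in the commit message above can be sketched as a small selection routine. This is a userspace sketch, not the kernel's implementation: the kernel executes each candidate under fault handling, whereas here a hypothetical callback (`nop_probe_fn`) reports whether a candidate would fault. The array names and the three sample probes are made up for illustration; the byte sequences come from the message, with the jmp encoded as the short jump `0xeb 0x03`.

```c
#define NOP_SIZE 5

/* Candidates in order of preference, as listed in the commit message. */
static const unsigned char p6_nop5[NOP_SIZE]     = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const unsigned char prefix_nop5[NOP_SIZE] = { 0x66, 0x66, 0x66, 0x66, 0x90 };
/* jmp 1f; .byte 0x00 0x00 0x00; 1: */
static const unsigned char jmp_nop5[NOP_SIZE]    = { 0xeb, 0x03, 0x00, 0x00, 0x00 };

/* Hypothetical probe: returns nonzero if executing insn would fault. */
typedef int (*nop_probe_fn)(const unsigned char *insn);

/* Pick the first candidate that does not fault; the jmp always works. */
const unsigned char *select_nop(nop_probe_fn faults)
{
    if (!faults(p6_nop5))
        return p6_nop5;
    if (!faults(prefix_nop5))
        return prefix_nop5;
    return jmp_nop5;
}

/* Sample probes for illustration only. */
int never_faults(const unsigned char *insn)
{
    (void)insn;
    return 0;
}

int faults_on_p6(const unsigned char *insn)
{
    return insn[0] == 0x0f;
}

int always_faults(const unsigned char *insn)
{
    (void)insn;
    return 1;
}
```

All three candidates are exactly five bytes, so whichever one is chosen is a single instruction and the two part nop problem cannot occur.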
|
|
|
/* The return code is returned via data */
|
|
|
|
*(unsigned long *)data = 0;
|
2008-05-13 03:20:43 +08:00
|
|
|
|
2008-05-13 03:20:42 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2008-11-11 14:03:45 +08:00
|
|
|
#endif
|
2008-11-16 13:02:06 +08:00
|
|
|
|
2008-11-26 04:07:04 +08:00
|
|
|
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
2008-11-16 13:02:06 +08:00
|
|
|
|
2008-11-26 13:16:24 +08:00
|
|
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
|
|
|
extern void ftrace_graph_call(void);
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
|
2008-11-26 13:16:24 +08:00
|
|
|
{
|
2014-02-12 09:19:44 +08:00
|
|
|
static union ftrace_code_union calc;
|
2008-11-26 13:16:24 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
/* Jmp not a call (ignore the .e8) */
|
|
|
|
calc.e8 = 0xe9;
|
|
|
|
calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
|
2008-11-26 13:16:24 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
/*
|
|
|
|
* ftrace external locks synchronize the access to the static variable.
|
|
|
|
*/
|
|
|
|
return calc.code;
|
|
|
|
}
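The encoding built by ftrace_jmp_replace() above is a 5-byte near jump: opcode 0xe9 followed by a 32-bit displacement relative to the end of the instruction. A minimal userspace sketch follows. `encode_jmp()` is a hypothetical helper, and it assumes that ftrace_calc_offset(), which is not shown in this listing, simply returns the target address minus its first argument.

```c
#include <stdint.h>

#define MCOUNT_INSN_SIZE 5

/*
 * Encode "jmp addr" for an instruction located at ip.
 * The displacement is relative to the address of the next instruction,
 * and is stored little-endian as x86 expects.
 */
void encode_jmp(uint64_t ip, uint64_t addr,
                unsigned char out[MCOUNT_INSN_SIZE])
{
    /* Unsigned wrap-around yields the two's complement displacement. */
    uint32_t rel = (uint32_t)(addr - (ip + MCOUNT_INSN_SIZE));

    out[0] = 0xe9;                       /* jmp rel32 */
    out[1] = (unsigned char)(rel & 0xff);
    out[2] = (unsigned char)((rel >> 8) & 0xff);
    out[3] = (unsigned char)((rel >> 16) & 0xff);
    out[4] = (unsigned char)((rel >> 24) & 0xff);
}
```

For example, a jump from 0x1000 to 0x1010 has a displacement of 0x1010 - 0x1005 = 0x0b, while a jump back to 0x0ff0 has a displacement of -0x15, stored as 0xffffffeb.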
|
2008-11-26 13:16:24 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
static int ftrace_mod_jmp(unsigned long ip, void *func)
|
|
|
|
{
|
|
|
|
unsigned char *new;
|
2008-11-26 13:16:24 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
new = ftrace_jmp_replace(ip, (unsigned long)func);
|
2008-11-26 13:16:24 +08:00
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
return update_ftrace_func(ip, new);
|
2008-11-26 13:16:24 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_enable_ftrace_graph_caller(void)
|
|
|
|
{
|
|
|
|
unsigned long ip = (unsigned long)(&ftrace_graph_call);
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
return ftrace_mod_jmp(ip, &ftrace_graph_caller);
|
2008-11-26 13:16:24 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_disable_ftrace_graph_caller(void)
|
|
|
|
{
|
|
|
|
unsigned long ip = (unsigned long)(&ftrace_graph_call);
|
|
|
|
|
2014-02-12 09:19:44 +08:00
|
|
|
return ftrace_mod_jmp(ip, &ftrace_stub);
|
2008-11-26 13:16:24 +08:00
|
|
|
}
|
|
|
|
|
2008-11-16 13:02:06 +08:00
|
|
|
#endif /* CONFIG_DYNAMIC_FTRACE */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Hook the return address and push it in the stack of return addrs
|
|
|
|
* in current thread info.
|
|
|
|
*/
|
function-graph: add stack frame test
In case gcc does something funny with the stack frames, or the return
from function code, we would like to detect that.
An arch may implement passing of a variable that is unique to the
function and can be saved on entering a function and can be tested
when exiting the function. Usually the frame pointer can be used for
this purpose.
This patch also implements this for x86, where it passes in the stack
frame of the parent function, and will test that frame on exit.
There was a case in x86_32 with optimize for size (-Os) where, for a
few functions, gcc would align the stack frame and place a copy of the
return address into it. The function graph tracer modified the copy and
not the actual return address. On return from the function, it did not go
to the tracer hook, but returned to the parent. This broke the function
graph tracer, because the return of the parent (where gcc did not do
this funky manipulation) returned to the location that the child function
was supposed to. This caused strange kernel crashes.
This test detected the problem and pointed out where the issue was.
This modifies the parameters of one of the functions that the arch
specific code calls, so it includes changes to arch code to accommodate
the new prototype.
Note, I notice that the parisc arch implements its own push_return_trace.
This is now a generic function and the ftrace_push_return_trace should be
used instead. This patch does not touch that code.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-19 00:45:08 +08:00
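The stack frame test described above can be sketched as a small return stack that records a frame pointer on entry and checks it on exit. This is a userspace sketch under assumptions: `struct ret_stack`, `push_return()` and `pop_return()` are hypothetical names, not the kernel's ftrace_push_return_trace() interface. The kernel's reaction to a mismatch is not shown in this listing, so the sketch only reports it.

```c
#define RET_STACK_DEPTH 8

/* One saved return address plus the frame pointer seen at entry. */
struct ret_entry {
    unsigned long ret;
    unsigned long fp;
};

struct ret_stack {
    struct ret_entry entries[RET_STACK_DEPTH];
    int top;                       /* -1 when empty */
};

/* On function entry: remember the real return address and frame. */
int push_return(struct ret_stack *s, unsigned long ret, unsigned long fp)
{
    if (s->top + 1 >= RET_STACK_DEPTH)
        return -1;                 /* no room, caller must not hook */
    s->top++;
    s->entries[s->top].ret = ret;
    s->entries[s->top].fp = fp;
    return 0;
}

/*
 * On function exit: hand back the saved return address only if the
 * frame pointer matches the one recorded at entry.
 * Returns 0 on success, -1 on an empty stack or a frame mismatch.
 */
int pop_return(struct ret_stack *s, unsigned long fp, unsigned long *ret)
{
    if (s->top < 0)
        return -1;
    if (s->entries[s->top].fp != fp)
        return -1;                 /* mismatch: entry is left in place */
    *ret = s->entries[s->top].ret;
    s->top--;
    return 0;
}
```

A mismatch on exit means the hooked return address did not belong to the frame that is now returning, which is the situation the -Os case in the commit message produced.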
|
|
|
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
|
|
|
|
unsigned long frame_pointer)
|
2008-11-16 13:02:06 +08:00
|
|
|
{
|
|
|
|
unsigned long old;
|
|
|
|
int faulted;
|
2008-11-26 07:57:25 +08:00
|
|
|
struct ftrace_graph_ent trace;
|
2008-11-16 13:02:06 +08:00
|
|
|
unsigned long return_hooker = (unsigned long)
|
|
|
|
&return_to_handler;
|
|
|
|
|
2008-12-06 10:43:41 +08:00
|
|
|
if (unlikely(atomic_read(&current->tracing_graph_pause)))
|
2008-11-16 13:02:06 +08:00
|
|
|
return;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Protect against fault, even if it shouldn't
|
|
|
|
* happen. This tool is too intrusive to
|
|
|
|
* ignore such a protection.
|
|
|
|
*/
|
|
|
|
asm volatile(
|
2009-02-11 00:53:23 +08:00
|
|
|
"1: " _ASM_MOV " (%[parent]), %[old]\n"
|
|
|
|
"2: " _ASM_MOV " %[return_hooker], (%[parent])\n"
|
2008-11-16 13:02:06 +08:00
|
|
|
" movl $0, %[faulted]\n"
|
2009-02-11 02:07:13 +08:00
|
|
|
"3:\n"
|
2008-11-16 13:02:06 +08:00
|
|
|
|
|
|
|
".section .fixup, \"ax\"\n"
|
2009-02-11 02:07:13 +08:00
|
|
|
"4: movl $1, %[faulted]\n"
|
|
|
|
" jmp 3b\n"
|
2008-11-16 13:02:06 +08:00
|
|
|
".previous\n"
|
|
|
|
|
2009-02-11 02:07:13 +08:00
|
|
|
_ASM_EXTABLE(1b, 4b)
|
|
|
|
_ASM_EXTABLE(2b, 4b)
|
2008-11-16 13:02:06 +08:00
|
|
|
|
2009-05-14 01:52:19 +08:00
|
|
|
: [old] "=&r" (old), [faulted] "=r" (faulted)
|
2009-02-11 00:53:23 +08:00
|
|
|
: [parent] "r" (parent), [return_hooker] "r" (return_hooker)
|
2008-11-16 13:02:06 +08:00
|
|
|
: "memory"
|
|
|
|
);
|
|
|
|
|
2008-12-03 12:50:02 +08:00
|
|
|
if (unlikely(faulted)) {
|
|
|
|
ftrace_graph_stop();
|
|
|
|
WARN_ON(1);
|
2008-11-16 13:02:06 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2008-11-26 07:57:25 +08:00
|
|
|
trace.func = self_addr;
|
2011-02-12 09:36:02 +08:00
|
|
|
trace.depth = current->curr_ret_stack + 1;
|
2008-11-26 07:57:25 +08:00
|
|
|
|
2008-12-03 12:50:05 +08:00
|
|
|
/* Only trace if the calling function expects to */
|
|
|
|
if (!ftrace_graph_entry(&trace)) {
|
|
|
|
*parent = old;
|
2011-02-12 09:36:02 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (ftrace_push_return_trace(old, self_addr, &trace.depth,
|
|
|
|
frame_pointer) == -EBUSY) {
|
|
|
|
*parent = old;
|
|
|
|
return;
|
2008-12-03 12:50:05 +08:00
|
|
|
}
|
2008-11-16 13:02:06 +08:00
|
|
|
}
|
2008-11-26 04:07:04 +08:00
|
|
|
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
|