linux/kernel/trace
Mark Rutland cbad0fb2d8 ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS
Architectures without dynamic ftrace trampolines incur an overhead when
multiple ftrace_ops are enabled with distinct filters. in these cases,
each call site calls a common trampoline which uses
ftrace_ops_list_func() to iterate over all enabled ftrace functions, and
so incurs an overhead relative to the size of this list (including RCU
protection overhead).

Architectures with dynamic ftrace trampolines avoid this overhead for
call sites which have a single associated ftrace_ops. In these cases,
the dynamic trampoline is customized to branch directly to the relevant
ftrace function, avoiding the list overhead.

On some architectures it's impractical and/or undesirable to implement
dynamic ftrace trampolines. For example, arm64 has limited branch ranges
and cannot always directly branch from a call site to an arbitrary
address (e.g. from a kernel text address to an arbitrary module
address). Calls from modules to core kernel text can be indirected via
PLTs (allocated at module load time) to address this, but the same is
not possible from calls from core kernel text.

Using an indirect branch from a call site to an arbitrary trampoline is
possible, but requires several more instructions in the function
prologue (or immediately before it), and/or comes with far more complex
requirements for patching.

Instead, this patch adds a new option, where an architecture can
associate each call site with a pointer to an ftrace_ops, placed at a
fixed offset from the call site. A shared trampoline can recover this
pointer and call ftrace_ops::func() without needing to go via
ftrace_ops_list_func(), avoiding the associated overhead.

This avoids issues with branch range limitations, and avoids the need to
allocate and manipulate dynamic trampolines, making it far simpler to
implement and maintain, while having similar performance
characteristics.

Note that this allows for dynamic ftrace_ops to be invoked directly from
an architecture's ftrace_caller trampoline, whereas existing code forces
the use of ftrace_ops_get_list_func(), which is in part necessary to
permit the ftrace_ops to be freed once unregistered *and* to avoid
branch/address-generation range limitation on some architectures (e.g.
where ops->func is a module address, and may be outside of the direct
branch range for callsites within the main kernel image).

The CALL_OPS approach avoids this problems and is safe as:

* The existing synchronization in ftrace_shutdown() using
  ftrace_shutdown() using synchronize_rcu_tasks_rude() (and
  synchronize_rcu_tasks()) ensures that no tasks hold a stale reference
  to an ftrace_ops (e.g. in the middle of the ftrace_caller trampoline,
  or while invoking ftrace_ops::func), when that ftrace_ops is
  unregistered.

  Arguably this could also be relied upon for the existing scheme,
  permitting dynamic ftrace_ops to be invoked directly when ops->func is
  in range, but this will require additional logic to handle branch
  range limitations, and is not handled by this patch.

* Each callsite's ftrace_ops pointer literal can hold any valid kernel
  address, and is updated atomically. As an architecture's ftrace_caller
  trampoline will atomically load the ops pointer then dereference
  ops->func, there is no risk of invoking ops->func with a mismatches
  ops pointer, and updates to the ops pointer do not require special
  care.

A subsequent patch will implement architectures support for arm64. There
should be no functional change as a result of this patch alone.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:49:42 +00:00
..
rv rv/monitors: Move monitor structure in rodata 2022-12-20 11:46:40 -05:00
blktrace.c blktrace: Fix output non-blktrace event when blk_classic option enabled 2022-12-08 09:26:11 -07:00
bpf_trace.c bpf: Introduce might_sleep field in bpf_func_proto 2022-11-24 12:27:13 -08:00
bpf_trace.h
error_report-traces.c
fgraph.c arm64 fixes for 5.19-rc1: 2022-06-03 14:05:34 -07:00
fprobe.c tracing/fprobe: Fix to check whether fprobe is registered correctly 2022-11-04 08:50:07 +09:00
ftrace_internal.h
ftrace.c ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS 2023-01-24 11:49:42 +00:00
Kconfig ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS 2023-01-24 11:49:42 +00:00
kprobe_event_gen_test.c tracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test_exit() 2022-11-18 10:15:34 +09:00
Makefile rv: Add Runtime Verification (RV) interface 2022-07-30 14:01:28 -04:00
pid_list.c tracing: Cleanup double word in comment 2022-04-26 17:58:50 -04:00
pid_list.h tracing: Create a sparse bitmask for pid filtering 2021-10-05 17:38:45 -04:00
power-traces.c
preemptirq_delay_test.c
rethook.c rethook: fix a potential memleak in rethook_alloc() 2022-11-18 10:15:34 +09:00
ring_buffer_benchmark.c ring_buffer: Remove unused "event" parameter 2022-11-23 19:08:30 -05:00
ring_buffer.c ring-buffer: Handle resize in early boot up 2022-12-10 13:36:04 -05:00
rpm-traces.c
synth_event_gen_test.c tracing: Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event() 2022-11-17 17:51:38 -05:00
trace_benchmark.c tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_benchmark.h tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_boot.c tracing: Initialize integer variable to prevent garbage return value 2022-05-26 21:13:00 -04:00
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Free buffers when a used dynamic event is removed 2022-11-23 19:07:12 -05:00
trace_dynevent.h
trace_entries.h
trace_eprobe.c tracing: Fix race where eprobes can be called before the event 2022-11-24 00:42:18 +09:00
trace_event_perf.c tracing/perf: Use strndup_user instead of kzalloc/strncpy_from_user 2022-11-23 19:08:31 -05:00
trace_events_filter_test.h
trace_events_filter.c tracing/filter: Call filter predicate functions directly via a switch statement 2022-09-26 13:01:10 -04:00
trace_events_hist.c tracing/hist: Fix issue of losting command info in error_log 2022-12-10 13:36:04 -05:00
trace_events_inject.c tracing: Support __rel_loc relative dynamic data location attribute 2021-12-06 15:37:21 -05:00
trace_events_synth.c tracing: Fix issue of missing one synthetic field 2022-12-10 13:36:04 -05:00
trace_events_trigger.c tracing: Do not synchronize freeing of trigger filter on boot up 2022-12-14 08:50:56 -05:00
trace_events_user.c Tracing updates for 6.2: 2022-12-15 18:01:16 -08:00
trace_events.c tracing: remove unnecessary trace_trigger ifdef 2022-12-10 13:36:04 -05:00
trace_export.c
trace_functions_graph.c tracing: in_irq() cleanup 2021-10-13 18:19:41 -04:00
trace_functions.c ftrace: disable preemption when recursion locked 2021-10-27 11:21:49 -04:00
trace_hwlat.c trace/hwlat: make use of the helper function kthread_run_on_cpu() 2022-01-15 16:30:24 +02:00
trace_irqsoff.c
trace_kdb.c
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_kprobe.c trace/kprobe: remove duplicated calls of ring_buffer_event_data 2022-12-09 23:48:05 -05:00
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing/osnoise: Add preempt and/or irq disabled options 2022-12-10 13:36:05 -05:00
trace_output.c tracing: Fix some checker warnings 2022-12-10 13:36:05 -05:00
trace_output.h
trace_preemptirq.c tracing: hold caller_addr to hardirq_{enable,disable}_ip 2022-09-06 22:26:00 -04:00
trace_printk.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_probe_kernel.h tracing: Add "(fault)" name injection to kernel probes 2022-10-12 13:50:20 -04:00
trace_probe_tmpl.h tracing/probes: Add symstr type for dynamic events 2022-12-15 08:48:57 +09:00
trace_probe.c Trace probes updates for 6.2: 2022-12-21 18:57:24 -08:00
trace_probe.h tracing/probes: Reject symbol/symstr type for uprobe 2022-12-15 09:00:20 +09:00
trace_recursion_record.c tracing: Use trace_create_file() to simplify creation of tracefs entries 2022-05-26 21:12:52 -04:00
trace_sched_switch.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_sched_wakeup.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_selftest_dynamic.c
trace_selftest.c x86/ftrace: Make it call depth tracking aware 2022-10-17 16:41:19 +02:00
trace_seq.c
trace_stack.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.h
trace_synth.h tracing: synth events: increase max fields count 2021-09-08 15:29:16 -04:00
trace_syscalls.c tracing: Remove unused __bad_type_size() method 2022-11-17 20:21:06 -05:00
trace_uprobe.c tracing/probes: Reject symbol/symstr type for uprobe 2022-12-15 09:00:20 +09:00
trace.c Trace probes updates for 6.2: 2022-12-21 18:57:24 -08:00
trace.h tracing: Fix some checker warnings 2022-12-10 13:36:05 -05:00
tracing_map.c tracing: Remove unused variable 'dups' 2022-10-03 12:20:31 -04:00
tracing_map.h