o MAINTAINERS files additions and changes.
o Fix hotplug warning in nohz code.
o Tick dependency changes by Zqiang.
o Lazy-RCU shrinker fixes by Zqiang.
o rcu-tasks stall reporting improvements by Neeraj.
o Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
name for robustness.
o Documentation Updates:
o Significant changes to srcu_struct size.
o Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
o rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
o Other misc changes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
+Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
=Rgbj
-----END PGP SIGNATURE-----
Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux
Pull RCU updates from Joel Fernandes:
- Updates and additions to MAINTAINERS files, with Boqun being added to
the RCU entry and Zqiang being added as an RCU reviewer.
I have also transitioned from reviewer to maintainer; however, Paul
will be taking over sending RCU pull-requests for the next merge
window.
- Resolution of hotplug warning in nohz code, achieved by fixing
cpu_is_hotpluggable() through interaction with the nohz subsystem.
Tick dependency modifications by Zqiang, focusing on fixing usage of
the TICK_DEP_BIT_RCU_EXP bitmask.
- Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
kernels, fixed by Zqiang.
- Improvements to rcu-tasks stall reporting by Neeraj.
- Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
increased robustness, affecting several components like mac802154,
drbd, vmw_vmci, tracing, and more.
A report by Eric Dumazet showed that the API could be unknowingly
used in an atomic context, so we'd rather make sure they know what
they're asking for by being explicit:
https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/
- Documentation updates, including corrections to spelling,
clarifications in comments, and improvements to the srcu_size_state
comments.
- Better srcu_struct cache locality for readers, by adjusting the size
of srcu_struct in support of SRCU usage by Christoph Hellwig.
- Teach lockdep to detect deadlocks between srcu_read_lock() vs
synchronize_srcu() contributed by Boqun.
Previously lockdep could not detect such deadlocks, now it can.
- Integration of rcutorture and rcu-related tools, targeted for v6.4
from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
module parameter, and more
- Miscellaneous changes, various code cleanups and comment improvements
* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
checkpatch: Error out if deprecated RCU API used
mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
rcu/trace: use strscpy() to instead of strncpy()
tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
...
This commit saves a line of code by switching from strncpy() to strscpy()
by permitting the later NUL assignment to be removed. While in the area,
save another line by taking advantage of 100 characters.
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Fix the rcutorturename field so that its size is correctly reported in
the text format embedded in trace.dat files. As it stands, it is
reported as being of size 1:
field:char rcutorturename[8]; offset:8; size:1; signed:0;
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: stable@vger.kernel.org
Fixes: 04ae87a520 ("ftrace: Rework event_create_dir()")
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
[ boqun: Add "Cc" and "Fixes" tags per Steven ]
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit saves a few lines by checking first for an empty callback
list. If the callback list is empty, then that CPU is taken care of,
regardless of its online or nocb state. Also simplify tracing accordingly
and fold a few lines together.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that ->nocb_timer and ->nocb_bypass_timer have become quite similar,
this commit merges them together. A new RCU_NOCB_WAKE_BYPASS wake level
is introduced. As a result, timers perform all kinds of deferred wake
ups but other deferred wakeup callsites only handle non-bypass wakeups
in order not to wake up rcuo too early.
The timer also unconditionally executes a full barrier so as to order
timer_pending() and callback enqueue although the path performing
RCU_NOCB_WAKE_FORCE that makes use of it is debatable. It should also
test against the rdp leader instead of the current rdp.
This unconditional full barrier shouldn't bring visible overhead since
these timers almost never fire.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
New feature:
The "func-no-repeats" option in tracefs/options directory. When set
the function tracer will detect if the current function being traced
is the same as the previous one, and instead of recording it, it will
keep track of the number of times that the function is repeated in a row.
And when another function is recorded, it will write a new event that
shows the function that repeated, the number of times it repeated and
the time stamp of when the last repeated function occurred.
Enhancements:
In order to implement the above "func-no-repeats" option, the ring
buffer timestamp can now give the accurate timestamp of the event
as it is being recorded, instead of having to record an absolute
timestamp for all events. This helps the histogram code which no longer
needs to waste ring buffer space.
New validation logic to make sure all trace events that access
dereferenced pointers do so in a safe way, and will warn otherwise.
Fixes:
No longer limit the PIDs of tasks that are recorded for "saved_cmdlines"
to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger
range. This caused the mapping of PIDs to the task names to be dropped
for all tasks with a PID greater than 32768.
Change trace_clock_global() to never block. This caused a deadlock.
Clean ups:
Typos, prototype fixes, and removing of duplicate or unused code.
Better management of ftrace_page allocations.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYI/1vBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qiL0AP9EemIC5TDh2oihqLRNeUjdTu0ryEoM
HRFqxozSF985twD/bfkt86KQC8rLHwxTbxQZ863bmdaC6cMGFhWiF+H/MAs=
=psYt
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New feature:
- A new "func-no-repeats" option in tracefs/options directory.
When set the function tracer will detect if the current function
being traced is the same as the previous one, and instead of
recording it, it will keep track of the number of times that the
function is repeated in a row. And when another function is
recorded, it will write a new event that shows the function that
repeated, the number of times it repeated and the time stamp of
when the last repeated function occurred.
Enhancements:
- In order to implement the above "func-no-repeats" option, the ring
buffer timestamp can now give the accurate timestamp of the event
as it is being recorded, instead of having to record an absolute
timestamp for all events. This helps the histogram code which no
longer needs to waste ring buffer space.
- New validation logic to make sure all trace events that access
dereferenced pointers do so in a safe way, and will warn otherwise.
Fixes:
- No longer limit the PIDs of tasks that are recorded for
"saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows
for a much larger range. This caused the mapping of PIDs to the
task names to be dropped for all tasks with a PID greater than
32768.
- Change trace_clock_global() to never block. This caused a deadlock.
Clean ups:
- Typos, prototype fixes, and removing of duplicate or unused code.
- Better management of ftrace_page allocations"
* tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits)
tracing: Restructure trace_clock_global() to never block
tracing: Map all PIDs to command lines
ftrace: Reuse the output of the function tracer for func_repeats
tracing: Add "func_no_repeats" option for function tracing
tracing: Unify the logic for function tracing options
tracing: Add method for recording "func_repeats" events
tracing: Add "last_func_repeats" to struct trace_array
tracing: Define new ftrace event "func_repeats"
tracing: Define static void trace_print_time()
ftrace: Simplify the calculation of page number for ftrace_page->records some more
ftrace: Store the order of pages allocated in ftrace_page
tracing: Remove unused argument from "ring_buffer_time_stamp()
tracing: Remove duplicate struct declaration in trace_events.h
tracing: Update create_system_filter() kernel-doc comment
tracing: A minor cleanup for create_system_filter()
kernel: trace: Mundane typo fixes in the file trace_events_filter.c
tracing: Fix various typos in comments
scripts/recordmcount.pl: Make vim and emacs indent the same
scripts/recordmcount.pl: Make indent spacing consistent
tracing: Add a verifier to check string pointers for trace events
...
This commit adds a trace event which allows tracing the beginnings of RCU
CPU stall warnings on systems where sysctl_panic_on_rcu_stall is disabled.
The first parameter is the name of RCU flavor like other trace events.
The second parameter indicates whether this is a stall of an expedited
grace period, a self-detected stall of a normal grace period, or a stall
of a normal grace period detected by some CPU other than the one that
is stalled.
RCU CPU stall warnings are often caused by external-to-RCU issues,
for example, in interrupt handling or task scheduling. Therefore,
this event uses TRACE_EVENT, not TRACE_EVENT_RCU, to avoid requiring
those interested in tracing RCU CPU stalls to rebuild their kernels
with CONFIG_RCU_TRACE=y.
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds tracing to track how the segcb list changes before/after
acceleration, during queuing and during dequeuing.
This tracing helped discover an optimization that avoided needless GP
requests when no callbacks were accelerated. The tracing overhead is
minimal as each segment's length is now stored in the respective segment.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
GP numbers start from -300 and gp_seq numbers start of -1200 (for a
shift of 2). These negative numbers are printed as unsigned long which
not only takes up more text space, but is rather confusing to the reader
as they have to constantly expend energy to truncate the number. Just
print the negative numbering directly.
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The following changes are introduced:
1. Rename rcu_invoke_kfree_callback() to rcu_invoke_kvfree_callback(),
as well as the associated trace events, so the rcu_kfree_callback(),
becomes rcu_kvfree_callback(). The reason is to be aligned with kvfree()
notation.
2. Rename __is_kfree_rcu_offset to __is_kvfree_rcu_offset. All RCU
paths use kvfree() now instead of kfree(), thus rename it.
3. Rename kfree_call_rcu() to the kvfree_call_rcu(). The reason is,
it is capable of freeing vmalloc() memory now. Do the same with
__kfree_rcu() macro, it becomes __kvfree_rcu(), the goal is the
same.
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The different strings used for describing the polarity are
Start, End and StillNonIdle. Since StillIdle is not used in any trace
point for rcu_dyntick, it can be removed and StillNonIdle can be added
in the description. Because StillNonIdle is used in a few tracepoints for
rcu_dyntick.
Similarly, USER, IDLE and IRQ are used for describing context in
the rcu_dyntick tracepoints. Since, "KERNEL" is not used for any
of the rcu_dyntick tracepoints, remove it from the description.
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, rcu_barrier() ignores offline CPUs, However, it is possible
for an offline no-CBs CPU to have callbacks queued, and rcu_barrier()
must wait for those callbacks. This commit therefore makes rcu_barrier()
directly invoke the rcu_barrier_func() with interrupts disabled for such
CPUs. This requires passing the CPU number into this function so that
it can entrain the rcu_barrier() callback onto the correct CPU's callback
list, given that the code must instead execute on the current CPU.
While in the area, this commit fixes a bug where the first CPU's callback
might have been invoked before rcu_segcblist_entrain() returned, which
would also result in an early wakeup.
Fixes: 5d6742b377 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ paulmck: Apply optimization feedback from Boqun Feng. ]
Cc: <stable@vger.kernel.org> # 5.5.x
The event is given three parameters, first one is the name
of RCU flavour, second one is the number of elements in array
for free and last one is an address of the array holding
pointers to be freed by the kfree_bulk() function.
To enable the trace event your kernel has to be build with
CONFIG_RCU_TRACE=y, after that it is possible to track the
events using ftrace subsystem.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit removes kfree_rcu() special-casing and the lazy-callback
handling from Tree RCU. It moves some of this special casing to Tiny RCU,
the removal of which will be the subject of later commits.
This results in a nice negative delta.
Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently PREEMPT_RCU and TREE_RCU are mutually exclusive Kconfig
options. But PREEMPT_RCU actually specifies a kind of TREE_RCU,
namely a preemptible TREE_RCU. This commit therefore makes PREEMPT_RCU
be a modifer to the TREE_RCU Kconfig option. This has the benefit of
simplifying several of the #if expressions that formerly needed to
check both, but now need only check one or the other.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This fixes a data-race where `atomic_t dynticks` is copied by value. The
copy is performed non-atomically, resulting in a data-race if `dynticks`
is updated concurrently.
This data-race was found with KCSAN:
==================================================================
BUG: KCSAN: data-race in dyntick_save_progress_counter / rcu_irq_enter
write to 0xffff989dbdbe98e0 of 4 bytes by task 10 on cpu 3:
atomic_add_return include/asm-generic/atomic-instrumented.h:78 [inline]
rcu_dynticks_snap kernel/rcu/tree.c:310 [inline]
dyntick_save_progress_counter+0x43/0x1b0 kernel/rcu/tree.c:984
force_qs_rnp+0x183/0x200 kernel/rcu/tree.c:2286
rcu_gp_fqs kernel/rcu/tree.c:1601 [inline]
rcu_gp_fqs_loop+0x71/0x880 kernel/rcu/tree.c:1653
rcu_gp_kthread+0x22c/0x3b0 kernel/rcu/tree.c:1799
kthread+0x1b5/0x200 kernel/kthread.c:255
<snip>
read to 0xffff989dbdbe98e0 of 4 bytes by task 154 on cpu 7:
rcu_nmi_enter_common kernel/rcu/tree.c:828 [inline]
rcu_irq_enter+0xda/0x240 kernel/rcu/tree.c:870
irq_enter+0x5/0x50 kernel/softirq.c:347
<snip>
Reported by Kernel Concurrency Sanitizer on:
CPU: 7 PID: 154 Comm: kworker/7:1H Not tainted 5.3.0+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: kblockd blk_mq_run_work_fn
==================================================================
Signed-off-by: Marco Elver <elver@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently the RCU callbacks for no-CBs CPUs are queued on a series of
ad-hoc linked lists, which means that these callbacks cannot benefit
from "drive-by" grace periods, thus suffering needless delays prior
to invocation. In addition, the no-CBs grace-period kthreads first
wait for callbacks to appear and later wait for a new grace period,
which means that callbacks appearing during a grace-period wait can
be delayed. These delays increase memory footprint, and could even
result in an out-of-memory condition.
This commit therefore enqueues RCU callbacks from no-CBs CPUs on the
rcu_segcblist structure that is already used by non-no-CBs CPUs. It also
restructures the no-CBs grace-period kthread to be checking for incoming
callbacks while waiting for grace periods. Also, instead of waiting
for a new grace period, it waits for the closest grace period that will
cause some of the callbacks to be safe to invoke. All of these changes
reduce callback latency and thus the number of outstanding callbacks,
in turn reducing the probability of an out-of-memory condition.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
The nocb_cb_wait() function traces a "FollowerSleep" trace_rcu_nocb_wake()
event, which never was documented and is now misleading. This commit
therefore changes "FollowerSleep" to "CBSleep", documents this, and
updates the documentation for "Sleep" as well.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
- Removing of non-DYNAMIC_FTRACE from 32bit x86
- Removing of mcount support from x86
- Emulating a call from int3 on x86_64, fixes live kernel patching
- Consolidated Tracing Error logs file
Minor updates:
- Removal of klp_check_compiler_support()
- kdb ftrace dumping output changes
- Accessing and creating ftrace instances from inside the kernel
- Clean up of #define if macro
- Introduction of TRACE_EVENT_NOP() to disable trace events based on config
options
And other minor fixes and clean ups
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXNxMZxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qq4PAP44kP6VbwL8CHyI2A3xuJ6Hwxd+2Z2r
ip66RtzyJ+2iCgEA2QCuWUlEt2bLpF9a8IQ4N9tWenSeW2i7gunPb+tioQw=
=RVQo
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"The major changes in this tracing update includes:
- Removal of non-DYNAMIC_FTRACE from 32bit x86
- Removal of mcount support from x86
- Emulating a call from int3 on x86_64, fixes live kernel patching
- Consolidated Tracing Error logs file
Minor updates:
- Removal of klp_check_compiler_support()
- kdb ftrace dumping output changes
- Accessing and creating ftrace instances from inside the kernel
- Clean up of #define if macro
- Introduction of TRACE_EVENT_NOP() to disable trace events based on
config options
And other minor fixes and clean ups"
* tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
x86: Hide the int3_emulate_call/jmp functions from UML
livepatch: Remove klp_check_compiler_support()
ftrace/x86: Remove mcount support
ftrace/x86_32: Remove support for non DYNAMIC_FTRACE
tracing: Simplify "if" macro code
tracing: Fix documentation about disabling options using trace_options
tracing: Replace kzalloc with kcalloc
tracing: Fix partial reading of trace event's id file
tracing: Allow RCU to run between postponed startup tests
tracing: Fix white space issues in parse_pred() function
tracing: Eliminate const char[] auto variables
ring-buffer: Fix mispelling of Calculate
tracing: probeevent: Fix to make the type of $comm string
tracing: probeevent: Do not accumulate on ret variable
tracing: uprobes: Re-enable $comm support for uprobe events
ftrace/x86_64: Emulate call function while updating in breakpoint handler
x86_64: Allow breakpoints to emulate call instructions
x86_64: Add gap to int3 to allow for call emulation
tracing: kdb: Allow ftdump to skip all but the last few entries
tracing: Add trace_total_entries() / trace_total_entries_cpu()
...
When CONFIG_RCU_TRACE is not set, all these tracepoints are defined as
do-nothing macro.
We'd better make those inline functions that take proper arguments.
As RCU_TRACE() is defined as do-nothing marco as well when
CONFIG_RCU_TRACE is not set, so we can clean it up.
Link: http://lkml.kernel.org/r/1553602391-11926-4-git-send-email-laoar.shao@gmail.com
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The ->rcu_qs_ctr counter was intended to allow providing a lightweight
report of a quiescent state to all RCU flavors. But now that there is
only one flavor of RCU in any one running kernel, there is no point in
having this feature. This commit therefore removes the ->rcu_qs_ctr
field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field
from the rcu_data structure. This results in the "rqc" option to the
rcu_fqs trace event no longer being used, so this commit also removes the
"rqc" description from the header comment.
While in the neighborhood, this commit also causes the forward-progress
request .rcu_need_heavy_qs be set one jiffies_till_sched_qs interval
later in the grace period than the first setting of .rcu_urgent_qs.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because rcu_barrier() is a one-line wrapper function for _rcu_barrier()
and because nothing else calls _rcu_barrier(), this commit inlines
_rcu_barrier() into rcu_barrier().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that quiescent states for newly offlined CPUs are reported either
when that CPU goes offline or at the end of grace-period initialization,
the CPU-hotplug failsafe in the force-quiescent-state code path is no
longer needed.
This commit therefore removes this failsafe.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The name 'c' is used for variables and parameters holding the requested
grace-period sequence number. However it is no longer very meaningful
given the conversions from ->gpnum and (especially) ->completed to
->gp_seq. This commit therefore renames 'c' to 'gp_seq_req'.
Previous patch discussion is at:
https://patchwork.kernel.org/patch/10396579/
Signed-off-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The old grace-period start code would acquire only the leaf's rcu_node
structure's ->lock if that structure believed that a grace period was
in progress. The new code advances to the leaf's parent in this case,
needlessly acquiring then leaf's parent's ->lock. This commit therefore
checks the grace-period state after marking the leaf with the need for
the specified grace period, and if the leaf believes that a grace period
is in progress, takes an early exit.
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Add "Startedleaf" tracing as suggested by Joel Fernandes. ]
This commit makes the rcu_quiescent_state_report tracepoint use ->gp_seq
instead of ->gpnum.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit makes the rcu_unlock_preempted_task tracepoint use ->gp_seq
instead of ->gpnum.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit makes the rcu_future_grace_period tracepoint use gp_seq
instead of ->gpnum and ->completed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit makes the rcu_grace_period tracepoint use gp_seq instead
of ->gpnum or ->completed. It also introduces a "cpuofl-bgp" string to
less obscurely indicate when a CPU has gone offline while a grace period
is waiting on it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reworking grace-period initiation and funnel locking added new
rcu_future_grace_period() trace events, so this commit updates the
rcu_future_grace_period() trace event's header comment accordingly.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
If a CPU is transitioning to or from offline state, an expedited
grace period may undergo a timed wait. This timed wait can unduly
delay grace periods, so this commit adds a trace statement to make
it visible.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit adds more tracing of expedited grace periods to enable
improved debugging of slowdowns.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The trace event rcu_nocb_wake is only used when CONFIG_RCU_NOCB_CPU is
defined. But the trace event is defined regardless. As defined trace
events take up memory, it is a waste to have it defined when not used.
Surround the trace event with an #ifdef to have it only defined when it
is used.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Commit c0f4dfd4f9 ("rcu: Make RCU_FAST_NO_HZ take advantage of
numbered callbacks") removed the only instances of trace_rcu_prep_idle,
but did not remove the TRACE_EVENT() that creates it. As defined trace
events take up memory within the kernel even when they are not used,
this is a waste of space. Remove the obsolete event.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The code that checks for non-idle non-nohz_idle-usermode tasks invoking
rcu_eqs_enter() and rcu_eqs_exit() prints a considerable quantity of
helpful information. However, these checks fire rarely, so the extra
complexity is no longer worth it. This commit therefore replaces this
debug code with simple WARN_ON_ONCE() statements.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because the ->dynticks_nesting field now only contains the process-based
nesting level instead of a value encoding both the process nesting level
and the irq "nesting" level, we no longer need a long long, even on
32-bit systems. This commit therefore changes both the ->dynticks_nesting
and ->dynticks_nmi_nesting fields to long.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>