mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-11-26 05:34:13 +08:00
dea499781a
42251 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Zheng Yejian
|
dea499781a |
tracing: Fix warning in trace_buffered_event_disable()
Warning happened in trace_buffered_event_disable() at
WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace:
? __warn+0xa5/0x1b0
? trace_buffered_event_disable+0x189/0x1b0
__ftrace_event_enable_disable+0x19e/0x3e0
free_probe_data+0x3b/0xa0
unregister_ftrace_function_probe_func+0x6b8/0x800
event_enable_func+0x2f0/0x3d0
ftrace_process_regex.isra.0+0x12d/0x1b0
ftrace_filter_write+0xe6/0x140
vfs_write+0x1c9/0x6f0
[...]
The cause of the warning is in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
Reproduction script show as below, for analysis, see the comments:
```
#!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was set;
# 2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
# 2. Enable the event registered, then:
# 1) SOFT_DISABLED_BIT was cleared;
# 2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
# set again!!!
cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was cleared again;
# 2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at
fist time soft-mode enabled, and call trace_buffered_event_disable() at
last time soft-mode disabled.
Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes:
|
||
Gaosheng Cui
|
6c95d71bad |
tracing: Fix kernel-doc warnings in trace_seq.c
Fix kernel-doc warning: kernel/trace/trace_seq.c:142: warning: Function parameter or member 'args' not described in 'trace_seq_vprintf' Link: https://lkml.kernel.org/r/20230724140827.1023266-5-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Gaosheng Cui
|
bd7217f30c |
tracing: Fix kernel-doc warnings in trace_events_trigger.c
Fix kernel-doc warnings: kernel/trace/trace_events_trigger.c:59: warning: Function parameter or member 'buffer' not described in 'event_triggers_call' kernel/trace/trace_events_trigger.c:59: warning: Function parameter or member 'event' not described in 'event_triggers_call' Link: https://lkml.kernel.org/r/20230724140827.1023266-4-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Gaosheng Cui
|
b32c789f7d |
tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c
Fix kernel-doc warning: kernel/trace/trace_events_synth.c:1257: warning: Function parameter or member 'mod' not described in 'synth_event_gen_cmd_array_start' Link: https://lkml.kernel.org/r/20230724140827.1023266-3-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Gaosheng Cui
|
151e34d1c6 |
ring-buffer: Fix kernel-doc warnings in ring_buffer.c
Fix kernel-doc warnings: kernel/trace/ring_buffer.c:954: warning: Function parameter or member 'cpu' not described in 'ring_buffer_wake_waiters' kernel/trace/ring_buffer.c:3383: warning: Excess function parameter 'event' description in 'ring_buffer_unlock_commit' kernel/trace/ring_buffer.c:5359: warning: Excess function parameter 'cpu' description in 'ring_buffer_reset_online_cpus' Link: https://lkml.kernel.org/r/20230724140827.1023266-2-cuigaosheng1@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Zheng Yejian
|
2d093282b0 |
ring-buffer: Fix wrong stat of cpu_buffer->read
When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set
to 0 in order to make sure any read iterators reset themselves. However,
this will mess 'entries' stating, see following steps:
# cd /sys/kernel/tracing/
# 1. Enlarge ring buffer prepare for later reducing:
# echo 20 > per_cpu/cpu0/buffer_size_kb
# 2. Write a log into ring buffer of cpu0:
# taskset -c 0 echo "hello1" > trace_marker
# 3. Read the log:
# cat per_cpu/cpu0/trace_pipe
<...>-332 [000] ..... 62.406844: tracing_mark_write: hello1
# 4. Stop reading and see the stats, now 0 entries, and 1 event readed:
# cat per_cpu/cpu0/stats
entries: 0
[...]
read events: 1
# 5. Reduce the ring buffer
# echo 7 > per_cpu/cpu0/buffer_size_kb
# 6. Now entries became unexpected 1 because actually no entries!!!
# cat per_cpu/cpu0/stats
entries: 1
[...]
read events: 0
To fix it, introduce 'page_removed' field to count total removed pages
since last reset, then use it to let read iterators reset themselves
instead of changing the 'read' pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <vnagarnaik@google.com>
Fixes:
|
||
Mohamed Khalfella
|
4b8b390516 |
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
Commit |
||
Chen Lin
|
8a96c0288d |
ring-buffer: Do not swap cpu_buffer during resize process
When ring_buffer_swap_cpu was called during resize process, the cpu buffer was swapped in the middle, resulting in incorrect state. Continuing to run in the wrong state will result in oops. This issue can be easily reproduced using the following two scripts: /tmp # cat test1.sh //#! /bin/sh for i in `seq 0 100000` do echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 done /tmp # cat test2.sh //#! /bin/sh for i in `seq 0 100000` do echo irqsoff > /sys/kernel/debug/tracing/current_tracer sleep 1 echo nop > /sys/kernel/debug/tracing/current_tracer sleep 1 done /tmp # ./test1.sh & /tmp # ./test2.sh & A typical oops log is as follows, sometimes with other different oops logs. [ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8 [ 231.713375] Modules linked in: [ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 231.716750] Hardware name: linux,dummy-virt (DT) [ 231.718152] Workqueue: events update_pages_handler [ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 231.721171] pc : rb_update_pages+0x378/0x3f8 [ 231.722212] lr : rb_update_pages+0x25c/0x3f8 [ 231.723248] sp : ffff800082b9bd50 [ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0 [ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a [ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000 [ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510 [ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002 [ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558 [ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001 [ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000 [ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208 [ 231.744196] Call trace: [ 231.744892] rb_update_pages+0x378/0x3f8 [ 231.745893] update_pages_handler+0x1c/0x38 [ 231.746893] process_one_work+0x1f0/0x468 [ 231.747852] worker_thread+0x54/0x410 [ 231.748737] kthread+0x124/0x138 [ 231.749549] ret_from_fork+0x10/0x20 [ 231.750434] ---[ end trace 0000000000000000 ]--- [ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 233.721696] Mem abort info: [ 233.721935] ESR = 0x0000000096000004 [ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits [ 233.722596] SET = 0, FnV = 0 [ 233.722805] EA = 0, S1PTW = 0 [ 233.723026] FSC = 0x04: level 0 translation fault [ 233.723458] Data abort info: [ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000 [ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 233.726720] Modules linked in: [ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 233.727777] Hardware name: linux,dummy-virt (DT) [ 233.728225] Workqueue: events update_pages_handler [ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 233.729054] pc : rb_update_pages+0x1a8/0x3f8 [ 233.729334] lr : rb_update_pages+0x154/0x3f8 [ 233.729592] sp : ffff800082b9bd50 [ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418 [ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003 [ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58 [ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001 [ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000 [ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c [ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0 [ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000 [ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000 [ 233.734418] Call trace: [ 233.734593] rb_update_pages+0x1a8/0x3f8 [ 233.734853] update_pages_handler+0x1c/0x38 [ 233.735148] process_one_work+0x1f0/0x468 [ 233.735525] worker_thread+0x54/0x410 [ 233.735852] kthread+0x124/0x138 [ 233.736064] ret_from_fork+0x10/0x20 [ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060) [ 233.736959] ---[ end trace 0000000000000000 ]--- After analysis, the seq of the error is as follows [1-5]: int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size, int cpu_id) { for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //1. get cpu_buffer, aka cpu_buffer(A) ... ... schedule_work_on(cpu, &cpu_buffer->update_pages_work); //2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to // update_pages_handler, do the update process, set 'update_done' in // complete(&cpu_buffer->update_done) and to wakeup resize process. //----> //3. Just at this moment, ring_buffer_swap_cpu is triggered, //cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer. //ring_buffer_swap_cpu is called as the 'Call trace' below. Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x18/0x28 dump_stack+0x12c/0x188 ring_buffer_swap_cpu+0x2f8/0x328 update_max_tr_single+0x180/0x210 check_critical_timing+0x2b4/0x2c8 tracer_hardirqs_on+0x1c0/0x200 trace_hardirqs_on+0xec/0x378 el0_svc_common+0x64/0x260 do_el0_svc+0x90/0xf8 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb8 el0_sync+0x180/0x1c0 //<---- /* wait for all the updates to complete */ for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //4. get cpu_buffer, cpu_buffer(B) is used in the following process, //the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong. //for example, cpu_buffer(A)->update_done will leave be set 1, and will //not 'wait_for_completion' at the next resize round. if (!cpu_buffer->nr_pages_to_update) continue; if (cpu_online(cpu)) wait_for_completion(&cpu_buffer->update_done); cpu_buffer->nr_pages_to_update = 0; } ... } //5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong, //Continuing to run in the wrong state, then oops occurs. Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
YueHaibing
|
1faf7e4a0b |
tracing: Remove unused extern declaration tracing_map_set_field_descr()
Since commit
|
||
Linus Torvalds
|
f61a89ca11 |
- Remove a cgroup from under a polling process properly
- Fix the idle sibling selection -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmS0N8AACgkQEsHwGGHe VUq4qhAAhLlRz7V0txbvj65xrQAuF7RiqWQlIm6NYX+eWb5wpqasy8mTpVr37Lhg JJIdSlI0sw3vIgnjgmuLU6e6MKO2pDWCqppv7CB4YjTT6/ifyIWvFotHTtfOBDjy eTV/wEpfDKOKHJDdWHRdUjiDgktZs1gJVV8e65/ViuWoCEV3ARy7tS4liORiwNpr RTuG6C3lDAfw6RKWCvBxDV15XhMDNYYQzN1bwedTAryP8jAFFUKce2de6HK6qyUF pNqzX0eGkN+6TG13uKLJIsAfdydWHWvkXWqFzVZsS7Upu4lyAyjslHK5UEF/MgCy 4c4/xaqcDTKn8dtQycQ3FMwujbamHdnct9or4aDsEa9PIIAcab/gKxojNPlBSkCt 0FpBjJUP2X/DBGpMaNX02fwQhhXwj0dN/4OWI5kXrL2sctqzueMSfTUKpjsLZMAx VTngoXVcVlLxqB2LqIzjhsWG+awlw/IKEKotjhX4vyra9L50CRbMLJo7hJdHD16J dmGQmsJRlGyQYEOgoyI7BU/Bqs/+iE1r5fAcVgjWWUPotl7XKDgcwTpELTPj5xzO b7ALkUXRcHf2NdfaLDZeyJzT/Z1mq1znPTsDqi94f1ASbmAd0iXLFNHtyJxXJabZ /aTLUj/GmmaCw2R02S4l67uC/BHChToloNVCGsq2KfBYcUunod0= =bSNb -----END PGP SIGNATURE----- Merge tag 'sched_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Remove a cgroup from under a polling process properly - Fix the idle sibling selection * tag 'sched_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/psi: use kernfs polling functions for PSI trigger polling sched/fair: Use recent_used_cpu to test p->cpus_ptr |
||
Linus Torvalds
|
6eede0686f |
hardening fixes for v6.5-rc2
- Remove LTO-only suffixes from promoted global function symbols (Yonghong Song) - Remove unused .text..refcount section from vmlinux.lds.h (Petr Pavlu) - Add missing __always_inline to sparc __arch_xchg() (Arnd Bergmann) - Claim maintainership of string routines -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmSzO7IWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJim9EACS3Jor73jTPAKKzbjiPdJXCZpW EesWC+a4nJ/hV4+pqIMUqpF/TRaJd1PZNJPMblI/wSGO4VcI6AywVLrX67a1wFyO pJUp3O8lXj7g5uA3cFmCMM1KKtQqZb/qZuAaP7BfnFaATjRNjejr0CHzcqqY45Jm fMo/it1k0zDUAHKx112T7v+lMFxGxzbysPYbXzU62zf74gPPmilk+ZnfEvUV3Qe7 sRjV1Zk0VXwqSx0ED4tPIpJEHCCy0uZpQdo/vhghjYxFecIkM9T8XaKyEY2rqv+a u4YaMc/31TXi33ajfCdi1WxbNrRB6WU2tCjPrMbzdoRvTi1zaxc8GSl1mcftmdJk /f5Pxq1Jcf7ZRC7Ujbag9pfncITHtMXWRb84boV0JdncxmdMUmyZGuUUhU+Ma8m4 yqdHMstaMvdtcn9FigVep2r0Fa+Eh8H+jNZuixdaReZHEYbjmrPM1P8zLtDYOSfk TZnnt2KrxeudgkqpNAFibWNkv9waKfx/oUU4ljwRbjjzQdUofvqyV15s9f5mcBLU 98EDHTZ/XbURrzbgLAui/oFHZNuuBqdjZCaHD/1SQoemPF6zmkY3Hv+8YYQitk7a i+VaSoukE/S1vIBCasCi+2qgOOCxH4orZ2Ewll9iw2VVO4x8o8UVHIPYI5JIT6F9 Xx1CeYrKfWDdDUXKVA== =HUoI -----END PGP SIGNATURE----- Merge tag 'hardening-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Remove LTO-only suffixes from promoted global function symbols (Yonghong Song) - Remove unused .text..refcount section from vmlinux.lds.h (Petr Pavlu) - Add missing __always_inline to sparc __arch_xchg() (Arnd Bergmann) - Claim maintainership of string routines * tag 'hardening-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: sparc: mark __arch_xchg() as __always_inline MAINTAINERS: Foolishly claim maintainership of string routines kallsyms: strip LTO-only suffixes from promoted global functions vmlinux.lds.h: Remove a reference to no longer used sections .text..refcount |
||
Linus Torvalds
|
4b4eef57e6 |
Probe fixes for 6.5-rc1, the 2nd set:
- fprobe: Add a comment why fprobe will be skipped if another kprobe is running in fprobe_kprobe_handler(). - probe-events: Fix some issues related to fetch-argument . Fix double counting of the string length for user-string and symstr. This will require longer buffer in the array case. . Fix not to count error code (minus value) for the total used length in array argument. This makes the total used length shorter. . Fix to update dynamic used data size counter only if fetcharg uses the dynamic size data. This may mis-count the used dynamic data size and corrupt data. . Revert "tracing: Add "(fault)" name injection to kernel probes" because that did not work correctly with a bug, and we agreed the current '(fault)' output (instead of '"(fault)"' like a string) explains what happened more clearly. . Fix to record 0-length (means fault access) data_loc data in fetch function itself, instead of store_trace_args(). If we record an array of string, this will fix to save fault access data on each entry of the array correctly. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmSxSlYACgkQ2/sHvwUr PxupyAgApFDi9YGsmrVbXmIN5y+yGMyio2H6xR7XkX+L02nvDY6uVqL/jgT8pHfI AeGZEA+EqwxIfWpYBfztsFej+Gl3Elfvu14OSxwaafUlW3mgZFQqw1ZR0HvzXoKJ 8Iw6WOXjhLe3/QLy43UY8JQGOKI07i3gh71wa0W0huOyiwwHuuVwPSY9QJJ2ulSg OWFSuMFO8IxYimp0BpFu/vrfa8CdgWLc24tgJ5EpZtzu6L0A2I/FMZjnBukxnP9s rjAXv0uRuSFvvF7/RGCqrLza12525qyHx7d5IWUq5shd3bCnaUOnAieF//MoJaR3 q8McDJK//EPbUvCWgESuuyPS05smyQ== =iumA -----END PGP SIGNATURE----- Merge tag 'probes-fixes-v6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probe fixes from Masami Hiramatsu: - fprobe: Add a comment why fprobe will be skipped if another kprobe is running in fprobe_kprobe_handler(). - probe-events: Fix some issues related to fetch-arguments: - Fix double counting of the string length for user-string and symstr. This will require longer buffer in the array case. - Fix not to count error code (minus value) for the total used length in array argument. This makes the total used length shorter. - Fix to update dynamic used data size counter only if fetcharg uses the dynamic size data. This may mis-count the used dynamic data size and corrupt data. - Revert "tracing: Add "(fault)" name injection to kernel probes" because that did not work correctly with a bug, and we agreed the current '(fault)' output (instead of '"(fault)"' like a string) explains what happened more clearly. - Fix to record 0-length (means fault access) data_loc data in fetch function itself, instead of store_trace_args(). If we record an array of string, this will fix to save fault access data on each entry of the array correctly. * tag 'probes-fixes-v6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails Revert "tracing: Add "(fault)" name injection to kernel probes" tracing/probes: Fix to update dynamic data counter if fetcharg uses it tracing/probes: Fix not to count error code to total length tracing/probes: Fix to avoid double count of the string length on the array fprobes: Add a comment why fprobe_kprobe_handler exits if kprobe is running |
||
Linus Torvalds
|
bde7f15027 |
Power management fixes for 6.5-rc2
- Unbreak the /sys/power/resume interface after recent changes (Azat Khuzhin). - Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai Yang). - Remove __init from cpufreq callbacks in the sparc driver, because they may be called after initialization too (Viresh Kumar). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmSxhIcSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxfdQP/1+MxGLh2XVVKvB9QTI0/ofYPxPfoTuf 5Sf2lOa7rNeauc2xnQFM6EMOeme4ckTUbrO7AkZVACYVqbKJ92IUBJfo3R2Ar+1Z 9TogwG+YOX3OjR8QtiGHLwA5fvOgbt89JaDn2ZCWcS+gHARZ3VMgdbDt+C3MUldV UVr/5kAkWefv39PIYHCwAJU7Xhn97B5nW58KgpkHuxOcHDKe0VfdxLcKBsyoc6QE IGMVV2WtnoyEdM1aNfZ37+3NksiIdZMA6OvM5C/7HOs6IqJaFxVUxm4333sM5AW9 y5LPYSBbedVxICdLkUUq8W5MDRNCTPXgC3gexEu0XtWdAV9AG+9aNeZytT7KGrLO xe4vbl6s1LnxC6YBw2bB+U/DbLtxQrAP1nYZj6yxhbHVsnTTZg7Qvevz7nAdPlOL 3FsutIT+9OQprWXxYBRv3AumF+hpG1bm8Zutyaqb2vwRwMbbXWTpzRry4ydp6bTj VB2YWeQOxCKl+dC5jXM1wfPjbsqWQvvGZVh1VIzlcZgzqALWf+F5fTMUKuY963kd V6fR1YmKg+Xxb+BU9mEjaUMfH5Yr8Mv9Gpf7D87MTsNTluFjAmRWk5a0ahVAXcwe n1jFxBNUsuJuvz2KwWrVZExT2xqJ8kVfnOxNcevtaXK+uk8+jE94lwjJn0P9v8w8 e+0QABkgUYY2 =b/om -----END PGP SIGNATURE----- Merge tag 'pm-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix hibernation (after recent changes), frequency QoS and the sparc cpufreq driver. Specifics: - Unbreak the /sys/power/resume interface after recent changes (Azat Khuzhin). - Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai Yang). - Remove __init from cpufreq callbacks in the sparc driver, because they may be called after initialization too (Viresh Kumar)" * tag 'pm-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: sparc: Don't mark cpufreq callbacks with __init PM: QoS: Restore support for default value on frequency QoS PM: hibernate: Fix writing maj:min to /sys/power/resume |
||
Rafael J. Wysocki
|
d121758da6 |
Merge branches 'pm-sleep' and 'pm-qos'
Merge a PM QoS fix and a hibernation fix for 6.5-rc2. - Unbreak the /sys/power/resume interface after recent changes (Azat Khuzhin). - Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai Yang). * pm-sleep: PM: hibernate: Fix writing maj:min to /sys/power/resume * pm-qos: PM: QoS: Restore support for default value on frequency QoS |
||
Masami Hiramatsu (Google)
|
797311bce5 |
tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails
Fix to record 0-length data to data_loc in fetch_store_string*() if it fails
to get the string data.
Currently those expect that the data_loc is updated by store_trace_args() if
it returns the error code. However, that does not work correctly if the
argument is an array of strings. In that case, store_trace_args() only clears
the first entry of the array (which may have no error) and leaves other
entries. So it should be cleared by fetch_store_string*() itself.
Also, 'dyndata' and 'maxlen' in store_trace_args() should be updated
only if it is used (ret > 0 and argument is a dynamic data.)
Link: https://lore.kernel.org/all/168908496683.123124.4761206188794205601.stgit@devnote2/
Fixes:
|
||
Linus Torvalds
|
b1983d427a |
Networking fixes for 6.5-rc2, including fixes from netfilter,
wireless and ebpf Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmSv1YISHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkpQgP/1msj0MlIWJnMgzPiMonDSe746JGTg/j YengEjqcy3ozC4COBEeyBO6ilt6I+Wrb5H5jimn9h2djB+D7htWNaejQaqJrBxph F4lUC6OJqd2ncI3tXAG2BSX1duzDr6B7yL7d5InFIczw8vNh+chsyX0sjlzU12bt ppjcSb+Ffc796DB0ItJkBqluxcpjyXE15ZWTTV4GEHK6RoRdxNIGjd7NgvD8podB Q/464bHs1jJYkAavuobiOXV2fuxWLTs77E0Vloizoo+42UiRFMLJk+RX98PhSIMa eejkxfm+H6+6Qi2omYepvf7vDN3GtLjxbr5C3mTdWPuL4QbNY8agVJ7sS4XnL5/v B7EAjyGQK9SmD36zTu7QL/Ul6fSnRq8jz20B0mDa0imAWzi58A+jqbQAMoVOMSS+ Uv4yKJpIUyx7mUI77+EX3U9r1wytw5eniatTDU+GAsQb2CJ43CqDmn/7RcmGacBo P1q+il9JW4kzUQrisUSxmQDfpBvQi5wiygiEdUNI5FEhq6/iKe/lrJnmJZpaLkd5 P3oEKjapamAmcyrEr/7VD1Mb4jrRfpB7zVn/5OyvywbcLQxA+531iPpy4r4W6cWH 1MRLBVVHKyb3jfm8J3T4lpDEzd03+MiPS8JiKMUYYNUYkY8tYp92muwC7z2sGI4M 6eR2MeKD4vds =cELX -----END PGP SIGNATURE----- Merge tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, wireless and ebpf. Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames" * tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) selftests: tc-testing: add test for qfq with stab overhead net/sched: sch_qfq: account for stab overhead in qfq_enqueue selftests: tc-testing: add tests for qfq mtu sanity check net/sched: sch_qfq: reintroduce lmax bound check for MTU wifi: cfg80211: fix receiving mesh packets without RFC1042 header wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set() net: txgbe: fix eeprom calculation error net/sched: make psched_mtu() RTNL-less safe net: ena: fix shift-out-of-bounds in exponential backoff netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() net/sched: flower: Ensure both minimum and maximum ports are specified MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER docs: netdev: update the URL of the status page wifi: iwlwifi: remove 'use_tfh' config to fix crash xdp: use trusted arguments in XDP hints kfuncs bpf: cpumap: Fix memory leak in cpu_map_update_elem wifi: airo: avoid uninitialized warning in airo_get_rate() octeontx2-pf: Add additional check for MCAM rules net: dsa: Removed unneeded of_node_put in felix_parse_ports_node net: fec: use netdev_err_once() instead of netdev_err() ... |
||
Linus Torvalds
|
ebc27aacee |
Tracing fixes and clean ups:
- Fix some missing-prototype warnings - Fix user events struct args (did not include size of struct) When creating a user event, the "struct" keyword is to denote that the size of the field will be passed in. But the parsing failed to handle this case. - Add selftest to struct sizes for user events - Fix sample code for direct trampolines. The sample code for direct trampolines attached to handle_mm_fault(). But the prototype changed and the direct trampoline sample code was not updated. Direct trampolines needs to have the arguments correct otherwise it can fail or crash the system. - Remove unused ftrace_regs_caller_ret() prototype. - Quiet false positive of FORTIFY_SOURCE Due to backward compatibility, the structure used to save stack traces in the kernel had a fixed size of 8. This structure is exported to user space via the tracing format file. A change was made to allow more than 8 functions to be recorded, and user space now uses the size field to know how many functions are actually in the stack. But the structure still has size of 8 (even though it points into the ring buffer that has the required amount allocated to hold a full stack. This was fine until the fortifier noticed that the memcpy(&entry->caller, stack, size) was greater than the 8 functions and would complain at runtime about it. Hide this by using a pointer to the stack location on the ring buffer instead of using the address of the entry structure caller field. - Fix a deadloop in reading trace_pipe that was caused by a mismatch between ring_buffer_empty() returning false which then asked to read the data, but the read code uses rb_num_of_entries() that returned zero, and causing a infinite "retry". - Fix a warning caused by not using all pages allocated to store ftrace functions, where this can happen if the linker inserts a bunch of "NULL" entries, causing the accounting of how many pages needed to be off. - Fix histogram synthetic event crashing when the start event is removed and the end event is still using a variable from it. - Fix memory leak in freeing iter->temp in tracing_release_pipe() -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZLBF6hQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qkswAP4mhdoFFfNosM7+Sh/R4t31IxKZApm9 M2Hf9jgvJ7b65AD/VV1XfO6skw2+5Yn9S4UyNE2MQaYxPwWpONcNFUzZ3Q8= =Nb+7 -----END PGP SIGNATURE----- Merge tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix some missing-prototype warnings - Fix user events struct args (did not include size of struct) When creating a user event, the "struct" keyword is to denote that the size of the field will be passed in. But the parsing failed to handle this case. - Add selftest to struct sizes for user events - Fix sample code for direct trampolines. The sample code for direct trampolines attached to handle_mm_fault(). But the prototype changed and the direct trampoline sample code was not updated. Direct trampolines needs to have the arguments correct otherwise it can fail or crash the system. - Remove unused ftrace_regs_caller_ret() prototype. - Quiet false positive of FORTIFY_SOURCE Due to backward compatibility, the structure used to save stack traces in the kernel had a fixed size of 8. This structure is exported to user space via the tracing format file. A change was made to allow more than 8 functions to be recorded, and user space now uses the size field to know how many functions are actually in the stack. But the structure still has size of 8 (even though it points into the ring buffer that has the required amount allocated to hold a full stack. This was fine until the fortifier noticed that the memcpy(&entry->caller, stack, size) was greater than the 8 functions and would complain at runtime about it. Hide this by using a pointer to the stack location on the ring buffer instead of using the address of the entry structure caller field. - Fix a deadloop in reading trace_pipe that was caused by a mismatch between ring_buffer_empty() returning false which then asked to read the data, but the read code uses rb_num_of_entries() that returned zero, and causing a infinite "retry". - Fix a warning caused by not using all pages allocated to store ftrace functions, where this can happen if the linker inserts a bunch of "NULL" entries, causing the accounting of how many pages needed to be off. - Fix histogram synthetic event crashing when the start event is removed and the end event is still using a variable from it - Fix memory leak in freeing iter->temp in tracing_release_pipe() * tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak of iter->temp when reading trace_pipe tracing/histograms: Add histograms to hist_vars if they have referenced variables tracing: Stop FORTIFY_SOURCE complaining about stack trace caller ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() ring-buffer: Fix deadloop issue on reading trace_pipe tracing: arm64: Avoid missing-prototype warnings selftests/user_events: Test struct size match cases tracing/user_events: Fix struct arg size match check x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret() arm64: ftrace: Add direct call trampoline samples support samples: ftrace: Save required argument registers in sample trampolines |
||
Masami Hiramatsu (Google)
|
4ed8f337de |
Revert "tracing: Add "(fault)" name injection to kernel probes"
This reverts commit |
||
Masami Hiramatsu (Google)
|
e38e2c6a9e |
tracing/probes: Fix to update dynamic data counter if fetcharg uses it
Fix to update dynamic data counter ('dyndata') and max length ('maxlen')
only if the fetcharg uses the dynamic data. Also get out arg->dynamic
from unlikely(). This makes dynamic data address wrong if
process_fetch_insn() returns error on !arg->dynamic case.
Link: https://lore.kernel.org/all/168908494781.123124.8160245359962103684.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/20230710233400.5aaf024e@gandalf.local.home/
Fixes:
|
||
Masami Hiramatsu (Google)
|
b41326b5e0 |
tracing/probes: Fix not to count error code to total length
Fix not to count the error code (which is minus value) to the total
used length of array, because it can mess up the return code of
process_fetch_insn_bottom(). Also clear the 'ret' value because it
will be used for calculating next data_loc entry.
Link: https://lore.kernel.org/all/168908493827.123124.2175257289106364229.stgit@devnote2/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/8819b154-2ba1-43c3-98a2-cbde20892023@moroto.mountain/
Fixes:
|
||
Masami Hiramatsu (Google)
|
66bcf65d6c |
tracing/probes: Fix to avoid double count of the string length on the array
If an array is specified with the ustring or symstr, the length of the
strings are accumlated on both of 'ret' and 'total', which means the
length is double counted.
Just set the length to the 'ret' value for avoiding double counting.
Link: https://lore.kernel.org/all/168908492917.123124.15076463491122036025.stgit@devnote2/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/8819b154-2ba1-43c3-98a2-cbde20892023@moroto.mountain/
Fixes:
|
||
Masami Hiramatsu (Google)
|
d5f28bb1ce |
fprobes: Add a comment why fprobe_kprobe_handler exits if kprobe is running
Add a comment the reason why fprobe_kprobe_handler() exits if any other kprobe is running. Link: https://lore.kernel.org/all/168874788299.159442.2485957441413653858.stgit@devnote2/ Suggested-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/all/20230706120916.3c6abf15@gandalf.local.home/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Zheng Yejian
|
d5a8218963 |
tracing: Fix memory leak of iter->temp when reading trace_pipe
kmemleak reports:
unreferenced object 0xffff88814d14e200 (size 256):
comm "cat", pid 336, jiffies 4294871818 (age 779.490s)
hex dump (first 32 bytes):
04 00 01 03 00 00 00 00 08 00 00 00 00 00 00 00 ................
0c d8 c8 9b ff ff ff ff 04 5a ca 9b ff ff ff ff .........Z......
backtrace:
[<ffffffff9bdff18f>] __kmalloc+0x4f/0x140
[<ffffffff9bc9238b>] trace_find_next_entry+0xbb/0x1d0
[<ffffffff9bc9caef>] trace_print_lat_context+0xaf/0x4e0
[<ffffffff9bc94490>] print_trace_line+0x3e0/0x950
[<ffffffff9bc95499>] tracing_read_pipe+0x2d9/0x5a0
[<ffffffff9bf03a43>] vfs_read+0x143/0x520
[<ffffffff9bf04c2d>] ksys_read+0xbd/0x160
[<ffffffff9d0f0edf>] do_syscall_64+0x3f/0x90
[<ffffffff9d2000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
when reading file 'trace_pipe', 'iter->temp' is allocated or relocated
in trace_find_next_entry() but not freed before 'trace_pipe' is closed.
To fix it, free 'iter->temp' in tracing_release_pipe().
Link: https://lore.kernel.org/linux-trace-kernel/20230713141435.1133021-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Mohamed Khalfella
|
6018b585e8 |
tracing/histograms: Add histograms to hist_vars if they have referenced variables
Hist triggers can have referenced variables without having direct
variables fields. This can be the case if referenced variables are added
for trigger actions. In this case the newly added references will not
have field variables. Not taking such referenced variables into
consideration can result in a bug where it would be possible to remove
hist trigger with variables being refenced. This will result in a bug
that is easily reproducable like so
$ cd /sys/kernel/tracing
$ echo 'synthetic_sys_enter char[] comm; long id' >> synthetic_events
$ echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
$ echo 'hist:keys=common_pid.execname,id.syscall:onmatch(raw_syscalls.sys_enter).synthetic_sys_enter($comm, id)' >> events/raw_syscalls/sys_enter/trigger
$ echo '!hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
[ 100.263533] ==================================================================
[ 100.264634] BUG: KASAN: slab-use-after-free in resolve_var_refs+0xc7/0x180
[ 100.265520] Read of size 8 at addr ffff88810375d0f0 by task bash/439
[ 100.266320]
[ 100.266533] CPU: 2 PID: 439 Comm: bash Not tainted 6.5.0-rc1 #4
[ 100.267277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
[ 100.268561] Call Trace:
[ 100.268902] <TASK>
[ 100.269189] dump_stack_lvl+0x4c/0x70
[ 100.269680] print_report+0xc5/0x600
[ 100.270165] ? resolve_var_refs+0xc7/0x180
[ 100.270697] ? kasan_complete_mode_report_info+0x80/0x1f0
[ 100.271389] ? resolve_var_refs+0xc7/0x180
[ 100.271913] kasan_report+0xbd/0x100
[ 100.272380] ? resolve_var_refs+0xc7/0x180
[ 100.272920] __asan_load8+0x71/0xa0
[ 100.273377] resolve_var_refs+0xc7/0x180
[ 100.273888] event_hist_trigger+0x749/0x860
[ 100.274505] ? kasan_save_stack+0x2a/0x50
[ 100.275024] ? kasan_set_track+0x29/0x40
[ 100.275536] ? __pfx_event_hist_trigger+0x10/0x10
[ 100.276138] ? ksys_write+0xd1/0x170
[ 100.276607] ? do_syscall_64+0x3c/0x90
[ 100.277099] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 100.277771] ? destroy_hist_data+0x446/0x470
[ 100.278324] ? event_hist_trigger_parse+0xa6c/0x3860
[ 100.278962] ? __pfx_event_hist_trigger_parse+0x10/0x10
[ 100.279627] ? __kasan_check_write+0x18/0x20
[ 100.280177] ? mutex_unlock+0x85/0xd0
[ 100.280660] ? __pfx_mutex_unlock+0x10/0x10
[ 100.281200] ? kfree+0x7b/0x120
[ 100.281619] ? ____kasan_slab_free+0x15d/0x1d0
[ 100.282197] ? event_trigger_write+0xac/0x100
[ 100.282764] ? __kasan_slab_free+0x16/0x20
[ 100.283293] ? __kmem_cache_free+0x153/0x2f0
[ 100.283844] ? sched_mm_cid_remote_clear+0xb1/0x250
[ 100.284550] ? __pfx_sched_mm_cid_remote_clear+0x10/0x10
[ 100.285221] ? event_trigger_write+0xbc/0x100
[ 100.285781] ? __kasan_check_read+0x15/0x20
[ 100.286321] ? __bitmap_weight+0x66/0xa0
[ 100.286833] ? _find_next_bit+0x46/0xe0
[ 100.287334] ? task_mm_cid_work+0x37f/0x450
[ 100.287872] event_triggers_call+0x84/0x150
[ 100.288408] trace_event_buffer_commit+0x339/0x430
[ 100.289073] ? ring_buffer_event_data+0x3f/0x60
[ 100.292189] trace_event_raw_event_sys_enter+0x8b/0xe0
[ 100.295434] syscall_trace_enter.constprop.0+0x18f/0x1b0
[ 100.298653] syscall_enter_from_user_mode+0x32/0x40
[ 100.301808] do_syscall_64+0x1a/0x90
[ 100.304748] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 100.307775] RIP: 0033:0x7f686c75c1cb
[ 100.310617] Code: 73 01 c3 48 8b 0d 65 3c 10 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 21 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 3c 10 00 f7 d8 64 89 01 48
[ 100.317847] RSP: 002b:00007ffc60137a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000021
[ 100.321200] RAX: ffffffffffffffda RBX: 000055f566469ea0 RCX: 00007f686c75c1cb
[ 100.324631] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 000000000000000a
[ 100.328104] RBP: 00007ffc60137ac0 R08: 00007f686c818460 R09: 000000000000000a
[ 100.331509] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
[ 100.334992] R13: 0000000000000007 R14: 000000000000000a R15: 0000000000000007
[ 100.338381] </TASK>
We hit the bug because when second hist trigger has was created
has_hist_vars() returned false because hist trigger did not have
variables. As a result of that save_hist_vars() was not called to add
the trigger to trace_array->hist_vars. Later on when we attempted to
remove the first histogram find_any_var_ref() failed to detect it is
being used because it did not find the second trigger in hist_vars list.
With this change we wait until trigger actions are created so we can take
into consideration if hist trigger has variable references. Also, now we
check the return value of save_hist_vars() and fail trigger creation if
save_hist_vars() fails.
Link: https://lore.kernel.org/linux-trace-kernel/20230712223021.636335-1-mkhalfella@purestorage.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Yonghong Song
|
8cc32a9bbf |
kallsyms: strip LTO-only suffixes from promoted global functions
Commit |
||
Steven Rostedt (Google)
|
bec3c25c24 |
tracing: Stop FORTIFY_SOURCE complaining about stack trace caller
The stack_trace event is an event created by the tracing subsystem to store stack traces. It originally just contained a hard coded array of 8 words to hold the stack, and a "size" to know how many entries are there. This is exported to user space as: name: kernel_stack ID: 4 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int size; offset:8; size:4; signed:1; field:unsigned long caller[8]; offset:16; size:64; signed:0; print fmt: "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n",i (void *)REC->caller[0], (void *)REC->caller[1], (void *)REC->caller[2], (void *)REC->caller[3], (void *)REC->caller[4], (void *)REC->caller[5], (void *)REC->caller[6], (void *)REC->caller[7] Where the user space tracers could parse the stack. The library was updated for this specific event to only look at the size, and not the array. But some older users still look at the array (note, the older code still checks to make sure the array fits inside the event that it read. That is, if only 4 words were saved, the parser would not read the fifth word because it will see that it was outside of the event size). This event was changed a while ago to be more dynamic, and would save a full stack even if it was greater than 8 words. It does this by simply allocating more ring buffer to hold the extra words. Then it copies in the stack via: memcpy(&entry->caller, fstack->calls, size); As the entry is struct stack_entry, that is created by a macro to both create the structure and export this to user space, it still had the caller field of entry defined as: unsigned long caller[8]. When the stack is greater than 8, the FORTIFY_SOURCE code notices that the amount being copied is greater than the source array and complains about it. It has no idea that the source is pointing to the ring buffer with the required allocation. To hide this from the FORTIFY_SOURCE logic, pointer arithmetic is used: ptr = ring_buffer_event_data(event); entry = ptr; ptr += offsetof(typeof(*entry), caller); memcpy(ptr, fstack->calls, size); Link: https://lore.kernel.org/all/20230612160748.4082850-1-svens@linux.ibm.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230712105235.5fc441aa@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reported-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Zheng Yejian
|
26efd79c46 |
ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
As comments in ftrace_process_locs(), there may be NULL pointers in mcount_loc section: > Some architecture linkers will pad between > the different mcount_loc sections of different > object files to satisfy alignments. > Skip any NULL pointers. After commit |
||
Linus Torvalds
|
9a3236ce48 |
Probes fixes and clean ups for v6.5-rc1:
- Fix fprobe's rethook release timing issue(1). Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Fix fprobe's rethook access timing issue(2). Stop rethook before ftrace_ops is unregistered so that the rethook is NOT keep using after exiting the unregister_fprobe(). - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. -----BEGIN PGP SIGNATURE----- iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmStawEbHG1hc2FtaS5o aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bMBkIAJYun4zeXsFeUUNVZMP8 UlcyBt/uiB1Ch/t1T1wc55plWIDAvUfN+FEltwhb6MJsQgWEjKJxNcH+oquQeqSH OkUvV6a8BR73FWbCt1Tm2MQKEG1RHC1R4JCj5GCzP93rQSCnmvr0c1yb6+JKQRx4 aPgWUjDm9vhYlOXS6heHo0hf0MbXDl1kqHjvMU2MXUMk7NtQ2JEo1Whikf7Drl6D ufNqV54GmtuJhgIWAqSk+qWresRvXy0/5i4ONK1Kmq+pdhssjZ4KFsWFQTIkQzdU nP8t2EfOp3aglx7ANvEJ+COfFNi0aMnHVBo7BbzsOy7Cq7IZTfD6zOTYSfylGGdA Uw4= =RBbc -----END PGP SIGNATURE----- Merge tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Fix fprobe's rethook release issues: - Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Stop rethook before ftrace_ops is unregistered so that the rethook is NOT used after exiting unregister_fprobe() - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. * tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() kernel: kprobes: Remove unnecessary ‘0’ values kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock kernel/trace: Fix cleanup logic of enable_trace_eprobe fprobe: Release rethook after the ftrace_ops is unregistered |
||
Zheng Yejian
|
7e42907f3a |
ring-buffer: Fix deadloop issue on reading trace_pipe
Soft lockup occurs when reading file 'trace_pipe':
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
[...]
RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
[...]
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__find_next_entry+0x1a8/0x4b0
? peek_next_entry+0x250/0x250
? down_write+0xa5/0x120
? down_write_killable+0x130/0x130
trace_find_next_entry_inc+0x3b/0x1d0
tracing_read_pipe+0x423/0xae0
? tracing_splice_read_pipe+0xcb0/0xcb0
vfs_read+0x16b/0x490
ksys_read+0x105/0x210
? __ia32_sys_pwrite64+0x200/0x200
? switch_fpu_return+0x108/0x220
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Through the vmcore, I found it's because in tracing_read_pipe(),
ring_buffer_empty_cpu() found some buffer is not empty but then it
cannot read anything due to "rb_num_of_entries() == 0" always true,
Then it infinitely loop the procedure due to user buffer not been
filled, see following code path:
tracing_read_pipe() {
... ...
waitagain:
tracing_wait_pipe() // 1. find non-empty buffer here
trace_find_next_entry_inc() // 2. loop here try to find an entry
__find_next_entry()
ring_buffer_empty_cpu(); // 3. find non-empty buffer
peek_next_entry() // 4. but peek always return NULL
ring_buffer_peek()
rb_buffer_peek()
rb_get_reader_page()
// 5. because rb_num_of_entries() == 0 always true here
// then return NULL
// 6. user buffer not been filled so goto 'waitgain'
// and eventually leads to an deadloop in kernel!!!
}
By some analyzing, I found that when resetting ringbuffer, the 'entries'
of its pages are not all cleared (see rb_reset_cpu()). Then when reducing
the ringbuffer, and if some reduced pages exist dirty 'entries' data, they
will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which
cause wrong 'overrun' count and eventually cause the deadloop issue.
To fix it, we need to clear every pages in rb_reset_cpu().
Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Arnd Bergmann
|
7d8b31b73c |
tracing: arm64: Avoid missing-prototype warnings
These are all tracing W=1 warnings in arm64 allmodconfig about missing prototypes: kernel/trace/trace_kprobe_selftest.c:7:5: error: no previous prototype for 'kprobe_trace_selftest_target' [-Werror=missing-pro totypes] kernel/trace/ftrace.c:329:5: error: no previous prototype for '__register_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:372:5: error: no previous prototype for '__unregister_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:4130:15: error: no previous prototype for 'arch_ftrace_match_adjust' [-Werror=missing-prototypes] kernel/trace/fgraph.c:243:15: error: no previous prototype for 'ftrace_return_to_handler' [-Werror=missing-prototypes] kernel/trace/fgraph.c:358:6: error: no previous prototype for 'ftrace_graph_sleep_time_control' [-Werror=missing-prototypes] arch/arm64/kernel/ftrace.c:460:6: error: no previous prototype for 'prepare_ftrace_return' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2172:5: error: no previous prototype for 'syscall_trace_enter' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2195:6: error: no previous prototype for 'syscall_trace_exit' [-Werror=missing-prototypes] Move the declarations to an appropriate header where they can be seen by the caller and callee, and make sure the headers are included where needed. Link: https://lore.kernel.org/linux-trace-kernel/20230517125215.930689-1-arnd@kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Florent Revest <revest@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [ Fixed ftrace_return_to_handler() to handle CONFIG_HAVE_FUNCTION_GRAPH_RETVAL case ] Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Pu Lehui
|
4369016497 |
bpf: cpumap: Fix memory leak in cpu_map_update_elem
Syzkaller reported a memory leak as follows:
BUG: memory leak
unreferenced object 0xff110001198ef748 (size 192):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 32 bytes):
00 00 00 00 4a 19 00 00 80 ad e3 e4 fe ff c0 00 ....J...........
00 b2 d3 0c 01 00 11 ff 28 f5 8e 19 01 00 11 ff ........(.......
backtrace:
[<ffffffffadd28087>] __cpu_map_entry_alloc+0xf7/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak
unreferenced object 0xff110001198ef528 (size 192):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffffadd281f0>] __cpu_map_entry_alloc+0x260/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak
unreferenced object 0xff1100010fd93d68 (size 8):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<ffffffffade5db3e>] kvmalloc_node+0x11e/0x170
[<ffffffffadd28280>] __cpu_map_entry_alloc+0x2f0/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
In the cpu_map_update_elem flow, when kthread_stop is called before
calling the threadfn of rcpu->kthread, since the KTHREAD_SHOULD_STOP bit
of kthread has been set by kthread_stop, the threadfn of rcpu->kthread
will never be executed, and rcpu->refcnt will never be 0, which will
lead to the allocated rcpu, rcpu->queue and rcpu->queue->queue cannot be
released.
Calling kthread_stop before executing kthread's threadfn will return
-EINTR. We can complete the release of memory resources in this state.
Fixes:
|
||
Chungkai Yang
|
3a8395b565 |
PM: QoS: Restore support for default value on frequency QoS
Commit |
||
Azat Khuzhin
|
c9e4bf607d |
PM: hibernate: Fix writing maj:min to /sys/power/resume
resume_store() first calls lookup_bdev() and after tries to handle
maj:min, but it does not reset the error before, hence if you will write
maj:min you will get ENOENT:
# echo 259:2 >| /sys/power/resume
bash: echo: write error: No such file or directory
This also should fix hiberation via systemd, since it uses this way.
Fixes:
|
||
Beau Belgrave
|
d0a3022f30 |
tracing/user_events: Fix struct arg size match check
When users register an event the name of the event and it's argument are
checked to ensure they match if the event already exists. Normally all
arguments are in the form of "type name", except for when the type
starts with "struct ". In those cases, the size of the struct is passed
in addition to the name, IE: "struct my_struct a 20" for an argument
that is of type "struct my_struct" with a field name of "a" and has the
size of 20 bytes.
The current code does not honor the above case properly when comparing
a match. This causes the event register to fail even when the same
string was used for events that contain a struct argument within them.
The example above "struct my_struct a 20" generates a match string of
"struct my_struct a" omitting the size field.
Add the struct size of the existing field when generating a comparison
string for a struct field to ensure proper match checking.
Link: https://lkml.kernel.org/r/20230629235049.581-2-beaub@linux.microsoft.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Masami Hiramatsu (Google)
|
195b9cb5b2 |
fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free()
Ensure running fprobe_exit_handler() has finished before calling rethook_free() in the unregister_fprobe() so that caller can free the fprobe right after unregister_fprobe(). unregister_fprobe() ensured that all running fprobe_entry/exit_handler() have finished by calling unregister_ftrace_function() which synchronizes RCU. But commit |
||
Li zeming
|
ed9492dfef |
kernel: kprobes: Remove unnecessary ‘0’ values
it is assigned first, so it does not need to initialize the assignment. Link: https://lore.kernel.org/all/20230711185353.3218-1-zeming@nfschina.com/ Signed-off-by: Li zeming <zeming@nfschina.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> |
||
Li zeming
|
e1164787f2 |
kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr
The 'correct_ret_addr' pointer is always set in the later code, no need to initialize it at definition time. Link: https://lore.kernel.org/all/20230704194359.3124-1-zeming@nfschina.com/ Signed-off-by: Li zeming <zeming@nfschina.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> |
||
Ze Gao
|
5f0c584daf |
fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock
Unlock ftrace recursion lock when fprobe_kprobe_handler() is failed
because of some running kprobe.
Link: https://lore.kernel.org/all/20230703092336.268371-1-zegao@tencent.com/
Fixes:
|
||
Tzvetomir Stoyanov (VMware)
|
cf0a624dc7 |
kernel/trace: Fix cleanup logic of enable_trace_eprobe
The enable_trace_eprobe() function enables all event probes, attached
to given trace probe. If an error occurs in enabling one of the event
probes, all others should be roll backed. There is a bug in that roll
back logic - instead of all event probes, only the failed one is
disabled.
Link: https://lore.kernel.org/all/20230703042853.1427493-1-tz.stoyanov@gmail.com/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes:
|
||
Suren Baghdasaryan
|
aff037078e |
sched/psi: use kernfs polling functions for PSI trigger polling
Destroying psi trigger in cgroup_file_release causes UAF issues when a cgroup is removed from under a polling process. This is happening because cgroup removal causes a call to cgroup_file_release while the actual file is still alive. Destroying the trigger at this point would also destroy its waitqueue head and if there is still a polling process on that file accessing the waitqueue, it will step on the freed pointer: do_select vfs_poll do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy wake_up_pollfree(&t->event_wait) // vfs_poll is unblocked synchronize_rcu kfree(t) poll_freewait -> UAF access to the trigger's waitqueue head Patch [1] fixed this issue for epoll() case using wake_up_pollfree(), however the same issue exists for synchronous poll() case. The root cause of this issue is that the lifecycles of the psi trigger's waitqueue and of the file associated with the trigger are different. Fix this by using kernfs_generic_poll function when polling on cgroup-specific psi triggers. It internally uses kernfs_open_node->poll waitqueue head with its lifecycle tied to the file's lifecycle. This also renders the fix in [1] obsolete, so revert it. [1] commit |
||
Miaohe Lin
|
ae2ad293d6 |
sched/fair: Use recent_used_cpu to test p->cpus_ptr
When checking whether a recently used CPU can be a potential idle
candidate, recent_used_cpu should be used to test p->cpus_ptr as
p->recent_used_cpu is not equal to recent_used_cpu and candidate
decision is made based on recent_used_cpu here.
Fixes:
|
||
Linus Torvalds
|
f71f64210d |
dma-mapping fixes for Linux 6.5
- swiotlb area sizing fixes (Petr Tesarik) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmSq57sLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOh0xAAkwklIxxzXvNlgjvy2hdgPWImPS8tGPDSIsqA9TDD WDZq89nz/ndnchdPObDvyJXmfBgqa0qCHqopBVPqMKv5a1pKZhrRXYlbajGQQwji MIqefTLZ/VGw7bDpEivOt+yadwQ3KxuxWWs7/JPKLLReSJ22H8P+jkrK7P7kBL5X YaMtZG9d86fvFHnQHKdAOlF1iCvnoZDHPcvaZbI6m5mMSZ+HIYqK5pP1MTUEAbIU MX4ZSI7/mL0q67+kZuM/NCSeq1pH0Cd0D2DGm+8k/y87G81GS6E5Wgg+xO7vYiXf 5ChuwlAO9K9KhH7NIRkKhkad/Ii89ENXSyv5gmPRoKYK5FXajnHSlJTUrZojV6XC Pbsd9ATVzV0rY61EPyh6G1a+Ttp/pwMp+W0I2fi032GVAePQ/vhB9x9O+2O/3QiC v80nUSatkciZncWqkguhp3NONsRmLKep3CCQnEAA/gLs27B0avdQeslnqbOOUQKd Si+djIoel8ONjQ+mW8eFCsVYMH1xFSo0aGsgGe0y2cyBE3DN1AW9eRnOXWT4C1PR UyDlx8ACs87ojec+YRQFYk2/PbsU7CQiH1pteXvBHcbhiVUAvrtXtg6ANQ+7066P IIduAZmlHcHb1BhyrSQbAtRllVLIp/l9IAkCSY9SvL0tjh/B5CaRBD5m2Taow5I/ KUI= =4Lfc -----END PGP SIGNATURE----- Merge tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool |
||
Linus Torvalds
|
a9943ad3dd |
- Optimize IRQ domain's name assignment
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSqa70ACgkQEsHwGGHe VUrTlhAAtfvA+mCeWVwn42827YlJeMvFV6RedERSk8x5CO/IJgHx2YFB0UXlznuM xQSIC7pkY2He62ggb+s6m+Pb1M3bxxJDaRSRnQGHOqXuSNuzokqjwvVMaSBE9s5W fgvdbZ4CDEynW7awiDq1PaMcJFzbc6RMbBDspNDke2ikZgGvqqWl3Gx2DHhGyNfQ UFJHETejxhfheU7Y9c06P2ToTyzc0RcaPFQfFmYIse+c6iTcjnqr9NRHCIklSygN F3VjSEXC1MkAZIAuV9pjkl3z01xfvDe1WqxH3jID/vM97AY08DqmBQC31VZOBmmV shndEUpc2yi7ued0WU1XLf1sMyEHm5rgF/19gRhFR9sxu4Zm9+JxOofo0URwqtsV Z3d0CxpDrF0ATJs2zTstUmOG0AD6SBfC+BQZi3M6rohXgR0z04dBnyTgCnxAs60u d5byFza5nZ/sX7KZXgVlKfH5ej0AcZuyFM/qMKqiTqzz5C4dv0Hq7y718aaecxmz W9A3olDGhC48pr9250TthEMhBzRdXvKgKQBUhC2TY/0hUBHyvmHeU//ql54XYhgh XVx3qI7yieZC3d27hZN0WTCG6qSCJoq8LROmGDIbx16Uj/cm1AHzJTJ+5CbCOafZ 41gPBoNCNYNXlgYNBXpVcPRtfcK98fGl3jpPOk0llaSa6IW+Lcc= =ATsc -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace() |
||
Suren Baghdasaryan
|
fb49c45532 |
fork: lock VMAs of the parent process when forking
When forking a child process, the parent write-protects anonymous pages
and COW-shares them with the child being forked using copy_present_pte().
We must not take any concurrent page faults on the source vma's as they
are being processed, as we expect both the vma and the pte's behind it
to be stable. For example, the anon_vma_fork() expects the parents
vma->anon_vma to not change during the vma copy.
A concurrent page fault on a page newly marked read-only by the page
copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the
source vma, defeating the anon_vma_clone() that wasn't done because the
parent vma originally didn't have an anon_vma, but we now might end up
copying a pte entry for a page that has one.
Before the per-vma lock based changes, the mmap_lock guaranteed
exclusion with concurrent page faults. But now we need to do a
vma_start_write() to make sure no concurrent faults happen on this vma
while it is being processed.
This fix can potentially regress some fork-heavy workloads. Kernel
build time did not show noticeable regression on a 56-core machine while
a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes:
|
||
Linus Torvalds
|
8066178f53 |
Tracing fixes for 6.5:
- Fix bad git merge of #endif in arm64 code A merge of the arm64 tree caused #endif to go into the wrong place - Fix crash on lseek of write access to tracefs/error_log Opening error_log as write only, and then doing an lseek() causes a kernel panic, because the lseek() handle expects a "seq_file" to exist (which is not done on write only opens). Use tracing_lseek() that tests for this instead of calling the default seq lseek handler. - Check for negative instead of -E2BIG for error on strscpy() returns Instead of testing for -E2BIG from strscpy(), to be more robust, check for less than zero, which will make sure it catches any error that strscpy() may someday return. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZKbQUhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qpWaAP9zQ1eLQSfMt0dHH01OBSJvc2mMd4QJ VZtWZ+xTSvk+4gD/axDzDS7Qisfrrli+1oQSPwVik2SXiz0SPJqJ25m9zw4= =xMlg -----END PGP SIGNATURE----- Merge tag 'trace-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix bad git merge of #endif in arm64 code A merge of the arm64 tree caused #endif to go into the wrong place - Fix crash on lseek of write access to tracefs/error_log Opening error_log as write only, and then doing an lseek() causes a kernel panic, because the lseek() handle expects a "seq_file" to exist (which is not done on write only opens). Use tracing_lseek() that tests for this instead of calling the default seq lseek handler. - Check for negative instead of -E2BIG for error on strscpy() returns Instead of testing for -E2BIG from strscpy(), to be more robust, check for less than zero, which will make sure it catches any error that strscpy() may someday return. * tag 'trace-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/boot: Test strscpy() against less than zero for error arm64: ftrace: fix build error with CONFIG_FUNCTION_GRAPH_TRACER=n tracing: Fix null pointer dereference in tracing_err_log_open() |
||
Linus Torvalds
|
7b82e90411 |
asm-generic updates for 6.5
These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmSl138ACgkQYKtH/8kJ UieqWxAA2WjNVfyuieYckglOVE0PZPs2fzCwyzTY5iUTH3gE5cBFWJDWcg2EnouG v3X3htEQcowYWaCF9+rypQXaGiSx4WXi2Bjxnz3D/BcreqWPI4eSQ0fpGG5SURTY 2zYF72GTt4JGR++l+7/R9MZwPbwYDT9BsD5tkel8PxnyVLM6/c5xFvbjzRSKFE8x SMN1jGZ62ITLNf/8coAOEPNxBYtDT6yQyu7P2sx5cd65LAQq9yLKjFklnBBovgWT OoCIZAdGkhcNwOh1LjyHcdNdpfNJGceKyqKPqty07IhCQuF2jxiyFYFzuBbeyQfE S0itN8o/MIfUmxaQl3e8dPAVb1RlNVr1zfQ6y4tUtWNdkNL2WwSnSQSRHrBfHxCQ QCF++PMeFcLhGwMYtqdNJ7XGLQ0PsjD74pRf0vo+vjmqDk2BJsJBP57VU+8MJn5r SoxqnJ0WxLvm1TfrNKusV7zMNWquc2duJDW40zsOssP4itjYELSI6qa56qmzlqmX zKmRx6mxAlx9RRK8FHXFYHbz3p93vv8z9vTOZV3AjIjjED960CLknUAwCC8FoJyz 9b5wyMXsLQHQjGt8luAvPc6OiU0EiU9a4SPK+feWcv27serFvnjJlRTS/yG2Z3zd BYsUgsXHypsdoud+aE7MeCy7fE8n3mhoyMQQRBkOMFJ7RsG6wAE= =S/he -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers" * tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: syscalls: Remove file path comments from headers tools arch: Remove uapi bitsperlong.h of hexagon and microblaze asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch m68k/mm: Make pfn accessors static inlines arm64: memory: Make virt_to_pfn() a static inline ARM: mm: Make virt_to_pfn() a static inline asm-generic/page.h: Make pfn accessors static inlines xen/netback: Pass (void *) to virt_to_page() netfs: Pass a pointer to virt_to_page() cifs: Pass a pointer to virt_to_page() in cifsglob cifs: Pass a pointer to virt_to_page() riscv: mm: init: Pass a pointer to virt_to_page() ARC: init: Pass a pointer to virt_to_pfn() in init m68k: Pass a pointer to virt_to_pfn() virt_to_page() fs/proc/kcore.c: Pass a pointer to virt_addr_valid() |
||
Kumar Kartikeya Dwivedi
|
5415ccd50a |
bpf: Fix max stack depth check for async callbacks
The check_max_stack_depth pass happens after the verifier's symbolic
execution, and attempts to walk the call graph of the BPF program,
ensuring that the stack usage stays within bounds for all possible call
chains. There are two cases to consider: bpf_pseudo_func and
bpf_pseudo_call. In the former case, the callback pointer is loaded into
a register, and is assumed that it is passed to some helper later which
calls it (however there is no way to be sure), but the check remains
conservative and accounts the stack usage anyway. For this particular
case, asynchronous callbacks are skipped as they execute asynchronously
when their corresponding event fires.
The case of bpf_pseudo_call is simpler and we know that the call is
definitely made, hence the stack depth of the subprog is accounted for.
However, the current check still skips an asynchronous callback even if
a bpf_pseudo_call was made for it. This is erroneous, as it will miss
accounting for the stack usage of the asynchronous callback, which can
be used to breach the maximum stack depth limit.
Fix this by only skipping asynchronous callbacks when the instruction is
not a pseudo call to the subprog.
Fixes:
|
||
Linus Torvalds
|
6843306689 |
Including fixes from bluetooth, bpf and wireguard.
Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmSlu+MACgkQMUZtbf5S Irslgw//S7jf/GL8V6y8VL3te+/OPOZnLDTzFFOdy64/y97FE6XIacJUpyWRhtmz oSzcSNHETPW9U+xSGa2ZQlKhAXt6n9iRNvUegql+VBb13Iz+l7AdTeoxRv/YuwDo 5lTOIB6cBw+ATd0oxS6wr8SyUlcvktUKBfTAItjbVM55aXfIUpXIa84+F7avJgIA XP1u/3PHhwItmwo/hXhHH0+P0QA8ix1q2SvRB7DAlQLBsTuQhaKjXWQkYYTKw/Nt dtvh8iQSs/YXaHMjTa5CK28HOD8+ywIizr+uJ9VaNqIzV0W5JE9IE8P4NFpBcY7t kGjTYODOph7dkNmZ5RLj3N+B6CyC57OXDzoo/tr8940UytCLVj9EVyduarLGLx57 edqK9cUz5kWejyGoyZ4Pvlo/SKvCQ2HKMeiAJ0/nNpTJMFuygMoqGsaD6ttzkXMj fZLPjRUK3axd+15ZzhLEf8HyL5Qh+qPqqX9p7NljfMKwhxMWJ5fuICJfdGOSdMJR ndL+wPfRPFQwszZ4pbTY2Ivn29mo8ScBOSOEgQs2mOny+zFzTzmqNWz/jcFfQnjS cylxBEHrgudT2FuCImZ/v66TM5yakHXqIdpTGG+zsvJWQqjM96Z3I7WRvi0g9d75 n84il+j34mnzl90j2xEutqUiK7BQ9ZpZBsutPVTKBIHKWWiortI= =9yzk -----END PGP SIGNATURE----- Merge tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf and wireguard. Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind" * tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) nfp: clean mc addresses in application firmware when closing port selftests: mptcp: pm_nl_ctl: fix 32-bit support selftests: mptcp: depend on SYN_COOKIES selftests: mptcp: userspace_pm: report errors with 'remove' tests selftests: mptcp: userspace_pm: use correct server port selftests: mptcp: sockopt: return error if wrong mark selftests: mptcp: sockopt: use 'iptables-legacy' if available selftests: mptcp: connect: fail if nft supposed to work mptcp: do not rely on implicit state check in mptcp_listen() mptcp: ensure subflow is unhashed before cleaning the backlog s390/qeth: Fix vipa deletion octeontx-af: fix hardware timestamp configuration net: dsa: sja1105: always enable the send_meta options net: dsa: tag_sja1105: fix MAC DA patching from meta frames net: Replace strlcpy with strscpy pptp: Fix fib lookup calls. mlxsw: spectrum_router: Fix an IS_ERR() vs NULL check net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX xsk: Honor SO_BINDTODEVICE on bind ptp: Make max_phase_adjustment sysfs device attribute invisible when not supported ... |
||
Steven Rostedt (Google)
|
fddca7db4a |
tracing/boot: Test strscpy() against less than zero for error
Instead of checking for -E2BIG, it is better to just check for less than zero of strscpy() for error. Testing for -E2BIG is not very robust, and the calling code does not really care about the error code, just that there was an error. One of the updates to convert strlcpy() to strscpy() had a v2 version that changed the test from testing against -E2BIG to less than zero, but I took the v1 version that still tested for -E2BIG. Link: https://lore.kernel.org/linux-trace-kernel/20230615180420.400769-1-azeemshaikh38@gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230704100807.707d1605@rorschach.local.home Cc: Mark Rutland <mark.rutland@arm.com> Cc: Azeem Shaikh <azeemshaikh38@gmail.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Mateusz Stachyra
|
02b0095e2f |
tracing: Fix null pointer dereference in tracing_err_log_open()
Fix an issue in function 'tracing_err_log_open'.
The function doesn't call 'seq_open' if the file is opened only with
write permissions, which results in 'file->private_data' being left as null.
If we then use 'lseek' on that opened file, 'seq_lseek' dereferences
'file->private_data' in 'mutex_lock(&m->lock)', resulting in a kernel panic.
Writing to this node requires root privileges, therefore this bug
has very little security impact.
Tracefs node: /sys/kernel/tracing/error_log
Example Kernel panic:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
Call trace:
mutex_lock+0x30/0x110
seq_lseek+0x34/0xb8
__arm64_sys_lseek+0x6c/0xb8
invoke_syscall+0x58/0x13c
el0_svc_common+0xc4/0x10c
do_el0_svc+0x24/0x98
el0_svc+0x24/0x88
el0t_64_sync_handler+0x84/0xe4
el0t_64_sync+0x1b4/0x1b8
Code: d503201f aa0803e0 aa1f03e1 aa0103e9 (c8e97d02)
---[ end trace 561d1b49c12cf8a5 ]---
Kernel panic - not syncing: Oops: Fatal exception
Link: https://lore.kernel.org/linux-trace-kernel/20230703155237eucms1p4dfb6a19caa14c79eb6c823d127b39024@eucms1p4
Link: https://lore.kernel.org/linux-trace-kernel/20230704102706eucms1p30d7ecdcc287f46ad67679fc8491b2e0f@eucms1p3
Cc: stable@vger.kernel.org
Fixes:
|