Currently left_margin does not match its documentation:
...
/* Return the size of the left margin space, this is the space used to
display things like breakpoint markers. */
int left_margin () const
{ return box_width () + TUI_EXECINFO_SIZE + extra_margin (); }
...
It is stated that the left margin is reserved to display things, but
the box_width is not used for that.
Fix this by dropping box_width () from the left_margin calculation.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
In tui_source_window::set_contents we have:
...
/* Take hilite (window border) into account, when
calculating the number of lines. */
int nlines = height - 2;
...
The '2' represents the total size of the window border (or box, in can_box
terms), in this case one line at the top and one line at the bottom.
Likewise, '1' is used to represent the width of the window border.
Introduce new functions:
- tui_win_info::box_width () and
- tui_win_info::box_size ()
that can be used instead instead of these hardcoded constants.
Implement these such that they return 0 when can_box () == false.
Tested patch completeness by making all windows unboxed:
...
@@ -85,7 +85,7 @@ struct tui_win_info
/* Return true if this window can be boxed. */
virtual bool can_box () const
{
- return true;
+ return false;
}
int box_width () const
...
and test-driving TUI.
This required eliminating an assert in
tui_source_window_base::show_source_content, I've included that part as well.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
Currently the call to prefresh in tui_source_window_base::refresh_window looks
like:
...
prefresh (m_pad.get (), 0, pad_x, y + 1, x + left_margin,
y + m_content.size (), x + left_margin + view_width - 1);
...
This is hard to parse. It's not obvious what the arguments mean, and there's
repetition in the argument calculation.
Fix this by rewriting the call as follows:
- use sminrow, smincol, smaxrow and smaxcol variables for the last
4 arguments, and
- calculate the smaxrow and smaxcol variables based on the sminrow and
smincol variables.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
The original intention of the test appears to be checking to make sure
setting a breakpoint in an inlined function didn't set multiple
breakpoints where one of them was at address 0.
The gdb.ada/inline-section-gc.exp test may pass or fail depending on the
version of gnat. Per the discussion on IRC, the ada inlining appears to
have some target dependencies. In this test there are two functions,
callee and caller. Function calee is inlined into caller. The test sets
a breakpoint in function callee. The reported location where the
breakpoint is set may be at the requested location in callee or the
location in caller after callee has been inlined. The test needs to
accept either location as correct provided the breakpoint address is not
zero.
This patch checks to see if the reported breakpoint is in function callee
or function caller and fails if the breakpoint address is 0x0. The line
number where the breakpoint is set will match the requested line if the
breakpoint location is reported is callee.adb. If the breakpoint is
reported in caller.adb, the line number in caller is the breakpoint
location in callee where it is inlined into caller.
This patch fixes the single regression failure for the test on PowerPC.
It does not introduce any failures on X86-64.
If your target has no support for TARGET_WAITKIND_NO_RESUMED events
(and no way to support them, such as the yet-unsubmitted AMDGPU
target), and you step over thread exit with scheduler-locking on, this
is what you get:
(gdb) n
[Thread ... exited]
*hang*
Getting back the prompt by typing Ctrl-C may not even work, since no
inferior thread is running to receive the SIGINT. Even if it works,
it seems unnecessarily harsh. If you started an execution command for
which there's a clear thread of interest (step, next, until, etc.),
and that thread disappears, then I think it's more user friendly if
GDB just detects the situation and aborts the command, giving back the
prompt.
That is what this commit implements. It does this by explicitly
requesting the target to report thread exit events whenever the main
resumed thread has a thread_fsm. Note that unlike stepping over a
breakpoint, we don't need to enable clone events in this case.
With this patch, we get:
(gdb) n
[Thread 0x7ffff7d89700 (LWP 3961883) exited]
Command aborted, thread exited.
(gdb)
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I901ab64c91d10830590b2dac217b5264635a2b95
This commit documents in both manual and NEWS:
- the new remote clone event stop reply,
- the new QThreadOptions packet and its current defined options,
- the associated "set/show remote thread-events-packet" command,
- and the associated QThreadOptions qSupported feature.
Approved-By: Eli Zaretskii <eliz@gnu.org>
Change-Id: Ic1c8de1fefba95729bbd242969284265de42427e
Add new gdb.threads/step-over-thread-exit.exp and
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
testcases, exercising stepping over thread exit syscall. These make
use of lib/my-syscalls.S to define the exit syscall.
Co-authored-by: Pedro Alves <pedro@palves.net>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: Ie8b2c5747db99b7023463a897a8390d9e814a9c9
Refactor the syscall assembly code in gdb/testsuite/lib/my-syscalls.S
behind a SYSCALL macro so that it's easy to add new syscalls without
duplicating code.
Note that the way the macro is implemented, it only works correctly
for syscalls with up to 3 arguments, and, if the syscall doesn't
return (the macro doesn't bother to save/restore callee-saved
registers).
The following patch will want to use the macro to define a wrapper for
the "exit" syscall, so the limitations continue to be sufficient.
Change-Id: I8acf1463b11a084d6b4579aaffb49b5d0dea3bba
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
If scheduler-locking is in effect, e.g., with "set scheduler-locking
on", and you step over a function that spawns a new thread, the new
thread is allowed to run free, at least until some event is hit, at
which point, whether the new thread is re-resumed depends on a number
of seemingly random factors. E.g., if the target is all-stop, and the
parent thread hits a breakpoint, and GDB decides the breakpoint isn't
interesting to report to the user, then the parent thread is resumed,
but the new thread is left stopped.
I think that letting the new threads run with scheduler-locking
enabled is a defect. This commit fixes that, making use of the new
clone events on Linux, and of target_thread_events() on targets where
new threads have no connection to the thread that spawned them.
Testcase and documentation changes included.
Approved-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie12140138b37534b7fc1d904da34f0f174aa11ce
Running the
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
added later in the series against gdbserver, after the
TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
into an infinite loop in stop_all_threads, leading to a timeout:
FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)
The is really a latent bug, and it is about the fact that
stop_all_threads stops listening to events from a target as soon as it
sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows
how to handle delayed no-resumed events, but stop_all_threads was
never taught to.
In more detail, here's what happens with that testcase:
#1 - Multiple threads report breakpoint hits to gdb.
#2 - gdb picks one events, and it's for thread 1. All other stops are
left pending. thread 1 needs to move past a breakpoint, so gdb
stops all threads to start an inline step over for thread 1.
While stopping threads, some of the threads that were still
running report events that are also left pending.
#2 - gdb steps thread 1
#3 - Thread 1 exits while stepping (it steps over an exit syscall),
gdbserver reports thread exit for thread 1
#4 - Thread 1 was the last resumed thread, so gdbserver also reports
no-resumed:
[remote] Notification received: Stop:w0;p3445d0.3445d3
[remote] Sending packet: $vStopped#55
[remote] Packet received: N
[remote] Sending packet: $vStopped#55
[remote] Packet received: OK
#5 - gdb processes the thread exit for thread 1, finishes the step
over and restarts threads.
#6 - gdb picks the next event to process out of one of the resumed
threads with pending events:
[infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11#7 - This is again a breakpoint hit and the breakpoint needs to be
stepped over too, so gdb starts a step-over dance again.
#8 - We reach stop_all_threads, which finds that some threads need to
be stopped.
#9 - wait_one finally consumes the no-resumed event queue by #4.
Seeing this, wait_one disable target async, to stop listening for
events out of the remote target.
#10 - We still haven't seen all the stops expected, so
stop_all_threads tries another iteration.
#11 - Because the remote target is no longer async, and there are no
other targets, wait_one return no-resumed immediately without
polling the remote target.
#12 - We still haven't seen all the stops expected, so
stop_all_threads tries another iteration. goto #11, looping
forever.
Fix this by explicitly enabling/re-enabling target async on targets
that can async, before waiting for stops.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f
GDB doesn't handle correctly the case where a thread steps over a
breakpoint (using either in-line or displaced stepping), and the
executed instruction causes the thread to exit.
Using the test program included later in the series, this is what it
looks like with displaced-stepping, on x86-64 Linux, where we have two
displaced-step buffers:
$ ./gdb -q -nx --data-directory=data-directory build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit -ex "b my_exit_syscall" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2915510)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915510)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915524)]
[Thread 0x7ffff7c5f640 (LWP 2915510) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915524)]
Thread 3 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[New Thread 0x7ffff7c5f640 (LWP 2915616)]
[Thread 0x7ffff7c5f640 (LWP 2915524) exited]
[Switching to Thread 0x7ffff7c5f640 (LWP 2915616)]
Thread 4 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
... hangs ...
The first two times we do "continue", we displaced-step the syscall
instruction that causes the thread to exit. When the thread exits,
the main thread, waiting on pthread_join, is unblocked. It spawns a
new thread, which hits the breakpoint on the syscall again. However,
infrun was never notified that the displaced-stepping threads are done
using the displaced-step buffer, so now both buffers are marked as
used. So when we do the third continue, there are no buffers
available to displaced-step the syscall, so the thread waits forever
for its turn.
When trying the same but with in-line step over (displaced-stepping
disabled):
$ ./gdb -q -nx --data-directory=data-directory \
build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit \
-ex "b my_exit_syscall" -ex "set displaced-stepping off" -ex r
Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
[New Thread 0x7ffff7c5f640 (LWP 2928290)]
[Switching to Thread 0x7ffff7c5f640 (LWP 2928290)]
Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
68 syscall
(gdb) c
Continuing.
[Thread 0x7ffff7c5f640 (LWP 2928290) exited]
No unwaited-for children left.
(gdb) i th
Id Target Id Frame
1 Thread 0x7ffff7c60740 (LWP 2928285) "step-over-threa" 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
The current thread <Thread ID 2> has terminated. See `help thread'.
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7c60740 (LWP 2928285))]
#0 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
(gdb) c
Continuing.
^C^C
... hangs ...
The "continue" causes an in-line step to occur, meaning the main
thread is stopped while we step the syscall. The stepped thread exits
when executing the syscall, the linux-nat target notices there are no
more resumed threads to be waited for, so returns
TARGET_WAITKIND_NO_RESUMED, which causes the prompt to return. But
infrun never clears the in-line step over info. So if we try
continuing the main thread, GDB doesn't resume it, because it thinks
there's an in-line step in progress that we need to wait for to
finish, and we are stuck there.
To fix this, infrun needs to be informed when a thread doing a
displaced or in-line step over exits. We can do that with the new
target_set_thread_options mechanism which is optimal for only enabling
exit events of the thread that needs it; or, if that is not supported,
by using target_thread_events, which enables thread exit events for
all threads. This is done by this commit.
This patch then modifies handle_inferior_event in infrun.c to clean up
any step-over the exiting thread might have been doing at the time of
the exit. The cases to consider are:
- the exiting thread was doing an in-line step-over with an all-stop
target
- the exiting thread was doing an in-line step-over with a non-stop
target
- the exiting thread was doing a displaced step-over with a non-stop
target
The displaced-stepping buffer implementation in displaced-stepping.c
is modified to account for the fact that it's possible that we
"finish" a displaced step after a thread exit event. The buffer that
the exiting thread was using is marked as available again and the
original instructions under the scratch pad are restored. However, it
skips applying the fixup, which wouldn't make sense since the thread
does not exist anymore.
Another case that needs handling is if a displaced-stepping thread
exits, and the event is reported while we are in stop_all_threads. We
should call displaced_step_finish in the handle_one function, in that
case. It was already called in other code paths, just not the "thread
exited" path.
This commit doesn't make infrun ask the target to report the
TARGET_WAITKIND_THREAD_EXITED events yet, that'll be done later in the
series.
Note that "stop_print_frame = false;" line is moved to normal_stop,
because TARGET_WAITKIND_THREAD_EXITED can also end up with the event
transmorphed into TARGET_WAITKIND_NO_RESUMED. Moving it to
normal_stop keeps it centralized.
Co-authored-by: Simon Marchi <simon.marchi@efficios.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I745c6955d7ef90beb83bcf0ff1d1ac8b9b6285a5
This implements support for the new GDB_THREAD_OPTION_EXIT thread
option for native Linux.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ia69fc0b9b96f9af7de7cefc1ddb1fba9bbb0bb90
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
When stepping over a breakpoint with displaced stepping, GDB needs to
be informed if the stepped thread exits, otherwise the displaced
stepping buffer that was allocated to that thread leaks, and this can
result in deadlock, with other threads waiting for their turn to
displaced step, but their turn never comes.
Similarly, when stepping over a breakpoint in line, GDB also needs to
be informed if the stepped thread exits, so that is can clear the step
over state and re-resume threads.
This commit makes it possible for GDB to ask the target to report
thread exit events for a given thread, using the new "thread options"
mechanism introduced by a previous patch.
This only adds the core bits. Following patches in the series will
teach the Linux backends (native & gdbserver) to handle the
GDB_THREAD_OPTION_EXIT option, and then a later patch will make use of
these thread exit events to clean up displaced stepping and inline
stepping state properly.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I96b719fdf7fee94709e98bb3a90751d8134f3a38
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Currently, infrun assumes that when TARGET_WAITKIND_THREAD_EXITED is
reported, the corresponding GDB thread has already been removed from
the GDB thread list.
Later in the series, that will no longer work, as infrun will need to
refer to the thread's thread_info when it processes
TARGET_WAITKIND_THREAD_EXITED.
As preparation, this patch makes deleting the GDB thread
responsibility of infrun, instead of the target.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I013d87f61ffc9aaca49f0d6ce2a43e3ea69274de
Currently, GDB does not understand the THREAD_EXITED stop reply in
remote all-stop mode. There's no good reason for this, it just
happened that THREAD_EXITED was only ever reported in non-stop mode so
far. This patch teaches GDB to parse that event in all-stop RSP too.
There is no need to add a qSupported feature for this, because the
server won't send a THREAD_EXITED event unless GDB explicitly asks for
it, with QThreadEvents, or with the GDB_THREAD_OPTION_EXIT
QThreadOptions option added in the next patch.
Change-Id: Ide5d12391adf432779fe4c79526801c4a5630966
Now that gdb/19675 is fixed for both native and gdbserver GNU/Linux,
remove the gdb/19675 kfails.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I95c1c38ca370100675d303cd3c8995860bef465d
This patch teaches the Linux GDBserver backend to report clone events
to GDB, when GDB has requested them with the GDB_THREAD_OPTION_CLONE
thread option, via the new QThreadOptions packet.
This shuffles code in linux_process_target::handle_extended_wait
around to a more logical order when we now have to handle and
potentially report all of fork/vfork/clone.
Raname lwp_info::fork_relative -> lwp_info::relative as the field is
no longer only about (v)fork.
With this, gdb.threads/stepi-over-clone.exp now cleanly passes against
GDBserver, so remove the native-target-only requirement from that
testcase.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I3a19bc98801ec31e5c6fdbe1ebe17df855142bb2
This commit teaches the native Linux target about the
GDB_THREAD_OPTION_CLONE thread option. It's actually simpler to just
continue reporting all clone events unconditionally to the core.
There's never any harm in reporting a clone event when the option is
disabled. All we need to do is to report support for the option,
otherwise GDB falls back to use target_thread_events().
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: If90316e2dcd0c61d0fefa0d463c046011698acf9
A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED
event kind, and made the Linux target report clone events.
A following patch will teach Linux GDBserver to do the same thing.
However, for remote debugging, it wouldn't be ideal for GDBserver to
report every clone event to GDB, when GDB only cares about such events
in some specific situations. Reporting clone events all the time
would be potentially chatty. We don't enable thread create/exit
events all the time for the same reason. Instead we have the
QThreadEvents packet. QThreadEvents is target-wide, though.
This patch makes GDB instead explicitly request that the target
reports clone events or not, on a per-thread basis.
In order to be able to do that with GDBserver, we need a new remote
protocol feature. Since a following patch will want to enable thread
exit events on per-thread basis too, the packet introduced here is
more generic than just for clone events. It lets you enable/disable a
set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS.
IOW, this commit introduces a new QThreadOptions packet, that lets you
specify a set of per-thread event options you want to enable. The
packet accepts a list of options/thread-id pairs, similarly to vCont,
processed left to right, with the options field being a number
interpreted as a bit mask of options. The only option defined in this
commit is GDB_THREAD_OPTION_CLONE (0x1), which ask the remote target
to report clone events. Another patch later in the series will
introduce another option.
For example, this packet sets option "1" (clone events) on thread
p1000.2345:
QThreadOptions;1:p1000.2345
and this clears options for all threads of process 1000, and then sets
option "1" (clone events) on thread p1000.2345:
QThreadOptions;0:p1000.-1;1:p1000.2345
This clears options of all threads of all processes:
QThreadOptions;0
The target reports the set of supported options by including
"QThreadOptions=<supported options>" in its qSupported response.
infrun is then tweaked to enable GDB_THREAD_OPTION_CLONE when stepping
over a breakpoint.
Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit
their parent's thread options. This is so that GDB can send e.g.,
"QThreadOptions;0;1:TID" without worrying about threads it doesn't
know about yet.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie41e5093b2573f14cf6ac41b0b5804eba75be37e
The previous patch taught GDB about a new
TARGET_WAITKIND_THREAD_CLONED event kind, and made the Linux target
report clone events.
A following patch will teach Linux GDBserver to do the same thing.
But before we get there, we need to teach the remote protocol about
TARGET_WAITKIND_THREAD_CLONED. That's what this patch does. Clone is
very similar to vfork and fork, and the new stop reply is likewise
handled similarly. The stub reports "T05clone:...".
GDBserver core is taught to handle TARGET_WAITKIND_THREAD_CLONED and
forward it to GDB in this patch, but no backend actually emits it yet.
That will be done in a following patch.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: If271f20320d864f074d8ac0d531cc1a323da847f
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too. The testcase is mostly Andrew's too.)
This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction, one is
about displaced stepping, and the other about in-line stepping.
Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running. With 'continue' this makes sense
(assuming no schedlock):
- all-stop mode, user issues 'continue', all threads are set running,
a newly created thread should also be set running.
- non-stop mode, user issues 'continue', other pre-existing threads
are not affected, but as the new thread is (sort-of) a child of the
thread the user asked to run, it makes sense that the new threads
should be created in the running state.
Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:
- all-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). While stepping the thread
of interest all the other threads will be allowed to continue. A
newly created thread will be set running, and then stopped once the
thread of interest has completed its step.
- non-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). Other threads might be
running or stopped, but as with the continue case above, the new
thread will be created running. The only possible issue here is
that the new thread will be left running after the initial thread
has completed its stepi. The user would need to manually select
the thread and interrupt it, this might not be what the user
expects. However, this is not something this commit tries to
change.
The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.
- For both all-stop and non-stop modes, with in-line stepping:
+ user issues 'stepi',
+ [non-stop mode only] GDB stops all threads. In all-stop mode all
threads are already stopped.
+ GDB removes s/w breakpoint at syscall address,
+ GDB single steps just the thread of interest, all other threads
are left stopped,
+ New thread is created running,
+ Initial thread completes its step,
+ [non-stop mode only] GDB resumes all threads that it previously
stopped.
There are two problems in the in-line stepping scenario above:
1. The new thread might pass through the same code that the initial
thread is in (i.e. the clone syscall code), in which case it will
fail to hit the breakpoint in clone as this was removed so the
first thread can single step,
2. The new thread might trigger some other stop event before the
initial thread reports its step completion. If this happens we
end up triggering an assertion as GDB assumes that only the
thread being stepped should stop. The assert looks like this:
infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
- For both all-stop and non-stop modes, with displaced stepping:
+ user issues 'stepi',
+ GDB starts the displaced step, moves thread's PC to the
out-of-line scratch pad, maybe adjusts registers,
+ GDB single steps the thread of interest, [non-stop mode only] all
other threads are left as they were, either running or stopped.
In all-stop, all other threads are left stopped.
+ New thread is created running,
+ Initial thread completes its step, GDB re-adjusts its PC,
restores/releases scratchpad,
+ [non-stop mode only] GDB resumes the thread, now past its
breakpoint.
+ [all-stop mode only] GDB resumes all threads.
There is one problem with the displaced stepping scenario above:
3. When the parent thread completed its step, GDB adjusted its PC,
but did not adjust the child's PC, thus that new child thread
will continue execution in the scratch pad, invoking undefined
behavior. If you're lucky, you see a crash. If unlucky, the
inferior gets silently corrupted.
What is needed is for GDB to have more control over whether the new
thread is created running or not. Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted. The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed. Issue #3
above requires that GDB is informed of when a thread clones itself,
and of what is the child's ptid, so that GDB can fixup both the parent
and the child.
When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues. The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB. Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.
Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED). However, those are reported
for the new/child thread. That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3). To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent. TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child. In fact, for some targets, like e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".
The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event. This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process. Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is held stopped
until GDB explicitly resumes it. This addresses the in-line stepping
case (issues #1 and #2).
The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONE, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).
The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.
Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.
Subsequent patches will add clone events support to the remote
protocol and gdbserver.
displaced_step_in_progress_thread becomes unused with this patch, but
a new use will reappear later in the series. To avoid deleting it and
readding it back, this patch marks it with attribute unused, and the
latter patch removes the attribute again. We need to do this because
the function is static, and with no callers, the compiler would warn,
(error with -Werror), breaking the build.
This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop. We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough. The testcase passes on native GNU/Linux, but fails
against GDBserver. GDBserver will be fixed by a later patch in the
series.
Co-authored-by: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
I noticed that on an Ubuntu 20.04 system, after a following patch
("Step over clone syscall w/ breakpoint,
TARGET_WAITKIND_THREAD_CLONED"), the gdb.threads/step-over-exec.exp
was passing cleanly, but still, we'd end up with four new unexpected
GDB core dumps:
=== gdb Summary ===
# of unexpected core files 4
# of expected passes 48
That said patch is making the pre-existing
gdb.threads/step-over-exec.exp testcase (almost silently) expose a
latent problem in gdb/linux-nat.c, resulting in a GDB crash when:
#1 - a non-leader thread execs
#2 - the post-exec program stops somewhere
#3 - you kill the inferior
Instead of #3 directly, the testcase just returns, which ends up in
gdb_exit, tearing down GDB, which kills the inferior, and is thus
equivalent to #3 above.
Vis (after said patch is applied):
$ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true
...
(top-gdb) r
...
(gdb) b main
...
(gdb) r
...
Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69
69 argv0 = argv[0];
(gdb) c
Continuing.
[New Thread 0x7ffff7d89700 (LWP 2506975)]
Other going in exec.
Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28
28 foo ();
(gdb) k
...
Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
393 return m_suspend.waitstatus_pending_p;
(top-gdb) bt
#0 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
#1 0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345
#2 0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564
#3 0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284
#4 0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278
#5 0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247
#6 0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864
#7 0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590
#8 0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911
...
The root of the problem is that when a non-leader LWP execs, it just
changes its tid to the tgid, replacing the pre-exec leader thread,
becoming the new leader. There's no thread exit event for the execing
thread. It's as if the old pre-exec LWP vanishes without trace. The
ptrace man page says:
"PTRACE_O_TRACEEXEC (since Linux 2.5.46)
Stop the tracee at the next execve(2). A waitpid(2) by the
tracer will return a status value such that
status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
If the execing thread is not a thread group leader, the thread
ID is reset to thread group leader's ID before this stop.
Since Linux 3.0, the former thread ID can be retrieved with
PTRACE_GETEVENTMSG."
When the core of GDB processes an exec events, it deletes all the
threads of the inferior. But, that is too late -- deleting the thread
does not delete the corresponding LWP, so we end leaving the pre-exec
non-leader LWP stale in the LWP list. That's what leads to the crash
above -- linux_nat_target::kill iterates over all LWPs, and after the
patch in question, that code will look for the corresponding
thread_info for each LWP. For the pre-exec non-leader LWP still
listed, won't find one.
This patch fixes it, by deleting the pre-exec non-leader LWP (and
thread) from the LWP/thread lists as soon as we get an exec event out
of ptrace.
GDBserver does not need an equivalent fix, because it is already doing
this, as side effect of mourning the pre-exec process, in
gdbserver/linux-low.cc:
else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events)
{
...
/* Delete the execing process and all its threads. */
mourn (proc);
switch_to_thread (nullptr);
The crash with gdb.threads/step-over-exec.exp is not observable on
newer systems, which postdate the glibc change to move "libpthread.so"
internals to "libc.so.6", because right after the exec, GDB traps a
load event for "libc.so.6", which leads to GDB trying to open
libthread_db for the post-exec inferior, and, on such systems that
succeeds. When we load libthread_db, we call
linux_stop_and_wait_all_lwps, which, as the name suggests, stops all
lwps, and then waits to see their stops. While doing this, GDB
detects that the pre-exec stale LWP is gone, and deletes it.
If we use "catch exec" to stop right at the exec before the
"libc.so.6" load event ever happens, and issue "kill" right there,
then GDB crashes on newer systems as well. So instead of tweaking
gdb.threads/step-over-exec.exp to cover the fix, add a new
gdb.threads/threads-after-exec.exp testcase that uses "catch exec".
The test also uses the new "maint info linux-lwps" command if testing
on Linux native, which also exposes the stale LWP problem with an
unfixed GDB.
Also tweak a comment in infrun.c:follow_exec referring to how
linux-nat.c used to behave, as it would become stale otherwise.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I21ec18072c7750f3a972160ae6b9e46590376643
This adds a maintenance command that lets you list all the LWPs under
control of the linux-nat target.
For example:
(gdb) maint info linux-lwps
LWP Ptid Thread ID
560948.561047.0 None
560948.560948.0 1.1
This shows that "560948.561047.0" LWP doesn't map to any thread_info
object, which is bogus. We'll be using this in a testcase in a
following patch.
Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: Ic4e9e123385976e5cd054391990124b7a20fb3f5
When building gdb with -O2, we run into:
...
gdb/tui/tui-disasm.c: In function ‘CORE_ADDR tui_find_disassembly_address \
(gdbarch*, CORE_ADDR, int)’:
gdb/tui/tui-disasm.c:293:7: warning: ‘last_addr’ may be used uninitialized \
in this function [-Wmaybe-uninitialized]
if (last_addr < pc)
^~
...
The warning triggers since commit 72535eb14b ("[gdb/tui] Fix segfault in
tui_find_disassembly_address").
Fix the warning by ensuring that last_addr is initialized at the point of
use:
...
+ last_addr = asm_lines.back ().addr;
if (last_addr < pc)
...
Tested on x86_64-linux.
In tui_find_disassembly_address we find an assert:
...
if (asm_lines.size () < max_lines)
{
if (!possible_new_low.has_value ())
return new_low;
/* Take the best possible match we have. */
new_low = *possible_new_low;
next_addr = tui_disassemble (gdbarch, asm_lines, new_low, max_lines);
last_addr = asm_lines.back ().addr;
gdb_assert (asm_lines.size () >= max_lines);
}
...
The comment right above:
...
/* If we failed to disassemble the required number of lines then the
following walk forward is not going to work, it assumes that
ASM_LINES contains exactly MAX_LINES entries. Instead we should
consider falling back to a previous possible start address in
POSSIBLE_NEW_LOW. */
...
claims that the more strict asm_lines.size () == max_line is required.
Update the assert to reflect this, and move it to after the if because it's
supposed to hold in general, not just when entering the if.
Tested on x86_64-linux.
defs.h declares re_comp, but it shouldn't. If this is needed, it
should be picked up from xregex.h via gdb_regex.h.
Tested by rebuilding.
Reviewed-by: Kevin Buettner <kevinb@redhat.com>
The test currently fails for IEEE 128-bit floating point types. PowerPC
supports the IBM double 128-bit floating point format and IEEE 128-bit
format. The IBM double 128-bit floating point format uses two 64-bit
floating point registers to store the 128-bit value. The IEEE 128-bit
floating point format stores the value in a single 128-bit vector-scalar
register (vsr).
The various floating point values, 32-bit float, 64-bit double, IBM double
128-bit float and IEEE 128-bit floating point numbers are all mapped to the
DWARF fpr numbers. The issue is the IEEE 128-bit floating point values are
actually stored in a vsr not the fprs. This patch changes the register
mapping for the vsrs from the fpr to the vsr registers so the value is
properly accessed by GDB. The functions rs6000_linux_register_to_value,
rs6000_linux_value_to_register, rs6000_linux_value_from_register check if
the value is an IEEE 128-bit floating point value and adjust the register
number as needed. The test in function rs6000_convert_register_p is fixed
so it is only true for floating point values.
This patch fixes three regression tests in gdb.base/store.exp.
The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE
with no regressions.
Overview of issues fixed by the patch.
The primary issue this patch fixes is the DWARF register mapping for
Linux. The changes in ppc-linux-tdep.c fix the DWARF register mapping
issues. The register mapping issue is responsible for two of the
five regression bugs seen in gdb.base/store.exp.
Once the register mapping was fixed, an underlying issue with the unwinding
of the signal trampoline in common-code in ifrun.c was found. This
underlying bug is best described by Ulrich in the following description.
The unwinder bug shows up on platforms where the kernel uses a trampoline
to dispatch "calls to" the signal handler (not just *returns from* the
signal handler). Many platforms use a trampoline for signal return, and
that is working fine, but the only platform I'm (Ulrich) aware of that
uses a trampoline for signal handler calls is (recent kernels for)
PowerPC. I believe the rationale for using a trampoline here
is to improve performance by avoiding unbalancing of the
branch predictor's call/return stack.
However, on PowerPC the bug is dormant as well as it is hidden
by *another* bug that prevents correct unwinding out of the
signal trampoline. This is because the custom CFI for the
trampoline uses a register number (VSCR) that is not ever used
by compiler-generated CFI, and that particular register is
mapped to an invalid number by the current PowerPC DWARF mapper.
The underlying unwinder bug is exposed by the "new" regression failures
in gdb.base/sigstep.exp. These failures were previously masked by
the fact that GDB was not seeing a valid frame when it tried to unwind
the frames. The sigstep.exp test is specifically testing stepping into
a signal handler. With the correct DWARF register mapping in place,
specifically the VSCR mapping, the signal trampoline code now unwinds to a
valid frame exposing the pre-existing bug in how the signal handler on
PowerPC works. The one line change infrun.c fixes the exiting bug in
the common-code for platforms that use a trampoline to dispatch calls
to the signal handler by not stopping in the SIGTRAMP_FRAME.
Detailed description of the DWARF register mapping fix.
The PowerPC DWARF register mapping is the same for the .eh_frame and
.debug_frame on Linux. PowerPC uses different mapping for .eh_frame and
.debug_frame on other operating systems. The current GDB support for
mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum and
rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct for Linux.
The files have some legacy mappings for spe_acc, spefscr, EV which was
removed from GCC in 2017.
This patch adds a two new functions rs6000_linux_dwarf2_reg_to_regnum,
and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c to handle
the DWARF register mappings on Linux. Function
rs6000_linux_dwarf2_reg_to_regnum is installed for both gdb_dwarf_to_regnum
and gdbarch_stab_reg_to_regnum since the mappings are the same.
The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated to
call set_gdbarch_dwarf2_reg_to_regnum map the new function
rs6000_linux_dwarf2_reg_to_regnum for the architecture. Similarly,
dwarf2_frame_set_adjust_regnum is called to map
rs6000_linux_adjust_frame_regnum into the architecture.
Additional detail on the signal handling fix.
The specific sequence of events for handling a signal on most
architectures is as follows:
1) Some code is running when a signal arrives.
2) The kernel handles the signal and dispatches to the handler.
...
However on PowerPC the sequence of events is:
1) Some code is running when a signal arrives.
2) The kernel handles the signal and dispatches to the trampoline.
3) The trampoline performs a normal function call to the handler.
...
We want the "nexti" to step into, not over, signal handlers invoked by
the kernel. This is the case for most platforms as the kernel puts a
signal trampoline frame onto the stack to handle proper return after the
handler. However, on some platforms such as PowerPC, the kernel actually
uses a trampoline to handle *invocation* of the handler. We do not
want GDB to stop in the SIGTRAMP_FRAME. The issue is fixed in function
process_event_stop_test by adding a check that the frame is not a
SIGTRAMP_FRAME to the if statement to stop in a subroutine call. This
prevents GDB from erroneously detecting the trampoline invocation as a
subroutine call.
This patch fixes two regression test failures in gdb.base/store.exp.
The patch then fixes an exposed, dormant, signal handling issue that
is exposed in the signal handling test gdb.base/sigstep.exp.
The patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10 with
no new regressions. Note, only two of the five failures in store.exp
are fixed. The remaining three failures are fixed in a following
patch.
I noticed that if GDB is using a remote or extended-remote target,
then, if an inferior call caused a new thread to appear, or for an
existing thread to exit, then these events are not reported to the
user.
The problem is that for these targets GDB relies on a call to
update_thread_list to learn about changes to the inferior's thread
list.
If GDB doesn't pass through the normal stop code then GDB will not
call update_thread_list, and so will not report changes in the thread
list.
This commit adds an additional update_thread_list call, after which
thread events are correctly reported.
I noticed that sometimes the value returned by $_inferior_thread_count
can become out of sync with the actual thread count of the inferior,
and will disagree with the number of threads reported by 'info
threads'. This commit fixes this issue.
The cause of the problem is that 'info threads' includes a call to
update_thread_list, this can be seen in print_thread_info_1 in
thread.c, while $_inferior_thread_count doesn't include a similar
call, see the function inferior_thread_count_make_value also in
thread.c.
Of course, this is only a problem when GDB is running on a target that
relies on update_thread_list calls to learn about new threads,
e.g. remote or extended-remote targets. Native targets generally
learn about new threads as soon as they appear and will not have this
problem.
I ran into this issue when writing a test for the next commit which
uses inferior function calls to add an remove threads from an
inferior. But for testing I've made use of non-stop mode and
asynchronous inferior execution; by reading the inferior state I can
know when a new thread has been created, at which point I can print
$_inferior_thread_count while the inferior is still running. This is
important, if I stop the inferior then GDB will pass through an
update_thread_list call in the normal stop code, which will
synchronise the thread list, after which $_inferior_thread_count will
report the correct value.
With this change in place $_inferior_thread_count is now correct.
Add a new command completer function for the disassemble command.
There are two things that this completion function changes. First,
after the previous commit, the new function calls skip_over_slash_fmt,
which means that hitting tab after entering a /OPT flag now inserts a
space ready to start typing the address to disassemble at:
(gdb) disassemble /r<TAB>
(gdb) disassemble /r <CURSOR>
But also, we now get symbol completion after a /OPT option set,
previously this would do nothing:
(gdb) disassemble /r mai<TAB>
But now:
(gdb) disassemble /r mai<TAB>
(gdb) disassemble /r main <CURSOR>
Which was my main motivation for working on this commit.
However, I have made a second change in the completion function.
Currently, the disassemble command calls the generic
location_completer function, however, the disassemble docs say:
Note that the 'disassemble' command's address arguments are specified
using expressions in your programming language (*note Expressions:
Expressions.), not location specs (*note Location Specifications::).
So, for example, if you want to disassemble function 'bar' in file
'foo.c', you must type 'disassemble 'foo.c'::bar' and not 'disassemble
foo.c:bar'.
And indeed, if I try:
(gdb) disassemble hello.c:main
No symbol "hello" in current context.
(gdb) disassemble hello.c::main
No symbol "hello" in current context.
(gdb) disassemble 'hello.c'::main
Dump of assembler code for function main:
... snip ...
But, if I do this:
(gdb) disassemble hell<TAB>
(gdb) disassemble hello.c:<CURSOR>
which is a consequence of using the location_completer function. So
in this commit, after calling skip_over_slash_fmt, I forward the bulk
of the disassemble command completion to expression_completer. Now
when I try this:
(gdb) disassemble hell<TAB>
gives nothing, which I think is an improvement. There is one slight
disappointment, if I do:
(gdb) disassemble 'hell<TAB>
I still get nothing. I had hoped that this would expand to:
'hello.c':: but I guess this is a limitation of the current
expression_completer implementation, however, I don't think this is a
regression, the previous expansion was just wrong. Fixing
expression_completer is out of scope for this commit.
I've added some disassembler command completion tests, and also a test
that disassembling using 'FILE'::FUNC syntax works, as I don't think
that is tested anywhere.
Move the function skip_over_slash_fmt into completer.c, and make it
extern, with a declaration in completer.h.
This is a refactor in order to support the next commit. I've not
changed any of the code in skip_over_slash_fmt.
There should be no user visible changes after this commit.
The disassembler gained a new /b flag in this commit:
commit d4ce49b7ac
Date: Tue Jun 21 20:23:35 2022 +0100
gdb: disassembler opcode display formatting
The /b and /r flags result in the instruction opcodes displayed in
different formats, so it's not possible to have both at the same
time. Currently the /b flag overrides the /r flag.
We have a similar situation with the /m and /s flags, but here, if the
user tries to use both flags then they will get an error.
I think the error is clearer, so in this commit I propose that we add
an error if /r and /b are both used.
Obviously this change breaks backwards compatibility. I don't have a
compelling argument for why we should make the change beyond my
feeling that it was a mistake not to add this error from the start,
and that the new behaviour is better.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
When compiling gdb with -fsanitize=address on ARM, I get a crash in test
gdb.arch/arm-disp-step.exp, reproduced easily with:
$ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.arch/arm-disp-step/arm-disp-step -ex "break *test_call_end"
Reading symbols from testsuite/outputs/gdb.arch/arm-disp-step/arm-disp-step...
=================================================================
==23295==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb4a14fd1 at pc 0x01a48871 bp 0xbeab8490 sp 0xbeab8494
Since it doesn't require running the program, it can be reproduced locally on a
dev machine other than ARM, after acquiring the test binary.
The length of the allocate buffer `buf` is 1, and we try to extract an
integer of size 2 from it. The length of 1 comes from the subtraction
`bpaddr - boundary`. Normally, on ARM, all instructions are aligned on
a multiple of 2, so it's weird for this subtraction to result in 1. In
this case, boundary comes from the result of find_pc_partial_function
returning 0x549:
(gdb) p/x bpaddr
$2 = 0x54a
(gdb) p/x boundary
$3 = 0x549
(gdb) p/x bpaddr - boundary
$4 = 0x1
0x549 is the address of the test_call_subr label, 0x548, with the thumb
bit enabled. Before doing some math with the address, I think we need
to strip the thumb bit, like is done elsewhere (for instance for bpaddr
earlier in the same function).
I wonder if find_pc_partial_function should do that itself, in order to
return an address that is suitable for arithmetic. In any case, that
would be a change with a broad impact, so for now just fix the issue
locally.
After the patch:
$ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.arch/arm-disp-step/arm-disp-step -ex "break *test_call_end"
Reading symbols from testsuite/outputs/gdb.arch/arm-disp-step/arm-disp-step...
Breakpoint 1 at 0x54a: file /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.arch/arm-disp-step.S, line 103.
Change-Id: I74fc458dbea0d2c1e1f5eadd90755188df089288
Approved-By: Luis Machado <luis.machado@arm.com>
common-defs.h has a few defines that I suspect were used during the
transition to C++. These aren't needed any more, so remove them.
Tested by rebuilding.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
When resizing from a big to small terminal size, and you have a
TUI python window that would then be outside of the new size,
valgrind shows this error:
==3389== Invalid read of size 1
==3389== at 0xC3DFEE: wnoutrefresh (lib_refresh.c:167)
==3389== by 0xC3E3C9: wrefresh (lib_refresh.c:63)
==3389== by 0xA9766C: tui_unhighlight_win(tui_win_info*) (tui-wingeneral.c:134)
==3389== by 0x98921C: tui_py_window::rerender() (py-tui.c:183)
==3389== by 0xA8C23C: tui_layout_split::apply(int, int, int, int, bool) (tui-layout.c:1030)
==3389== by 0xA8C2A2: tui_layout_split::apply(int, int, int, int, bool) (tui-layout.c:1033)
==3389== by 0xA8C23C: tui_layout_split::apply(int, int, int, int, bool) (tui-layout.c:1030)
==3389== by 0xA8B1F8: tui_apply_current_layout(bool) (tui-layout.c:81)
==3389== by 0xA95CDB: tui_resize_all() (tui-win.c:525)
==3389== by 0xA95D1E: tui_async_resize_screen(void*) (tui-win.c:562)
==3389== by 0x6B855D: invoke_async_signal_handlers() (async-event.c:234)
==3389== by 0xC0CEF8: gdb_do_one_event(int) (event-loop.cc:199)
==3389== Address 0x115cc214 is 1,332 bytes inside a block of size 2,240 free'd
==3389== at 0x4A0A430: free (vg_replace_malloc.c:446)
==3389== by 0xC3CF7D: _nc_freewin (lib_newwin.c:121)
==3389== by 0xA8B1C6: tui_apply_current_layout(bool) (tui-layout.c:78)
==3389== by 0xA95CDB: tui_resize_all() (tui-win.c:525)
==3389== by 0xA95D1E: tui_async_resize_screen(void*) (tui-win.c:562)
==3389== by 0x6B855D: invoke_async_signal_handlers() (async-event.c:234)
==3389== by 0xC0CEF8: gdb_do_one_event(int) (event-loop.cc:199)
==3389== by 0x8E40E9: captured_command_loop() (main.c:407)
==3389== by 0x8E5E54: gdb_main(captured_main_args*) (main.c:1324)
==3389== by 0x62AC04: main (gdb.c:39)
It's because tui_py_window::m_inner_window still has the outside
coordinates, and wnoutrefresh then does an out-of-bounds access.
Fix this by resetting m_inner_window on every resize, it will anyways
be recreated in the next rerender call.
Approved-By: Andrew Burgess <aburgess@redhat.com>
In AIX unused or constant variables are collected as garbage by the linker and in the dwarf dump
an address with all f's in hexadecimal are assigned. Hence the testcase fails with many failures stating
it cannot access memory.
This patch is a small change to get it working in AIX as well.
Recently added test-case gdb.dwarf2/dw2-gas-workaround.exp:
- passes when gdb is configured using $(cd ../src; pwd)/configure, but
- fails when using ../src/configure.
Fix this by making the matching more precise:
...
- -re -wrap "$objdir.*" {
+ -re -wrap "name_for_id = $objdir/$srcfile\r\n.*" {
...
such that we only fail on the line:
...
[symtab-create] start_subfile: name = dw2-lines.c, name_for_id = \
/data/vries/gdb/leap-15-4/build/gdb/testsuite/dw2-lines.c^M
...
Reported-By: Carl Love <cel@us.ibm.com>
While working on background reading of DWARF, I came across the
DWZ-reading code. This code can query the user (via the debuginfod
support) -- something that cannot be done off the main thread.
Looking into it, I realized that this code can be run much earlier,
avoiding this problem. Digging a bit deeper, I also found a
discrepancy here between how the DWARF reader works in "readnow" mode
as compared to the normal modes.
This patch cleans this up by trying to read the DWZ file earlier, and
also by having the DWARF reader convert any exception here into a
warning. This unifies the various cases, but also makes it so that
errors do not prevent gdb from continuing on to the extent possible.
Regression tested on x86-64 Fedora 38.
On AlmaLinux 9.2 powerpc64le I run into:
...
(gdb) PASS: gdb.ada/array_return.exp: continuing to Create_Small_Float_Vector
finish^M
Run till exit from #0 pck.create_small_float_vector () at pck.adb:30^M
0x00000000100022d4 in p () at p.adb:25^M
25 Vector := Create_Small_Float_Vector;^M
Value returned is $3 = (2.80259693e-45, 2.80259693e-45)^M
(gdb) FAIL: gdb.ada/array_return.exp: value printed by finish of Create_Small_Float_Vector
...
while this is expected:
...
Value returned is $3 = (4.25, 4.25)^M
...
The problem is here in ppc64_aggregate_candidate:
...
if (!get_array_bounds (type, &low_bound, &high_bound))
return -1;
count *= high_bound - low_bound
...
The array type (containing 2 elements) is:
...
type Small_Float_Vector is array (1 .. 2) of Float;
...
so we have:
...
(gdb) p low_bound
$1 = 1
(gdb) p high_bound
$2 = 2
...
but we calculate the number of elements in the array using
"high_bound - low_bound", which is 1.
Consequently, gdb fails to correctly classify the type as a ELFv2 homogeneous
aggregate.
Fix this by calculating the number of elements in the array by using
"high_bound - low_bound + 1" instead.
Furthermore, high_bound can (in general, though perhaps not here) also be
smaller than low_bound, so to be safe take that into account as well:
...
LONGEST nr_array_elements = (low_bound > high_bound
? 0
: (high_bound - low_bound + 1));
count *= nr_array_elements;
...
Tested on powerpc64le-linux.
Approved-By: Ulrich Weigand <uweigand@de.ibm.com>
PR tdep/31015
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31015
When running test-case gdb.tui/tui-layout-asm-short-prog.exp on AlmaLinux 9.2
ppc64le, I run into:
...
FAIL: gdb.tui/tui-layout-asm-short-prog.exp: check asm box contents
...
The problem is that we get:
...
7 [ No Assembly Available ]
...
because tui_get_begin_asm_address doesn't succeed.
In more detail, tui_get_begin_asm_address calls:
...
find_line_pc (sal.symtab, sal.line, &addr);
...
with:
...
(gdb) p *sal.symtab
$5 = {next = 0x130393c0, m_compunit = 0x130392f0, m_linetable = 0x0,
filename = "tui-layout-asm-short-prog.S",
filename_for_id = "$gdb/build/gdb/testsuite/tui-layout-asm-short-prog.S",
m_language = language_asm, fullname = 0x0}
(gdb) p sal.line
$6 = 1
...
The problem is the filename_for_id which is the source file prefixed with the
compilation dir rather than the source dir.
This is due to faulty debug info generated by gas, PR28629:
...
<1a> DW_AT_name : tui-layout-asm-short-prog.S
<1e> DW_AT_comp_dir : $gdb/build/gdb/testsuite
<22> DW_AT_producer : GNU AS 2.35.2
...
The DW_AT_name is relative, and it's relative to the DW_AT_comp_dir entry,
making the effective name $gdb/build/gdb/testsuite/tui-layout-asm-short-prog.S.
The bug is fixed starting version 2.38, where we get instead:
...
<1a> DW_AT_name :
$gdb/src/gdb/testsuite/gdb.tui/tui-layout-asm-short-prog.S
<1e> DW_AT_comp_dir : $gdb/build/gdb/testsuite
<22> DW_AT_producer : GNU AS 2.38
...
Work around the faulty debug info by constructing the filename_for_id using
the second directory from the directory table in the .debug_line header:
...
The Directory Table (offset 0x22, lines 2, columns 1):
Entry Name
0 $gdb/build/gdb/testsuite
1 $gdb/src/gdb/testsuite/gdb.tui
...
Note that the used gas contains a backport of commit 3417bfca67 ("GAS:
DWARF-5: Ensure that the 0'th entry in the directory table contains the
current working directory."), because directory 0 is correct. With the
unpatched 2.35.2 release the directory 0 entry is incorrect: it's a copy of
entry 1.
Add a dwarf assembly test-case that reflects the debug info as generated by
unpatched gas 2.35.2.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
This patch implements the DAP setVariable request.
setVariable is a bit odd in that it specifies the variable to modify
by passing in the variable's container and the name of the variable.
This approach can't handle variable shadowing (there are a couple of
open DAP bugs on this topic), so this patch renames duplicates to
avoid the problem.
I stumbled across a few spots that mention that a function
"invalidates frame" and also assignments of NULL to a frame_info_ptr.
This code isn't harmful, but is also unnecessary since the
introduction of frame_info_ptr -- nowadays frame invalidations are
handled automatically.
Regression tested on x86-64 Fedora 38.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
On a big-endian ARM machine, the "return" command resulted in the
wrong value being returned when the function had a fixed-point return
type. This patch fixes the problem by unpacking and repacking the
fixed-point type appropriately.
Approved-By: Luis Machado <luis.machado@arm.com>