Currently, each target backend is responsible for printing "[Thread
...exited]" before deleting a thread. This leads to unnecessary
differences between targets, like e.g. with the remote target, we
never print such messages, even though we do print "[New Thread ...]".
E.g., debugging the gdb.threads/attach-many-short-lived-threads.exp
with gdbserver, letting it run for a bit, and then pressing Ctrl-C, we
currently see:
(gdb) c
Continuing.
^C[New Thread 3850398.3887449]
[New Thread 3850398.3887500]
[New Thread 3850398.3887551]
[New Thread 3850398.3887602]
[New Thread 3850398.3887653]
...
Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
(gdb)
Above, we only see "New Thread" notifications, even though threads
were deleted.
After this patch, we'll see:
(gdb) c
Continuing.
^C[Thread 3558643.3577053 exited]
[Thread 3558643.3577104 exited]
[Thread 3558643.3577155 exited]
[Thread 3558643.3579603 exited]
...
[New Thread 3558643.3597415]
[New Thread 3558643.3600015]
[New Thread 3558643.3599965]
...
Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
(gdb) q
This commit fixes this by moving the thread exit printing to common
code instead, triggered from within delete_thread (or rather,
set_thread_exited).
There's one wrinkle, though. While most targest want to print:
[Thread ... exited]
the Windows target wants to print:
[Thread ... exited with code <exit_code>]
... and sometimes wants to suppress the notification for the main
thread. To address that, this commits adds a delete_thread_with_code
function, only used by that target (so far).
This fix was originally posted as part of a larger series:
https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-1-pedro@palves.net/
But didn't really need to be part of that series. In order to get
this fix merged sooner, I (Andrew Burgess) have rebased this commit
outside of the original series. Any bugs introduced while splitting
this patch out and rebasing, are entirely my own.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30129
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
After this commit:
commit a78ef87574
Date: Wed Jun 22 18:10:00 2022 +0100
Always emit =thread-exited notifications, even if silent
The function mi_interp::on_thread_exited (or mi_thread_exit as the
function was called back then) no longer makes use of the "silent"
parameter.
As a result there is no difference between inferior::clear_thread_list
with silent true or false, because:
- None of the interpreter ::on_thread_exited functions rely on the
silent parameter, and
- None of GDB's thread_exit observers rely on the silent parameter
either.
This commit removes the silent parameter from
inferior::clear_thread_list, and makes the function always silent.
This commit was originally part of a larger series:
https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-1-pedro@palves.net/
But didn't really need to be part of that series. I had an interest
in seeing this patch merged:
https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-31-pedro@palves.net/
Which also didn't really need to be part of the larger series, but
does depend, at least a little, on this commit. In order to get the
fix I'm interested in merged quicker, I (Andrew Burgess) have rebased
this commit outside of the original series. Any bugs introduced while
splitting this patch out and rebasing, are entirely my own.
There should be no user visible changes after this commit.
Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
This changes add_thread_with_info to accept a unique_ptr, making it
clear that it takes ownership of the passed-in pointer.
I can't test the AIX or Darwin changes, but I think they are
relatively obvious.
Add the maybe_switch_inferior function, which ensures that the given
inferior is the current one. Return an instantiated
scoped_restore_current_thread object only we actually needed to switch
inferior.
Returning a scoped_restore_current_thread requires it to be
move-constructible, so give it a move constructor.
Change-Id: I1231037102ed6166f2530399e8257ad937fb0569
Reviewed-By: Pedro Alves <pedro@palves.net>
Make find_thread_ptid (the overload that takes a process_stratum_target)
a method of process_stratum_target.
Change-Id: Ib190a925a83c6b93e9c585dc7c6ab65efbdd8629
Reviewed-By: Tom Tromey <tom@tromey.com>
Make find_thread_ptid (the overload that takes an inferior) a method of
struct inferior.
Change-Id: Ie5b9fa623ff35aa7ddb45e2805254fc8e83c9cd4
Reviewed-By: Tom Tromey <tom@tromey.com>
I noticed that breakpoint::print_recreate_thread was printing the
global thread-id. This function is used to implement the 'save
breakpoints' command, and should be writing out suitable CLI commands
for recreating the current breakpoints. The CLI does not use global
thread-ids, but instead uses the inferior specific thread-ids,
e.g. "2.1".
After some discussion on the mailing list it was suggested that the
most consistent solution would be for the saved breakpoints file to
always contain the inferior-qualified thread-id, so the file would
include "thread 1.1" instead of just "thread 1", even when there is
only a single inferior.
So, this commit adds print_full_thread_id, which is just like the
existing print_thread_id, only it always prints the inferior-qualified
thread-id.
I then update the existing print_thread_id to make use of this new
function, and finally, I update breakpoint::print_recreate_thread to
also use this new function.
There's a multi-inferior test that confirms the saved breakpoints file
correctly includes the fully-qualified thread-id, and I've also
updated the single inferior test gdb.base/save-bp.exp to have it
validate that the saved breakpoints file includes the
inferior-qualified thread-id, even for this single inferior case.
Background
----------
When a thread-specific breakpoint is deleted as a result of the
specific thread exiting the function remove_threaded_breakpoints is
called which sets the disposition of the breakpoint to
disp_del_at_next_stop and sets the breakpoint number to 0. Setting
the breakpoint number to zero has the effect of hiding the breakpoint
from the user. We also print a message indicating that the breakpoint
has been deleted.
It was brought to my attention during a review of another patch[1]
that setting a breakpoints number to zero will suppress the MI
breakpoint-deleted notification for that breakpoint, and indeed, this
can be seen to be true, in delete_breakpoint, if the breakpoint number
is zero, then GDB will not notify the breakpoint_deleted observer.
It seems wrong that a user created, thread-specific breakpoint, will
have a =breakpoint-created notification, but will not have a
=breakpoint-deleted notification. I suspect that this is a bug.
[1] https://sourceware.org/pipermail/gdb-patches/2023-February/196560.html
The First Problem
-----------------
During my initial testing I wanted to see how GDB handled the
breakpoint after it's number was set to zero. To do this I created
the testcase gdb.threads/thread-bp-deleted.exp. This test creates a
worker thread, which immediately exits. After the worker thread has
exited the main thread spins in a loop.
In GDB I break once the worker thread has been created and place a
thread-specific breakpoint, then use 'continue&' to resume the
inferior in non-stop mode. The worker thread then exits, but the main
thread never stops - instead it sits in the spin. I then tried to use
'maint info breakpoints' to see what GDB thought of the
thread-specific breakpoint.
Unfortunately, GDB crashed like this:
(gdb) continue&
Continuing.
(gdb) [Thread 0x7ffff7c5d700 (LWP 1202458) exited]
Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.
maint info breakpoints
... snip some output ...
Fatal signal: Segmentation fault
----- Backtrace -----
0x5ffb62 gdb_internal_backtrace_1
../../src/gdb/bt-utils.c:122
0x5ffc05 _Z22gdb_internal_backtracev
../../src/gdb/bt-utils.c:168
0x89965e handle_fatal_signal
../../src/gdb/event-top.c:964
0x8997ca handle_sigsegv
../../src/gdb/event-top.c:1037
0x7f96f5971b1f ???
/usr/src/debug/glibc-2.30-2-gd74461fa34/nptl/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0xe602b0 _Z15print_thread_idP11thread_info
../../src/gdb/thread.c:1439
0x5b3d05 print_one_breakpoint_location
../../src/gdb/breakpoint.c:6542
0x5b462e print_one_breakpoint
../../src/gdb/breakpoint.c:6702
0x5b5354 breakpoint_1
../../src/gdb/breakpoint.c:6924
0x5b58b8 maintenance_info_breakpoints
../../src/gdb/breakpoint.c:7009
... etc ...
As the thread-specific breakpoint is set to disp_del_at_next_stop, and
GDB hasn't stopped yet, then the breakpoint still exists in the global
breakpoint list.
The breakpoint will not show in 'info breakpoints' as its number is
zero, but it will show in 'maint info breakpoints'.
As GDB prints the breakpoint, the thread-id for the breakpoint is
printed as part of the 'stop only in thread ...' line. Printing the
thread-id involves calling find_thread_global_id to convert the global
thread-id into a thread_info*. Then calling print_thread_id to
convert the thread_info* into a string.
The problem is that find_thread_global_id returns nullptr as the
thread for the thread-specific breakpoint has exited. The
print_thread_id assumes it will be passed a non-nullptr. As a result
GDB crashes.
In this commit I've added an assert to print_thread_id (gdb/thread.c)
to check that the pointed passed in is not nullptr. This assert would
have triggered in the above case before GDB crashed.
MI Notifications: The Dangers Of Changing A Breakpoint's Number
---------------------------------------------------------------
Currently the delete_breakpoint function doesn't trigger the
breakpoint_deleted observer for any breakpoint with the number zero.
There is a comment explaining why this is the case in the code; it's
something about watchpoints. But I did consider just removing the 'is
the number zero' guard and always triggering the breakpoint_deleted
observer, figuring that I'd then fix the watchpoint issue some other
way.
But I realised this wasn't going to be good enough. When the MI
notification was delivered the number would be zero, so any frontend
parsing the notifications would not be able to match
=breakpoint-deleted notification to the earlier =breakpoint-created
notification.
What this means is that, at the point the breakpoint_deleted observer
is called, the breakpoint's number must be correct.
MI Notifications: The Dangers Of Delaying Deletion
--------------------------------------------------
The test I used to expose the above crash also brought another problem
to my attention. In the above test we used 'continue&' to resume,
after which a thread exited, but the inferior didn't stop. Recreating
the same test in the MI looks like this:
-break-insert -p 2 main
^done,bkpt={number="2",type="breakpoint",disp="keep",...<snip>...}
(gdb)
-exec-continue
^running
*running,thread-id="all"
(gdb)
~"[Thread 0x7ffff7c5d700 (LWP 987038) exited]\n"
=thread-exited,id="2",group-id="i1"
~"Thread-specific breakpoint 2 deleted - thread 2 no longer in the thread list.\n"
At this point the we have a single thread left, which is still
running:
-thread-info
^done,threads=[{id="1",target-id="Thread 0x7ffff7c5eb80 (LWP 987035)",name="thread-bp-delet",state="running",core="4"}],current-thread-id="1"
(gdb)
Notice that we got the =thread-exited notification from GDB as soon as
the thread exited. We also saw the CLI line from GDB, the line
explaining that breakpoint 2 was deleted. But, as expected, we didn't
see the =breakpoint-deleted notification.
I say "as expected" because the number was set to zero. But, even if
the number was not set to zero we still wouldn't see the
notification. The MI notification is driven by the breakpoint_deleted
observer, which is only called when we actually delete the breakpoint,
which is only done the next time GDB stops.
Now, maybe this is fine. The notification is delivered a little
late. But remember, by setting the number to zero the breakpoint will
be hidden from the user, for example, the breakpoint is removed from
the MI's -break-info command output.
This means that GDB is in a position where the breakpoint doesn't show
up in the breakpoint table, but a =breakpoint-deleted notification has
not yet been sent out. This doesn't seem right to me.
What this means is that, when the thread exits, we should immediately
be sending out the =breakpoint-deleted notification. We should not
wait for GDB to next stop before sending the notification.
The Solution
------------
My proposed solution is this; in remove_threaded_breakpoints, instead
of setting the disposition to disp_del_at_next_stop and setting the
number to zero, we now just call delete_breakpoint directly.
The notification will now be sent out immediately; as soon as the
thread exits.
As the number has not changed when delete_breakpoint is called, the
notification will have the correct number.
And as the breakpoint is immediately removed from the breakpoint list,
we no longer need to worry about 'maint info breakpoints' trying to
print the thread-id for an exited thread.
My only concern is that calling delete_breakpoint directly seems so
obvious that I wonder why the original patch (that added
remove_threaded_breakpoints) didn't take this approach. This code was
added in commit 49fa26b041, but the commit message offers no clues
to why this approach was taken, and the original email thread offers
no insights either[2]. There are no test regressions after making
this change, so I'm hopeful that this is going to be fine.
[2] https://sourceware.org/pipermail/gdb-patches/2013-September/106493.html
The Complication
----------------
Of course, it couldn't be that simple.
The script gdb.python/py-finish-breakpoint.exp had some regressions
during testing.
The problem was with the FinishBreakpoint.out_of_scope callback
implementation. This callback is supposed to trigger whenever the
FinishBreakpoint goes out of scope; and this includes when the thread
for the breakpoint exits.
The problem I ran into is the Python FinishBreakpoint implementation.
Specifically, after this change I was loosing some of the out_of_scope
calls.
The problem is that the out_of_scope call (of which I'm interested) is
triggered from the inferior_exit observer. Before my change the
observers were called in this order:
thread_exit
inferior_exit
breakpoint_deleted
The inferior_exit would trigger the out_of_scope call.
After my change the breakpoint_deleted notification (for
thread-specific breakpoints) occurs earlier, as soon as the
thread-exits, so now the order is:
thread_exit
breakpoint_deleted
inferior_exit
Currently, after the breakpoint_deleted call the Python object
associated with the breakpoint is released, so, when we get to the
inferior_exit observer, there's no longer a Python object to call the
out_of_scope method on.
My solution is to follow the model for how bpfinishpy_pre_stop_hook
and bpfinishpy_post_stop_hook are called, this is done from
gdbpy_breakpoint_cond_says_stop in py-breakpoint.c.
I've now added a new bpfinishpy_pre_delete_hook
gdbpy_breakpoint_deleted in py-breakpoint.c, and from this new hook
function I check and where needed call the out_of_scope method.
With this fix in place I now see the
gdb.python/py-finish-breakpoint.exp test fully passing again.
Testing
-------
Tested on x86-64/Linux with unix, native-gdbserver, and
native-extended-gdbserver boards.
New tests added to covers all the cases I've discussed above.
Approved-By: Pedro Alves <pedro@palves.net>
This helps resolve some cyclic include problem later in the series.
The only language-related thing frame.h needs is enum language, and that
is in defs.h.
Doing so reveals that a bunch of files were relying on frame.h to
include language.h, so fix the fallouts here and there.
Change-Id: I178a7efec1953c2d088adb58483bade1f349b705
Reviewed-By: Bruno Larsen <blarsen@redhat.com>
I noticed that pc_in_unmapped_range had a weird return type -- it was
returning a CORE_ADDR but intending to return a bool. This patch
changes all the pc_in_* functions to return bool instead.
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
Add a new convenience variable $_inferior_thread_count that contains
the number of live (non-exited) threads in the current inferior. This
can be used in command scripts, or breakpoint conditions, etc to
adjust the behaviour for multi-threaded inferiors.
This value is only stable in all-stop mode. In non-stop mode, where
new threads can be started, and existing threads exit, at any time,
this convenience variable can give a different value each time it is
evaluated.
Turns out we'll be gaining a new use of this function very soon, the
incoming AMDGPU port needs it. Let's add it back, as it isn't really
hurting anything.
This reverts commit 39b8a8090e.
Now that filtered and unfiltered output can be treated identically, we
can unify the printf family of functions. This is done under the name
"gdb_printf". Most of this patch was written by script.
A number of spots call printf_unfiltered only because they are in code
that should not be interrupted by the pager. However, I believe these
cases are all handled by infrun's blanket ban on paging, and so can be
converted to the default (_filtered) API.
After this patch, I think all the remaining _unfiltered calls are ones
that really ought to be. A few -- namely in complete_command -- could
be replaced by a scoped assignment to pagination_enabled, but for the
remainder, the code seems simple enough like this.
No kind of internal var uses it remove it. This makes the transition to
using a variant easier, since we don't need to think about where this
should be called (in a destructor or not), if it can throw, etc.
Change-Id: Iebbc867d1ce6716480450d9790410d6684cbe4dd
While working on function calls, I realized that the thread_fsm member
of struct thread_info is a raw pointer to a resource it owns. This
commit changes the type of the thread_fsm member to a std::unique_ptr in
order to signify this ownership relationship and slightly ease resource
management (no need to manually call delete).
To ensure consistent use, the field is made a private member
(m_thread_fsm). The setter method (set_thread_fsm) can then check
that it is incorrect to associate a FSM to a thread_info object if
another one is already in place. This is ensured by an assertion.
The function run_inferior_call takes an argument as a pointer to a
call_thread_fsm and installs it in it in a thread_info instance. Also
change this function's signature to accept a unique_ptr in order to
signify that the ownership of the call_thread_fsm is transferred during
the call.
No user visible change expected after this commit.
Tested on x86_64-linux with no regression observed.
Change-Id: Ia1224f72a4afa247801ce6650ce82f90224a9ae8
Same idea as 0fab795564 ("gdb: use ptid_t::to_string in infrun debug
messages"), but throughout GDB.
Change-Id: I62ba36eaef29935316d7187b9b13d7b88491acc1
While working on another patch I wanted to add some extra debug
information to the attach_command function. This required me to add a
new function to convert the thread_info::state variable to a string.
The new debug might be useful to others, and the state to string
function might be useful in other locations, so I thought I'd merge
it.
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commits were
performed by the script.
Add new commands:
set debug threads on|off
show debug threads
Prints additional debug information relating to thread creation and
deletion.
GDB already announces when threads are created of course.... most of
the time, but sometimes threads are added silently, in which case this
debug message is the only mechanism to see the thread being added.
Also, though GDB does announce when a thread exits, it doesn't
announce when the thread object is deleted, I've added a debug message
for that.
Additionally, having message printed through the debug system will
cause the messages to be nested to an appropriate depth when other
debug sub-systems are turned on (especially things like `infrun` and
`lin-lwp`).
This adds a 'task apply' command, which is the Ada tasking analogue of
'thread apply'. Unlike 'thread apply', it doesn't offer the
'ascending' flag; but otherwise it's essentially the same.
I stumbled on a bug caused by the fact that a code path read
target_waitstatus::value::sig (expecting it to contain a gdb_signal
value) while target_waitstatus::kind was TARGET_WAITKIND_FORKED. This
meant that the active union field was in fact
target_waitstatus::value::related_pid, and contained a ptid. The read
signal value was therefore garbage, and that caused GDB to crash soon
after. Or, since that GDB was built with ubsan, this nice error
message:
/home/simark/src/binutils-gdb/gdb/linux-nat.c:1271:12: runtime error: load of value 2686365, which is not a valid value for type 'gdb_signal'
Despite being a large-ish change, I think it would be nice to make
target_waitstatus safe against that kind of bug. As already done
elsewhere (e.g. dynamic_prop), validate that the type of value read from
the union matches what is supposed to be the active field.
- Make the kind and value of target_waitstatus private.
- Make the kind initialized to TARGET_WAITKIND_IGNORE on
target_waitstatus construction. This is what most users appear to do
explicitly.
- Add setters, one for each kind. Each setter takes as a parameter the
data associated to that kind, if any. This makes it impossible to
forget to attach the associated data.
- Add getters, one for each associated data type. Each getter
validates that the data type fetched by the user matches the wait
status kind.
- Change "integer" to "exit_status", "related_pid" to "child_ptid",
just because that's more precise terminology.
- Fix all users.
That last point is semi-mechanical. There are a lot of obvious changes,
but some less obvious ones. For example, it's not possible to set the
kind at some point and the associated data later, as some users did.
But in any case, the intent of the code should not change in this patch.
This was tested on x86-64 Linux (unix, native-gdbserver and
native-extended-gdbserver boards). It was built-tested on x86-64
FreeBSD, NetBSD, MinGW and macOS. The rest of the changes to native
files was done as a best effort. If I forgot any place to update in
these files, it should be easy to fix (unless the change happens to
reveal an actual bug).
Change-Id: I0ae967df1ff6e28de78abbe3ac9b4b2ff4ad03b7
The pattern for using execute_command_to_string is:
...
std::string output;
output = execute_fn_to_string (fn, term_out);
...
This results in a problem when using it in a try/catch:
...
try
{
output = execute_fn_to_string (fn, term_out)
}
catch (const gdb_exception &e)
{
/* Use output. */
}
...
If an expection was thrown during execute_fn_to_string, then the output
remains unassigned, while it could be worthwhile to known what output was
generated by gdb before the expection was thrown.
Fix this by returning the string using a parameter instead:
...
execute_fn_to_string (output, fn, term_out)
...
Also add a variant without string parameter, to support places where the
function is used while ignoring the result:
...
execute_fn_to_string (fn, term_out)
...
Tested on x86_64-linux.
This started out as changing thread_info::name to a unique_xmalloc_ptr.
That showed that almost all users of that field had the same logic to
get a thread's name: use thread_info::name if non-nullptr, else ask the
target. Factor out this logic in a new thread_name free function. Make
the field private (rename to m_name) and add some accessors.
Change-Id: Iebdd95f4cd21fbefc505249bd1d05befc466a2fc
Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and
when we want to indicate that there is no stop_pc available we set
this field back to a special value.
There are actually two special values used, in post_create_inferior
the stop_pc is set to 0. This is a little unfortunate, there are
plenty of embedded targets where 0 is a valid pc value. The more
common special value for stop_pc though, is set in
thread_info::set_executing, where the value (~(CORE_ADDR) 0) is used.
This commit changes things so that the stop_pc is instead a
gdb::optional. We can now explicitly reset the field to an
uninitialised state, we also have asserts that we don't read the
stop_pc when its in an uninitialised state (both in
gdbsupport/gdb_optional.h, when compiling with _GLIBCXX_DEBUG
defined, and in thread_info::stop_pc).
One situation where a thread will not have a stop_pc value is when the
thread is stopped as a consequence of GDB being in all stop mode, and
some other thread stopped at an interesting event. When GDB brings
all the other threads to a stop those other threads will not have a
stop_pc set (thus avoiding an unnecessary read of the pc register).
Previously, when GDB passed through handle_one (in infrun.c) the
threads executing flag was set to false and the stop_pc field was left
unchanged, i.e. it would (previous) have been left as ~0.
Now, handle_one leaves the stop_pc with no value.
This caused a problem when we later try to set these threads running
again, in proceed() we compare the current pc with the cached stop_pc.
If the thread was stopped via handle_one then the stop_pc would have
been left as ~0, and the compare (in proceed) would (likely) fail.
Now however, this compare tries to read the stop_pc when it has no
value and this would trigger an assert.
To resolve this I've added thread_info::stop_pc_p() which returns true
if the thread has a cached stop_pc. We should only ever call
thread_info::stop_pc() if we know that there is a cached stop_pc,
however, this doesn't mean that every call to thread_info::stop_pc()
needs to be guarded with a call to thread_info::stop_pc_p(), in most
cases we know that the thread we are looking at stopped due to some
interesting event in that thread, and so, we know that the stop_pc is
valid.
After running the testsuite I've seen no other situations where
stop_pc is read uninitialised.
There should be no user visible changes after this commit.
Rename thread_info::executing to thread_info::m_executing, and make it
private. Add a new get/set member functions, and convert GDB to make
use of these.
The only real change of interest in this patch is in thread.c where I
have deleted the helper function set_executing_thread, and now just
use the new set function thread_info::set_executing. However, the old
helper function set_executing_thread included some code to reset the
thread's stop_pc, so I moved this code into the new function
thread_info::set_executing. However, I don't believe there is
anywhere that this results in a change of behaviour, previously the
executing flag was always set true through a call to
set_executing_thread anyway.
If I debug a single-thread program and look at the infrun debug logs, I
see:
[infrun] start_step_over: stealing global queue of threads to step, length = 2
That makes no sense... turns out there's a buglet in
thread_step_over_chain_length, "num" should be initialized to 0. I
think this bug is a leftover from an earlier version of the code (not
merged upstream) that manually walked the list, where the first item was
implicitly counted (hence the 1).
Change-Id: I0af03aa93509aed36528be5076894dc156a0b5ce
When debugging a large number of threads (thousands), looking up a
thread by ptid_t using the inferior::thread_list linked list can add up.
Add inferior::thread_map, an std::unordered_map indexed by ptid_t, and
change the find_thread_ptid function to look up a thread using
std::unordered_map::find, instead of iterating on all of the
inferior's threads. This should make it faster to look up a thread
from its ptid.
Change-Id: I3a8da0a839e18dee5bb98b8b7dbeb7f3dfa8ae1c
Co-Authored-By: Pedro Alves <pedro@palves.net>
Looking up threads that are both resumed and have a pending wait
status to report is something that we do quite often in the fast path
and is expensive if there are many threads, since it currently requires
walking whole thread lists.
The first instance is in maybe_set_commit_resumed_all_targets. This is
called after handling each event in fetch_inferior_event, to see if we
should ask targets to commit their resumed threads or not. If at least
one thread is resumed but has a pending wait status, we don't ask the
targets to commit their resumed threads, because we want to consume and
handle the pending wait status first.
The second instance is in random_pending_event_thread, where we want to
select a random thread among all those that are resumed and have a
pending wait status. This is called every time we try to consume
events, to see if there are any pending events that we we want to
consume, before asking the targets for more events.
To allow optimizing these cases, maintain a per-process-target list of
threads that are resumed and have a pending wait status.
In maybe_set_commit_resumed_all_targets, we'll be able to check in O(1)
if there are any such threads simply by checking whether the list is
empty.
In random_pending_event_thread, we'll be able to use that list, which
will be quicker than iterating the list of threads, especially when
there are no resumed with pending wait status threads.
About implementation details: using the new setters on class
thread_info, it's relatively easy to maintain that list. Any time the
"resumed" or "pending wait status" property is changed, we check whether
that should cause the thread to be added or removed from the list.
In set_thread_exited, we try to remove the thread from the list, because
keeping an exited thread in that list would make no sense (especially if
the thread is freed). My first implementation assumed that a process
stratum target was always present when set_thread_exited is called.
That's however, not the case: in some cases, targets unpush themselves
from an inferior and then call "exit_inferior", which exits all the
threads. If the target is unpushed before set_thread_exited is called
on the threads, it means we could mistakenly leave some threads in the
list. I tried to see how hard it would be to make it such that targets
have to exit all threads before unpushing themselves from the inferior
(that would seem logical to me, we don't want threads belonging to an
inferior that has no process target). That seemed quite difficult and
not worth the time at the moment. Instead, I changed
inferior::unpush_target to remove all threads of that inferior from the
list.
As of this patch, the list is not used, this is done in the subsequent
patches.
The debug messages in process-stratum-target.c need to print some ptids.
However, they can't use target_pid_to_str to print them without
introducing a dependency on the current inferior (the current inferior
is used to get the current target stack). For debug messages, I find it
clearer to print the spelled out ptid anyway (the pid, lwp and tid
values). Add a ptid_t::to_string method that returns a string
representation of the ptid that is meant for debug messages, a bit like
we already have frame_id::to_string.
Change-Id: Iad8f93db2d13984dd5aa5867db940ed1169dbb67
A following patch will want to take some action when a pending wait
status is set on or removed from a thread. Add a getter and a setter on
thread_info for the pending waitstatus, so that we can add some code in
the setter later.
The thing is, the pending wait status field is in the
thread_suspend_state, along with other fields that we need to backup
before and restore after the thread does an inferior function call.
Therefore, make the thread_suspend_state member private
(thread_info::suspend becomes thread_info::m_suspend), and add getters /
setters for all of its fields:
- pending wait status
- stop signal
- stop reason
- stop pc
For the pending wait status, add the additional has_pending_waitstatus
and clear_pending_waitstatus methods.
I think this makes the thread_info interface a bit nicer, because we
now access the fields as:
thread->stop_pc ()
rather than
thread->suspend.stop_pc
The stop_pc field being in the `suspend` structure is an implementation
detail of thread_info that callers don't need to be aware of.
For the backup / restore of the thread_suspend_state structure, add
save_suspend_to and restore_suspend_from methods. You might wonder why
`save_suspend_to`, as opposed to a simple getter like
thread_suspend_state &suspend ();
I want to make it clear that this is to be used only for backing up and
restoring the suspend state, _not_ to access fields like:
thread->suspend ()->stop_pc
Adding some getters / setters allows adding some assertions. I find
that this helps understand how things are supposed to work. Add:
- When getting the pending status (pending_waitstatus method), ensure
that there is a pending status.
- When setting a pending status (set_pending_waitstatus method), ensure
there is no pending status.
There is one case I found where this wasn't true - in
remote_target::process_initial_stop_replies - which needed adjustments
to respect that contract. I think it's because
process_initial_stop_replies is kind of (ab)using the
thread_info::suspend::waitstatus to store some statuses temporarily, for
its internal use (statuses it doesn't intent on leaving pending).
process_initial_stop_replies pulls out stop replies received during the
initial connection using target_wait. It always stores the received
event in `evthread->suspend.waitstatus`. But it only sets
waitstatus_pending_p, if it deems the event interesting enough to leave
pending, to be reported to the core:
if (ws.kind != TARGET_WAITKIND_STOPPED
|| ws.value.sig != GDB_SIGNAL_0)
evthread->suspend.waitstatus_pending_p = 1;
It later uses this flag a bit below, to choose which thread to make the
"selected" one:
if (selected == NULL
&& thread->suspend.waitstatus_pending_p)
selected = thread;
And ultimately that's used if the user-visible mode is all-stop, so that
we print the stop for that interesting thread:
/* In all-stop, we only print the status of one thread, and leave
others with their status pending. */
if (!non_stop)
{
thread_info *thread = selected;
if (thread == NULL)
thread = lowest_stopped;
if (thread == NULL)
thread = first;
print_one_stopped_thread (thread);
}
But in any case (all-stop or non-stop), print_one_stopped_thread needs
to access the waitstatus value of these threads that don't have a
pending waitstatus (those that had TARGET_WAITKIND_STOPPED +
GDB_SIGNAL_0). This doesn't work with the assertions I've
put.
So, change the code to only set the thread's wait status if it is an
interesting one that we are going to leave pending. If the thread
stopped due to a non-interesting event (TARGET_WAITKIND_STOPPED +
GDB_SIGNAL_0), don't store it. Adjust print_one_stopped_thread to
understand that if a thread has no pending waitstatus, it's because it
stopped with TARGET_WAITKIND_STOPPED + GDB_SIGNAL_0.
The call to set_last_target_status also uses the pending waitstatus.
However, given that the pending waitstatus for the thread may have been
cleared in print_one_stopped_thread (and that there might not even be a
pending waitstatus in the first place, as explained above), it is no
longer possible to do it at this point. To fix that, move the call to
set_last_target_status in print_one_stopped_thread. I think this will
preserve the existing behavior, because set_last_target_status is
currently using the current thread's wait status. And the current
thread is the last one for which print_one_stopped_thread is called. So
by calling set_last_target_status in print_one_stopped_thread, we'll get
the same result. set_last_target_status will possibly be called
multiple times, but only the last call will matter. It just means
possibly more calls to set_last_target_status, but those are cheap.
Change-Id: Iedab9653238eaf8231abcf0baa20145acc8b77a7
A following patch will want to do things when a thread's resumed state
changes. Make the `resumed` field private (renamed to `m_resumed`) and
add a getter and a setter for it. The following patch in question will
therefore be able to add some code to the setter.
Change-Id: I360c48cc55a036503174313261ce4e757d795319
The threads that need a step-over are currently linked using an
hand-written intrusive doubly-linked list, so that seems a very good
candidate for intrusive_list, convert it.
For this, we have a use case of appending a list to another one (in
start_step_over). Based on the std::list and Boost APIs, add a splice
method. However, only support splicing the other list at the end of the
`this` list, since that's all we need.
Add explicit default assignment operators to
reference_to_pointer_iterator, which are otherwise implicitly deleted.
This is needed because to define thread_step_over_list_safe_iterator, we
wrap reference_to_pointer_iterator inside a basic_safe_iterator, and
basic_safe_iterator needs to be able to copy-assign the wrapped
iterator. The move-assignment operator is therefore not needed, only
the copy-assignment operator is. But for completeness, add both.
Change-Id: I31b2ff67c7b78251314646b31887ef1dfebe510c
Change inferior_list, the global list of inferiors, to use
intrusive_list. I think most other changes are somewhat obvious
fallouts from this change.
There is a small change in behavior in scoped_mock_context. Before this
patch, constructing a scoped_mock_context would replace the whole
inferior list with only the new mock inferior. Tests using two
scoped_mock_contexts therefore needed to manually link the two inferiors
together, as the second scoped_mock_context would bump the first mock
inferior from the thread list. With this patch, a scoped_mock_context
adds its mock inferior to the inferior list on construction, and removes
it on destruction. This means that tests run with mock inferiors in the
inferior list in addition to any pre-existing inferiors (there is always
at least one). There is no possible pid clash problem, since each
scoped mock inferior uses its own process target, and pids are per
process target.
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I7eb6a8f867d4dcf8b8cd2dcffd118f7270756018
GDB currently has several objects that are put in a singly linked list,
by having the object's type have a "next" pointer directly. For
example, struct thread_info and struct inferior. Because these are
simply-linked lists, and we don't keep track of a "tail" pointer, when
we want to append a new element on the list, we need to walk the whole
list to find the current tail. It would be nice to get rid of that
walk. Removing elements from such lists also requires a walk, to find
the "previous" position relative to the element being removed. To
eliminate the need for that walk, we could make those lists
doubly-linked, by adding a "prev" pointer alongside "next". It would be
nice to avoid the boilerplate associated with maintaining such a list
manually, though. That is what the new intrusive_list type addresses.
With an intrusive list, it's also possible to move items out of the
list without destroying them, which is interesting in our case for
example for threads, when we exit them, but can't destroy them
immediately. We currently keep exited threads on the thread list, but
we could change that which would simplify some things.
Note that with std::list, element removal is O(N). I.e., with
std::list, we need to walk the list to find the iterator pointing to
the position to remove. However, we could store a list iterator
inside the object as soon as we put the object in the list, to address
it, because std::list iterators are not invalidated when other
elements are added/removed. However, if you need to put the same
object in more than one list, then std::list<object> doesn't work.
You need to instead use std::list<object *>, which is less efficient
for requiring extra memory allocations. For an example of an object
in multiple lists, see the step_over_next/step_over_prev fields in
thread_info:
/* Step-over chain. A thread is in the step-over queue if these are
non-NULL. If only a single thread is in the chain, then these
fields point to self. */
struct thread_info *step_over_prev = NULL;
struct thread_info *step_over_next = NULL;
The new intrusive_list type gives us the advantages of an intrusive
linked list, while avoiding the boilerplate associated with manually
maintaining it.
intrusive_list's API follows the standard container interface, and thus
std::list's interface. It is based the API of Boost's intrusive list,
here:
https://www.boost.org/doc/libs/1_73_0/doc/html/boost/intrusive/list.html
Our implementation is relatively simple, while Boost's is complicated
and intertwined due to a lot of customization options, which our version
doesn't have.
The easiest way to use an intrusive_list is to make the list's element
type inherit from intrusive_node. This adds a prev/next pointers to
the element type. However, to support putting the same object in more
than one list, intrusive_list supports putting the "node" info as a
field member, so you can have more than one such nodes, one per list.
As a first guinea pig, this patch makes the per-inferior thread list use
intrusive_list using the base class method.
Unlike Boost's implementation, ours is not a circular list. An earlier
version of the patch was circular: the intrusive_list type included an
intrusive_list_node "head". In this design, a node contained pointers
to the previous and next nodes, not the previous and next elements.
This wasn't great for when debugging GDB with GDB, as it was difficult
to get from a pointer to the node to a pointer to the element. With the
design proposed in this patch, nodes contain pointers to the previous
and next elements, making it easy to traverse the list by hand and
inspect each element.
The intrusive_list object contains pointers to the first and last
elements of the list. They are nullptr if the list is empty.
Each element's node contains a pointer to the previous and next
elements. The first element's previous pointer is nullptr and the last
element's next pointer is nullptr. Therefore, if there's a single
element in the list, both its previous and next pointers are nullptr.
To differentiate such an element from an element that is not linked into
a list, the previous and next pointers contain a special value (-1) when
the node is not linked. This is necessary to be able to reliably tell
if a given node is currently linked or not.
A begin() iterator points to the first item in the list. An end()
iterator contains nullptr. This makes iteration until end naturally
work, as advancing past the last element will make the iterator contain
nullptr, making it equal to the end iterator. If the list is empty,
a begin() iterator will contain nullptr from the start, and therefore be
immediately equal to the end.
Iterating on an intrusive_list yields references to objects (e.g.
`thread_info&`). The rest of GDB currently expects iterators and ranges
to yield pointers (e.g. `thread_info*`). To bridge the gap, add the
reference_to_pointer_iterator type. It is used to define
inf_threads_iterator.
Add a Python pretty-printer, to help inspecting intrusive lists when
debugging GDB with GDB. Here's an example of the output:
(top-gdb) p current_inferior_.m_obj.thread_list
$1 = intrusive list of thread_info = {0x61700002c000, 0x617000069080, 0x617000069400, 0x61700006d680, 0x61700006eb80}
It's not possible with current master, but with this patch [1] that I
hope will be merged eventually, it's possible to index the list and
access the pretty-printed value's children:
(top-gdb) p current_inferior_.m_obj.thread_list[1]
$2 = (thread_info *) 0x617000069080
(top-gdb) p current_inferior_.m_obj.thread_list[1].ptid
$3 = {
m_pid = 406499,
m_lwp = 406503,
m_tid = 0
}
Even though iterating the list in C++ yields references, the Python
pretty-printer yields pointers. The reason for this is that the output
of printing the thread list above would be unreadable, IMO, if each
thread_info object was printed in-line, since they contain so much
information. I think it's more useful to print pointers, and let the
user drill down as needed.
[1] https://sourceware.org/pipermail/gdb-patches/2021-April/178050.html
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I3412a14dc77f25876d742dab8f44e0ba7c7586c0
The alias creation functions currently accept a name to specify the
target command. They pass this to add_alias_cmd, which needs to lookup
the target command by name.
Given that:
- We don't support creating an alias for a command before that command
exists.
- We always use add_info_alias just after creating that target command,
and therefore have access to the target command's cmd_list_element.
... change add_com_alias to accept the target command as a
cmd_list_element (other functions are done in subsequent patches). This
ensures we don't create the alias before the target command, because you
need to get the cmd_list_element from somewhere when you call the alias
creation function. And it avoids an unecessary command lookup. So it
seems better to me in every aspect.
gdb/ChangeLog:
* command.h (add_com_alias): Accept target as
cmd_list_element. Update callers.
Change-Id: I24bed7da57221cc77606034de3023fedac015150
gdb/ChangeLog:
* ui-out.h (class ui_out): Add ui_out::text accepting a constant
reference to a std::string. Fix all callers using
std::string::c_str.
* ui-out.c (ui_out::text): Ditto.
Previously, the prefixname field of struct cmd_list_element was manually
set for prefix commands. This seems verbose and error prone as it
required every single call to functions adding prefix commands to
specify the prefix name while the same information can be easily
generated.
Historically, this was not possible as the prefix field was null for
many commands, but this was fixed in commit
3f4d92ebdf by Philippe Waroquiers, so
we can rely on the prefix field being set when generating the prefix
name.
This commit also fixes a use after free in this scenario:
* A command gets created via Python (using the gdb.Command class).
The prefix name member is dynamically allocated.
* An alias to the new command is created. The alias's prefixname is set
to point to the prefixname for the original command with a direct
assignment.
* A new command with the same name as the Python command is created.
* The object for the original Python command gets freed and its
prefixname gets freed as well.
* The alias is updated to point to the new command, but its prefixname
is not updated so it keeps pointing to the freed one.
gdb/ChangeLog:
* command.h (add_prefix_cmd): Remove the prefixname argument as
it can now be generated automatically. Update all callers.
(add_basic_prefix_cmd): Ditto.
(add_show_prefix_cmd): Ditto.
(add_prefix_cmd_suppress_notification): Ditto.
(add_abbrev_prefix_cmd): Ditto.
* cli/cli-decode.c (add_prefix_cmd): Ditto.
(add_basic_prefix_cmd): Ditto.
(add_show_prefix_cmd): Ditto.
(add_prefix_cmd_suppress_notification): Ditto.
(add_prefix_cmd_suppress_notification): Ditto.
(add_abbrev_prefix_cmd): Ditto.
* cli/cli-decode.h (struct cmd_list_element): Replace the
prefixname member variable with a method which generates the
prefix name at runtime. Update all code reading the prefix
name to use the method, and remove all code setting it.
* python/py-cmd.c (cmdpy_destroyer): Remove code to free the
prefixname member as it's now a method.
(cmdpy_function): Determine if the command is a prefix by
looking at prefixlist, not prefixname.