As follow-up to this discussion:
https://sourceware.org/pipermail/gdb-patches/2020-August/171385.html
... make runto_main not pass no-message to runto. This means that if we
fail to run to main, for some reason, we'll emit a FAIL. This is the
behavior we want the majority of (if not all) the time.
Without this, we rely on each test to log a failure when runto_main
fails. Tests do so in a very inconsistent manner, sometimes using
"fail", "unsupported" or "untested". The messages also vary wildly.
This patch removes all these messages as well.
Also, remove a few "fail" where we call runto (and not runto_main). By
default (without an explicit no-message argument), runto prints a
failure already. In two places, gdb.multi/multi-re-run.exp and
gdb.python/py-pp-registration.exp, remove "message" passed to runto.
This removes a few PASSes that we don't care about (but FAILs will still
be printed if we fail to run to where we want to). This aligns their
behavior with the rest of the testsuite.
Change-Id: Ib763c98c5f4fb6898886b635210d7c34bd4b9023
Make gdb_open_cloexec return a scoped_fd, to encourage using automatic
management of the file descriptor closing. Except in the most trivial
cases, I changed the callers to just release the fd, which retains their
existing behavior. That will allow the transition to using scoped_fd
more to go gradually, one caller at a time.
Change-Id: Ife022b403f96e71d5ebb4f1056ef6251b30fe554
The "make thread_suspend_state::stop_pc optional" patch caused a
regression on Windows when using shared libraries. I tracked this
down to an unguarded use of stop_pc() in the TARGET_WAITKIND_LOADED
case of handle_inferior_event. This patch fixes the bug by ensuring
that the stop PC is set at this point.
When running test-case gdb.debuginfod/fetch_src_and_symbols.exp with target
board unix/-bad, I get:
...
gcc: error: unrecognized command line option '-bad'^M
compiler exited with status 1
gdb compile failed, gcc: error: unrecognized command line option '-bad'
FAIL: gdb.debuginfod/fetch_src_and_symbols.exp: compile
...
Replace the FAIL with the usual:
...
UNTESTED: gdb.debuginfod/fetch_src_and_symbols.exp: failed to compile
...
Tested on x86_64-linux.
When running test-case gdb.base/info-os.exp with target board unix/-bad, I run
into:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.base/info-os.exp: failed to prepare
FAIL: gdb.base/info-os.exp: cannot compile test program
...
Remove the redundant FAIL.
Tested on x86_64-linux.
When running test-case gdb.base/info-os.exp, I run into:
...
PASS: gdb.base/info-os.exp: get threads
PASS: gdb.base/info-os.exp: get threads
DUPLICATE: gdb.base/info-os.exp: get threads
...
Fix this by not doing pass followed by exp_continue in gdb_test_multiple.
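The duplicate comes from reporting one PASS per matching line under the same
test name. A minimal Python sketch of that pattern follows. It only models the
shape of the expect loop; the function names and the sample line format are
hypothetical and are not taken from the testsuite.

```python
import re

# Hypothetical stand-in for "a line that describes one thread".
THREAD_LINE = re.compile(r"^\d+\s+\d+\s+\S+")


def pass_per_match(lines, test_name):
    """Buggy shape: report PASS for every matching line, then keep looping."""
    results = []
    for line in lines:
        if THREAD_LINE.match(line):
            # Models "pass" immediately followed by "exp_continue".
            results.append(("PASS", test_name))
    return results


def pass_once(lines, test_name):
    """Fixed shape: remember that a match was seen, report once at the end."""
    seen = any(THREAD_LINE.match(line) for line in lines)
    return [("PASS" if seen else "FAIL", test_name)]


def duplicates(results):
    """Return the test names that were reported more than once."""
    names = [name for _, name in results]
    return sorted({name for name in names if names.count(name) > 1})
```

With two matching lines, pass_per_match yields two results with the same name,
while pass_once yields exactly one.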
Tested on x86_64-linux.
When running test-case gdb.dwarf2/dw2-opt-structptr.exp with target board
unix/-bad, I get:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.dwarf2/dw2-opt-structptr.exp: dw2-opt-structptr.exp
UNTESTED: gdb.dwarf2/dw2-opt-structptr.exp: failed to compile
ERROR: (dw2-opt-structptr) No such file or directory
UNRESOLVED: gdb.dwarf2/dw2-opt-structptr.exp: console: set print object on
...
Merge the two UNTESTEDs.
Fix the UNRESOLVED by checking result of compilation.
Tested on x86_64-linux.
When running test-case gdb.base/structs.exp with target board unix/-bad, I
get:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.base/structs.exp: failed to prepare
ERROR: tcl error sourcing src/gdb/testsuite/gdb.base/structs.exp.
ERROR: can't read "use_gdb_stub": no such variable
...
Fix this by checking the compilation result.
Fix the resulting DUPLICATEs using with_test_prefix.
Tested on x86_64-linux.
When running test-case gdb.base/cvexpr.exp with target board unix/-bad, I get:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
ERROR: tcl error sourcing src/gdb/testsuite/gdb.base/cvexpr.exp.
ERROR: can't read "use_gdb_stub": no such variable
...
This is triggered in a part of the test that claims to require no debug
information, but uses the exec containing either dwarf or ctf.
Fix this by preparing another executable compiled with nodebug, and using
that one instead.
Also use with_test_prefix to mark the nodebug part, such that we have:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.base/cvexpr.exp: dwarf: failed to prepare
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.base/cvexpr.exp: nodebug: failed to prepare
...
Tested on x86_64-linux.
When running test-case gdb.base/call-sc.exp with target board unix/-bad, I
get:
...
gdb compile failed, gcc: error: unrecognized command line option '-bad'
UNTESTED: gdb.base/call-sc.exp: failed to prepare
ERROR: tcl error sourcing src/gdb/testsuite/gdb.base/call-sc.exp.
ERROR: can't read "use_gdb_stub": no such variable
...
Fix this by checking the compilation result.
Fix the resulting DUPLICATE:
...
DUPLICATE: gdb.base/call-sc.exp: failed to prepare
...
using with_test_prefix.
Tested on x86_64-linux.
The effect of:
...
untested "y.exp"
...
in a gdb.x/y.exp is:
...
UNTESTED: gdb.x/y.exp: y.exp
...
which is a bit pointless.
Replace these untested messages in gdb.mi/*.exp with the usual "failed to
compile".
Likewise for an:
...
untested $testname
...
where the variable is undefined.
Tested on x86_64-linux.
On ubuntu 18.04.5, I run into:
...
(gdb) mt print objfiles dwindex^M
^M
Object file build/gdb/testsuite/outputs/gdb.rust/dwindex/dwindex: \
Objfile at 0x55dab0b87a50, bfd at 0x55dab0b0cfa0, 1095 minsyms^M
^M
Psymtabs:^M
vendor/compiler_builtins/src/int/specialized_div_rem/mod.rs at 0x55dab0db0720^M
...
library/std/src/sys/unix/stdio.rs at 0x55dab0d96320^M
ERROR: internal buffer is full.
UNRESOLVED: gdb.rust/dwindex.exp: check if index present
...
Fix this by using -lbl in proc ensure_gdb_index.
Tested on x86_64-linux.
When running test-case gdb.base/break-interp.exp on openSUSE Leap 42.3, I get:
...
(gdb) info addr dl_main^M
Symbol "dl_main" is at 0x1750 in a file compiled without debugging.^M
(gdb) FAIL: gdb.base/break-interp.exp: info addr dl_main
...
while the regexp expects "Symbol \"dl_main\" is a function at address $hex\\."
Fix this by also accepting this variant.
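As an illustration only, here is a Python sketch of a pattern that accepts
both variants. The HEX pattern is an assumption meant to mirror the
testsuite's $hex, and the actual change in the .exp file is written in Tcl and
may be structured differently.

```python
import re

# Assumption: roughly what the testsuite's $hex matches.
HEX = r"0x[0-9A-Fa-f]+"

DL_MAIN_RE = re.compile(
    r'Symbol "dl_main" is (?:'
    r"a function at address " + HEX + r"\."
    r"|at " + HEX + r" in a file compiled without debugging\."
    r")"
)


def matches_info_addr(output):
    """Return True if OUTPUT contains either accepted 'info addr' reply."""
    return DL_MAIN_RE.search(output) is not None
```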
Tested on x86_64-linux.
The gdb.multi/multi-term-settings.exp testcase sometimes fails like so:
Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.multi/multi-term-settings.exp ...
FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT)
It's easier to reproduce if you stress the machine at the same time, like e.g.:
$ stress -c 24
Looking at gdb.log, we see:
(gdb) attach 60422
Attaching to program: build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings, process 60422
[New Thread 60422.60422]
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007f2fc2485334 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffe23126940, rem=rem@entry=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
(gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: inf2: attach
set schedule-multiple on
(gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: set schedule-multiple on
info inferiors
Num Description Connection Executable
1 process 60404 1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings
* 2 process 60422 1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings
(gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: info inferiors
pid=60422, count=46
pid=60422, count=47
pid=60422, count=48
pid=60422, count=49
pid=60422, count=50
pid=60422, count=51
pid=60422, count=52
pid=60422, count=53
pid=60422, count=54
pid=60422, count=55
pid=60422, count=56
pid=60422, count=57
pid=60422, count=58
pid=60422, count=59
pid=60422, count=60
pid=60422, count=61
pid=60422, count=62
pid=60422, count=63
pid=60422, count=64
pid=60422, count=65
pid=60422, count=66
pid=60422, count=67
pid=60422, count=68
pid=60422, count=69
pid=60404, count=54
pid=60404, count=55
pid=60404, count=56
pid=60404, count=57
pid=60404, count=58
PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: continue
Quit
(gdb) FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT)
If you look at the testcase's sources, you'll see that the intention
is to resume the program with "continue", wait to see a few of those
"pid=..., count=..." lines, and then interrupt the program with
Ctrl-C. But somehow, that resulted in GDB printing "Quit", instead of
the Ctrl-C stopping the program with SIGINT.
Here's what is happening:
#1 - those "pid=..., count=..." lines we see above weren't actually
output by the inferior after it has been continued (see #1).
Note that "inf1_how" and "inf2_how" are "attach". What happened
is that those "pid=..., count=..." lines were output by the
inferiors _before_ they were attached to. We see them at that
point instead of earlier, because that's where the testcase
reads from the inferiors' spawn_ids.
#2 - The testcase mistakenly thinks those "pid=..., count=..." lines
happened after the continue was processed by GDB, meaning it has
waited enough, and so sends the Ctrl-C. GDB hasn't yet passed
the terminal to the inferior, so the Ctrl-C results in that
Quit.
The fix here is twofold:
#1 - flush inferior output right after attaching
#2 - consume the "Continuing" printed by "continue", indicating the
inferior has the terminal. This is the same as done throughout
the testsuite to handle this exact problem of sending Ctrl-C too
soon.
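The ordering rule behind the fix can be modelled in a few lines of Python.
This is a sketch, not the Tcl used in the testcase: the function name and the
threshold of three lines are hypothetical, and stale pre-attach output is
modelled simply as lines that appear before "Continuing".

```python
import re

COUNT_LINE = re.compile(r"^pid=\d+, count=\d+$")


def safe_to_interrupt(lines, needed=3):
    """Return the index of the line after which Ctrl-C may be sent.

    Counter lines seen before "Continuing" are treated as stale output
    and ignored. Returns None if it never becomes safe.
    """
    continuing_seen = False
    counted = 0
    for index, line in enumerate(lines):
        if not continuing_seen:
            if line.startswith("Continuing"):
                continuing_seen = True
            continue
        if COUNT_LINE.match(line):
            counted += 1
            if counted >= needed:
                return index
    return None
```

Counting only after "Continuing" means that output produced before the attach
can no longer satisfy the wait.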
gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>
* gdb.multi/multi-term-settings.exp (create_inferior): Flush
inferior output.
(coretest): Use $gdb_test_name. After issuing "continue", wait
for "Continuing".
Change-Id: Iba7671dfe1eee6b98d29cfdb05a1b9aa2f9defb9
I built gdb without xml support using --without-expat, and ran into:
...
(gdb) target remote | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
Remote debugging using | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
relaying data between gdb and process 22032^M
warning: Can not parse XML target description; XML support was disabled at \
compile time^M
...
(gdb) PASS: gdb.base/valgrind-infcall.exp: continue #1
p gdb_test_infcall ()^M
Remote 'g' packet reply is too long (expected 560 bytes, got 800 bytes): ...^M
(gdb) FAIL: gdb.base/valgrind-infcall.exp: p gdb_test_infcall ()
...
After googling the error message with context valgrind gdbserver, I found
indications that the Remote 'g' packet reply error is due to missing xml
support.
And here ( https://www.valgrind.org/docs/manual/manual-core-adv.html ) I
found:
...
GDB version needed for ARM and PPC32/64.
You must use a GDB version which is able to read XML target description sent
by a gdbserver. This is the standard setup if GDB was configured and built
with the "expat" library. If your GDB was not configured with XML support, it
will report an error message when using the "target" command. Debugging will
not work because GDB will then not be able to fetch the registers from the
Valgrind gdbserver.
...
So I guess I'm running into the same problem for x86_64.
Fix this by skipping all gdb.base/valgrind-*.exp tests if xml support is not
available. Although only the gdb.base/valgrind-infcall*.exp produce fails,
the Remote 'g' packet reply error occurs in all tests, so it seems prudent to
disable them all.
Tested on x86_64-linux.
With a gdb build using python 2.7, I run into:
...
(gdb) python \
gdb.events.breakpoint_modified.connect(lambda bp: print(bp.enabled))^M
File "<string>", line 1^M
gdb.events.breakpoint_modified.connect(lambda bp: print(bp.enabled))^M
^^M
SyntaxError: invalid syntax^M
Error while executing Python code.^M
(gdb) FAIL: gdb.python/py-breakpoint.exp: test_bkpt_auto_disable: \
trap breakpoint_modified event
...
This is caused by the following:
- a lambda function body needs to be an expression
- in python 2, print is a statement, while in python 3 it's a function
- a function call is an expression, and a statement is not.
Fix this by defining a function print_bp_enabled:
...
def print_bp_enabled (bp):
print (bp.enabled)
end
...
and using that instead.
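The "lambda body must be an expression" rule can be checked directly. The
sketch below runs under Python 3, where print is already a function, so it
uses assert (a statement in Python 3) to stand in for the Python 2 print
statement. The helper name is_valid_lambda is hypothetical.

```python
def is_valid_lambda(source):
    """Return True if SOURCE compiles as an expression in this Python."""
    try:
        compile(source, "<string>", "eval")
    except SyntaxError:
        return False
    return True


def print_bp_enabled(bp):
    """Named function: its body may contain statements in any Python."""
    print(bp.enabled)
```

A lambda whose body is a statement is rejected at compile time, while a named
function such as print_bp_enabled can be passed as the callback instead.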
Tested on x86_64-linux.
With a gdb configured to be somewhat minimal, while still supporting python:
...
$ gdb --configuration
This GDB was configured as follows:
configure --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-auto-load-dir=$debugdir:$datadir/auto-load
--with-auto-load-safe-path=$debugdir:$datadir/auto-load
--without-expat
--with-gdb-datadir=$install/share/gdb (relocatable)
--with-jit-reader-dir=$install/lib64/gdb (relocatable)
--without-libunwind-ia64
--without-lzma
--without-babeltrace
--without-intel-pt
--with-mpfr
--without-xxhash
--with-python=/usr
--with-python-libdir=/usr/lib
--with-debuginfod
--without-guile
--disable-source-highlight
--with-separate-debug-dir=/usr/lib/debug
--with-system-gdbinit=$devel/system-gdbinit
...
and using gcc 4.8 to build gdb (causing std::thread not to be used due to
PR28318) I ran into:
...
(gdb) PASS: gdb.gdb/python-helper.exp: start inner gdb
print 1^M
^M
Breakpoint 2, value_print () at src/gdb/valprint.c:1174^M
1174 scoped_value_mark free_values;^M
(xgdb) FAIL: gdb.gdb/python-helper.exp: hit breakpoint in inner gdb (timeout)
...
The problem is that the regexp expects "hit Breakpoint $decimal". The "hit"
part is missing.
The "hit" is printed by maybe_print_thread_hit_breakpoint, when
show_thread_that_caused_stop returns true:
...
int
show_thread_that_caused_stop (void)
{
return highest_thread_num > 1;
}
...
Apparently, that's not the case.
Fix this by removing "hit" from the regexp, making the regexp more similar to
what is used in say, continue_to_breakpoint.
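To illustrate why the stricter pattern times out, here is a Python model. The
stop_line function is a hypothetical stand-in for the stop banner and only
mirrors the highest_thread_num > 1 condition quoted above. The thread prefix
text is illustrative, and the patterns approximate "$decimal" with \d+.

```python
import re

STRICT = re.compile(r"hit Breakpoint \d+, ")
LOOSE = re.compile(r"Breakpoint \d+, ")


def stop_line(highest_thread_num, bp_num, location):
    """Model of the banner: the "hit" form only appears with several threads."""
    if highest_thread_num > 1:
        return 'Thread 1 "xgdb" hit Breakpoint %d, %s' % (bp_num, location)
    return "Breakpoint %d, %s" % (bp_num, location)
```

The loose pattern matches both forms, so it does not depend on whether the
inner gdb has started any additional threads.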
Tested on x86_64-linux.
In this commit:
commit abbbd4a3e0
Date: Wed Aug 11 13:24:33 2021 +0100
gdb: use libbacktrace to create a better backtrace for fatal signals
The build of GDB was broken iff the execinfo backtrace API is not
available and libbacktrace is either disabled or not usable. In
this case you'll see build errors like this:
CXX bt-utils.o
/home/username/src/binutils-gdb/gdb/bt-utils.c: In function 'void gdb_internal_backtrace()':
/home/username/src/binutils-gdb/gdb/bt-utils.c:165:5: error: 'gdb_internal_backtrace_1' was not declared in this scope
gdb_internal_backtrace_1 ();
^~~~~~~~~~~~~~~~~~~~~~~~
This commit fixes the issue by guarding the call to
gdb_internal_backtrace_1 with '#ifdef GDB_PRINT_INTERNAL_BACKTRACE',
which is only defined when one of the backtrace libraries is
available.
With this commit:
commit 91f2597bd2
Date: Thu Aug 12 18:24:59 2021 +0100
gdb: print backtrace for internal error/warning
I included some references to 'stderr', which, it was pointed out,
would be better written as 'standard error stream'. See:
https://sourceware.org/pipermail/gdb-patches/2021-September/182225.html
This commit replaces the two instances of 'stderr' that I introduced.
In a recent commit I used 'manor' in some comments rather than
'manner'. This commit fixes those two mistakes.
I also looked through the gdb/ tree and found one additional instance
of this mistake that this commit also fixes.
The following scenario hangs:
- maint set target-non-stop on
- `gdbserver --attach`
- a multi-threaded program
For example:
Terminal 1:
$ gnome-calculator&
[1] 495731
$ ../gdbserver/gdbserver --once --attach :1234 495731
Attached; pid = 495731
Listening on port 1234
Terminal 2:
$ ./gdb -nx -q --data-directory=data-directory /usr/bin/gnome-calculator -ex "maint set target-non-stop on" -ex "tar rem :1234"
Reading symbols from /usr/bin/gnome-calculator...
(No debugging symbols found in /usr/bin/gnome-calculator)
Remote debugging using :1234
* hangs *
What happens is:
- The protocol between gdb and gdbserver is in non-stop mode, but the
user-visible behavior is all-stop
- On connect, gdbserver sends one stop reply for one thread that is
stopped, the others stay running
- In process_initial_stop_replies, gdb calls stop_all_threads to stop
these other threads, because we are using the all-stop user-visible
mode
- stop_all_threads sends a stop request for all the running threads and
then waits for resulting events
- At this point, the remote target is in target_async(0) mode, which
makes stop_all_threads not consider it for events
- stop_all_threads loops indefinitely (it does not even block
indefinitely, it is in an infinite busy loop) because there are no
event sources. wait_one_event returns a TARGET_WAITKIND_NO_RESUMED
wait status.
Fix that by making the remote target async around the stop_all_threads
call.
I haven't implemented it because I'm not sure how to do it, but I think
it would be a good idea to have, in stop_all_threads / wait_one /
handle_one, an assert to check that if we are expecting one or more
events, then there are some targets that are in a state where they can
supply some events. Otherwise, we'll necessarily be stuck in this
infinite loop, and it's probably due to a bug in GDB. I'm not too sure
where to put this or how to express it though. Perhaps in
stop_all_threads, here:
for (int i = 0; i < waits_needed; i++)
{
wait_one_event event = wait_one ();
*here*
if (handle_one (event))
break;
}
If at that point, the returned event is TARGET_WAITKIND_NO_RESUMED,
there's a problem. We expect some event, because we've asked some
threads to stop, but all targets are answering that they won't have any
events for us. That's a contradiction, and a sign that something has
gone wrong. It could perhaps even be:
gdb_assert (event.ws.kind != TARGET_WAITKIND_NO_RESUMED);
in handle_one, as the idea is the same in prepare_for_detach.
A bit more sophisticated would be: we know which targets we are
expecting waits from, since we know which threads we have asked to
stop. So if any of these targets returns TARGET_WAITKIND_NO_RESUMED,
something is fishy.
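A toy Python model of the proposed check is sketched below. Everything in it
(the Target class, the wait_one and stop_all_threads functions, the WaitKind
enum) is a hypothetical simplification that borrows GDB's names for
readability; it is not GDB code and it ignores everything except the async
flag and the NO_RESUMED result.

```python
from enum import Enum


class WaitKind(Enum):
    STOPPED = "stopped"
    NO_RESUMED = "no_resumed"


class Target:
    def __init__(self, name, is_async, pending):
        self.name = name
        self.is_async = is_async
        self.pending = list(pending)  # threads with a stop event queued


def wait_one(targets):
    """Only async targets are considered as event sources."""
    for target in targets:
        if target.is_async and target.pending:
            return WaitKind.STOPPED, target.pending.pop(0)
    return WaitKind.NO_RESUMED, None


def stop_all_threads(targets, waits_needed):
    """Collect WAITS_NEEDED stop events, failing loudly instead of spinning."""
    stopped = []
    for _ in range(waits_needed):
        kind, thread = wait_one(targets)
        if kind is WaitKind.NO_RESUMED:
            # The contradiction described above: stops were requested,
            # yet no target can report any event.
            raise AssertionError("expected stop events, none can arrive")
        stopped.append(thread)
    return stopped
```

In this model, a non-async target with pending stops trips the assertion
right away, and the same target made async delivers both events.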
Add a test that tests attaching with gdbserver's --attach flag to a
multi-threaded program, and then connecting to it. Without the fix, the
test reproduces the hang.
Change-Id: If6f6690a4887ca66693ef1af64791dda4c65f24f
There are two errors of this kind:
CXX darwin-nat.o
/Users/smarchi/src/binutils-gdb/gdb/darwin-nat.c:1175:19: error: format specifies type 'unsigned long' but the argument has type 'ULONGEST' (aka 'unsigned long long') [-Werror,-Wformat]
ptid.pid (), ptid.tid ());
^~~~~~~~~~~
Fix them by using ptid_t's to_string method.
Change-Id: I52087d5f7ee0fc01ac8b3f87d4db0217cb0d7cc7
The test currently requires the "inf 1" breakpoint to be before the "inf
2" breakpoint. This is not always the case:
info breakpoints 2
Num Type Disp Enb Address What
2 breakpoint keep y <MULTIPLE>
2.1 y 0x0000555555554730 in callee at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/foll-fork.c:9 inf 2
2.2 y 0x0000555555554730 in callee at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/foll-fork.c:9 inf 1
(gdb) FAIL: gdb.base/foll-fork.exp: follow-fork-mode=parent: detach-on-fork=off: cmd=next 2: test_follow_fork: info breakpoints
Since add_location_to_breakpoint uses only the address as a criterion to
sort locations, the order of locations at the same address is not
stable: it will depend on the insertion order. Here, the insertion
order comes from the order of SALs when creating the breakpoint, which
can vary from machine to machine. While it would be more user-friendly
to have a more stable order for printed breakpoint locations, it doesn't
really matter for this test, and it would be hard to define an order
that will be the same everywhere, all the time.
So, loosen the regexp to accept "inf 1" and "inf 2" in any order.
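For illustration, an order-insensitive pattern can be built by alternating
the two possible orders. This Python sketch is not the Tcl from the test; the
loc helper and the exact column layout it matches are assumptions based on
the output shown above.

```python
import re


def loc(inf):
    """Pattern for one location line of breakpoint 2 in inferior INF."""
    return (
        r"2\.\d+\s+y\s+0x[0-9A-Fa-f]+\s+in callee at \S+:\d+ inf " + str(inf)
    )


EITHER_ORDER = re.compile(
    "(?:"
    + loc(1) + r"\s+" + loc(2)
    + "|"
    + loc(2) + r"\s+" + loc(1)
    + ")"
)


def has_both_locations(output):
    """Return True if OUTPUT lists an "inf 1" and an "inf 2" location."""
    return EITHER_ORDER.search(output) is not None
```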
Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: I5ada2e0c6ad0669e0d161bfb6b767229c0970d16
This commit builds on previous work to allow GDB to print a backtrace
of itself when GDB encounters an internal-error or internal-warning.
This fixes PR gdb/26377.
There are not many places where we call internal_warning, and I guess in
most cases the user would probably continue their debug session. And
so, in order to avoid cluttering up the output, by default, printing
of a backtrace is off for internal-warnings.
In contrast, printing of a backtrace is on by default for
internal-errors, as I figure that in most cases hitting an
internal-error is going to be the end of the debug session.
Whether a backtrace is printed or not can be controlled with the new
settings:
maintenance set internal-error backtrace on|off
maintenance show internal-error backtrace
maintenance set internal-warning backtrace on|off
maintenance show internal-warning backtrace
Here is an example of what an internal-error now looks like with the
backtrace included:
(gdb) maintenance internal-error blah
../../src.dev-3/gdb/maint.c:82: internal-error: blah
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x5c61ca gdb_internal_backtrace_1
../../src.dev-3/gdb/bt-utils.c:123
0x5c626d _Z22gdb_internal_backtracev
../../src.dev-3/gdb/bt-utils.c:165
0xe33237 internal_vproblem
../../src.dev-3/gdb/utils.c:393
0xe33539 _Z15internal_verrorPKciS0_P13__va_list_tag
../../src.dev-3/gdb/utils.c:470
0x1549652 _Z14internal_errorPKciS0_z
../../src.dev-3/gdbsupport/errors.cc:55
0x9c7982 maintenance_internal_error
../../src.dev-3/gdb/maint.c:82
0x636f57 do_simple_func
../../src.dev-3/gdb/cli/cli-decode.c:97
.... snip, lots more backtrace lines ....
---------------------
../../src.dev-3/gdb/maint.c:82: internal-error: blah
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y
This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.
../../src.dev-3/gdb/maint.c:82: internal-error: blah
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
My hope is that this backtrace might make it slightly easier to
diagnose GDB issues if all that is provided is the console output. I
find that we frequently get reports of an assert being hit that is
located in pretty generic code (frame.c, value.c, etc) and it is not
always obvious how we might have arrived at the assert.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26377
GDB recently gained the ability to print a backtrace when a fatal
signal is encountered. This backtrace is produced using the backtrace
and backtrace_symbols_fd API available in glibc.
However, in order for this API to actually map addresses to symbol
names it is required that the application (GDB) be compiled with
-rdynamic, which GDB is not by default.
As a result, the backtrace produced often looks like this:
Fatal signal: Bus error
----- Backtrace -----
./gdb/gdb[0x80ec00]
./gdb/gdb[0x80ed56]
/lib64/libc.so.6(+0x3c6b0)[0x7fc2ce1936b0]
/lib64/libc.so.6(__poll+0x4f)[0x7fc2ce24da5f]
./gdb/gdb[0x15495ba]
./gdb/gdb[0x15489b8]
./gdb/gdb[0x9b794d]
./gdb/gdb[0x9b7a6d]
./gdb/gdb[0x9b943b]
./gdb/gdb[0x9b94a1]
./gdb/gdb[0x4175dd]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7fc2ce17e1a3]
./gdb/gdb[0x4174de]
---------------------
This is OK if you have access to the exact same build of GDB, as you can
manually map the addresses back to symbols. However, it is next to
useless if all you have is a backtrace copied into a bug report.
GCC uses libbacktrace for printing a backtrace when it encounters an
error. In recent commits I added this library into the binutils-gdb
repository, and in this commit I allow this library to be used by
GDB. Now (when GDB is compiled with debug information) the backtrace
looks like this:
----- Backtrace -----
0x80ee08 gdb_internal_backtrace
../../src/gdb/event-top.c:989
0x80ef0b handle_fatal_signal
../../src/gdb/event-top.c:1036
0x7f24539dd6af ???
0x7f2453a97a5f ???
0x154976f gdb_wait_for_event
../../src/gdbsupport/event-loop.cc:613
0x1548b6d _Z16gdb_do_one_eventv
../../src/gdbsupport/event-loop.cc:237
0x9b7b02 start_event_loop
../../src/gdb/main.c:421
0x9b7c22 captured_command_loop
../../src/gdb/main.c:481
0x9b95f0 captured_main
../../src/gdb/main.c:1353
0x9b9656 _Z8gdb_mainP18captured_main_args
../../src/gdb/main.c:1368
0x4175ec main
../../src/gdb/gdb.c:32
---------------------
Which seems much more useful.
Use of libbacktrace is optional. If GDB is configured with
--disable-libbacktrace then the libbacktrace directory will not be
built, and GDB will not try to use this library. In this case GDB
would try to use the old backtrace and backtrace_symbols_fd API.
All of the functions related to writing the backtrace of GDB itself
have been moved into the new files gdb/bt-utils.{c,h}.
Replace the manually maintained linked list of lwp_info objects with
intrusive_list. Replace the ALL_LWPS macro with all_lwps, which returns
a range. Add all_lwps_safe as well, for use in iterate_over_lwps, which
currently iterates in a safe manner.
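As a rough illustration of what "safe" iteration means here, the Python sketch
below models a doubly linked list whose safe iterator fetches the next node
before handing out the current one, so the caller may erase the current node.
The class and method names are hypothetical; this is one common way to build
such an iterator, not a description of how intrusive_list implements it.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self):
        self.front = None
        self.back = None

    def push_back(self, node):
        node.prev = self.back
        node.next = None
        if self.back is None:
            self.front = node
        else:
            self.back.next = node
        self.back = node

    def erase(self, node):
        if node.prev is None:
            self.front = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.back = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None  # erased nodes lose their links

    def iter_plain(self):
        node = self.front
        while node is not None:
            yield node
            node = node.next  # broken if the caller erased NODE

    def iter_safe(self):
        node = self.front
        while node is not None:
            following = node.next  # saved before the caller can erase NODE
            yield node
            node = following
```

Erasing every node while using iter_plain stops after the first element,
whereas iter_safe visits all of them.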
Change-Id: I355313502510acc0103f5eaf2fbde80897d6376c
Replace the lwp_free function with a destructor. Make lwp_info
non-copyable, since there is now a destructor (we wouldn't want an
lwp_info object getting copied and this->arch_private getting deleted
twice).
Change-Id: I09fcbe967e362566d3a06fed2abca2a9955570fa
Initialize all fields in the class declaration directly. This opens the
door to using intrusive_list, done in the following patch.
Change-Id: I38bb27410cd9ebf511d310bb86fe2ea1872c3b05
We found that when handling forks, two inferiors can unexpectedly share
their program space and address space. To reproduce:
1. Using a test program that forks...
2. "set follow-fork-mode child"
3. "set detach-on-fork on" (the default)
4. run to a breakpoint somewhere after the fork
Step 4 should have created a new inferior:
(gdb) info inferiors
Num Description Connection Executable
1 <null> /home/smarchi/build/wt/amd/gdb/fork
* 2 process 251425 1 (native) /home/smarchi/build/wt/amd/gdb/fork
By inspecting the state of GDB, we can see that the two inferiors now
share one program space and one address space:
Inferior 1:
(top-gdb) p inferior_list.m_front.num
$2 = 1
(top-gdb) p inferior_list.m_front.aspace
$3 = (struct address_space *) 0x5595e2520400
(top-gdb) p inferior_list.m_front.pspace
$4 = (struct program_space *) 0x5595e2520440
Inferior 2:
(top-gdb) p inferior_list.m_front.next.num
$5 = 2
(top-gdb) p inferior_list.m_front.next.aspace
$6 = (struct address_space *) 0x5595e2520400
(top-gdb) p inferior_list.m_front.next.pspace
$7 = (struct program_space *) 0x5595e2520440
You can then run inferior 1 again and the two inferiors will still
erroneously share their spaces, but already at this point this is wrong.
The cause of the bad {a,p}space sharing is in follow_fork_inferior.
When following the child and detaching from the parent, we just re-use
the parent's spaces, rather than cloning them. When we switch back to
inferior 1 and run again, we find ourselves with two unrelated inferiors
sharing spaces.
Fix that by creating new spaces for the parent after having moved them
to the child. My initial implementation created new spaces for the
child instead. Doing this breaks doing "next" over fork(). When "next"
starts, we record the symtab of the starting location. When the program
stops, we compare that symtab with the symtab the program has stopped
at. If the symtab or the line number has changed, we conclude the
"next" is done. If we create a new program space for the child and copy
the parent's program space to it with clone_program_space, it creates
new symtabs for the child as well. When the child stops, but still on
the fork() line, GDB thinks the "next" is done because the symtab
pointers no longer match. In reality they are two symtab instances that
represent the same file. By moving the spaces to the child and
creating new spaces for the parent, we avoid this problem.
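The pointer-identity issue can be shown with a small Python model. The
classes and the next_is_done function are hypothetical simplifications that
borrow GDB's names; object identity ("is") plays the role of the symtab
pointer comparison.

```python
class Symtab:
    def __init__(self, filename):
        self.filename = filename


class ProgramSpace:
    def __init__(self, filenames):
        self.symtabs = {name: Symtab(name) for name in filenames}

    def clone(self):
        """Build fresh Symtab objects for the same files."""
        return ProgramSpace(list(self.symtabs))


def next_is_done(start_symtab, start_line, stop_symtab, stop_line):
    """A "next" ends when the symtab object or the line number changed."""
    return stop_symtab is not start_symtab or stop_line != start_line
```

If the child receives a cloned space, a stop on the same line of the same
file still looks like a different location. If the child receives the
parent's original space, the comparison behaves as intended.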
Note that the problem described above happens today with "detach-on-fork
off" and "follow-fork-mode child", because we create new spaces for the
child. This will have to be addressed later.
Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is
expected to have a location in each inferior. Without the fix, when
the two inferiors erroneously share a program space, GDB reports a
single location.
Change-Id: Ifea76e14f87b9f7321fc3a766217061190e71c6e
Rename the variables / parameters used to match the corresponding GDB
setting name, I find that easier to follow.
Change-Id: Idcbddbbb369279fcf1e808b11a8c478f21b2a946
This test is difficult to follow and modify because the state of GDB is
preserved across some tests. Add a setup proc, which starts a new GDB and runs
to main, and use it in all test procs. Use proc_with_prefix to avoid
duplicates.
The check_fork_catchpoints proc also seems used to check for follow-fork
support by checking if catchpoints are supported. If they are not, it
uses "return -code return", which makes its caller return. I find this
unnecessarily complex, versus just returning a boolean. Modify it to do
so.
Change-Id: I23e62b204286c5e9c5c86d2727f7d33fb126ed08
It looks like this test has some code to check at runtime the support of
fork handling of the target (see check_fork_catchpoints). So, it seems
to me that the check based on target triplet at the beginning of the
test is not needed. This kind of gating is generally not desirable,
because we wouldn't think of updating it when adding fork support to a
target. For example, FreeBSD supports fork, but it wasn't listed here.
Change-Id: I6b55f2298edae6b37c3681fb8633d8ea1b5aabee
Remove DUPLICATEs, and at the same time replace two uses of
gdb_test_multiple with gdb_test. I don't think using gdb_test_multiple
is necessary here.
Change-Id: I8dcf097c3364e92d4f0e11f0c0f05dbb88e86742
When building with g++-4.8, we run into:
...
src/gdb/dwarf2/read.c:919:5: error: multiple fields in union \
'partial_die_info::<anonymous union>' initialized
...
This is due to:
...
union
{
struct
{
CORE_ADDR lowpc = 0;
CORE_ADDR highpc = 0;
};
ULONGEST ranges_offset;
};
...
The error looks incorrect, given that only one union member is initialized,
and does not reproduce with newer g++.
Nevertheless, work around this by moving the initialization to a constructor.
[ I considered just removing the initialization, with the idea that access
should be guarded by has_pc_info, but I ran into one failure in the testsuite,
for gdb.base/check-psymtab.exp due to add_partial_symbol using lowpc without
checking has_pc_info. ]
Tested on x86_64-linux.
In some situations it is possible that a user might not want GDB to
try and access source code files, for example, the source code might
be stored on a slow to access network file system.
It is almost certainly possible that using some combination of 'set
directories' and/or 'set substitute-path' a user can trick GDB into
being unable to find the source files, but this feels like a rather
crude way to solve the problem.
In this commit a new option is added that stops GDB from opening and
reading the source files. A user can run with source code reading
disabled if this is required, then re-enable later if they decide
that they now want to view the source code.
For some reason we have two locations where cmd_list_elements are
declared, cli/cli-cmds.h and gdbcmd.h. Worse still there is
duplication between these two locations.
In this commit I have moved all of the cmd_list_element declarations
from gdbcmd.h into cli/cli-cmds.h and removed the duplicates.
There should be no user visible changes after this commit.
I ran into this assertion while GDB was trying to unwind the stack:
gdb/inline-frame.c:173: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `frame_id_p (*this_id)' failed.
That is, when building the frame_id for an inline frame, GDB asks for
the frame_id of the previous frame. Unfortunately, no valid frame_id
was returned for the previous frame, and so the assertion triggers.
What is happening is this: I had a stack that looked something like
this (the arrows '->' point from caller to callee):
normal_frame -> inline_frame
However, for whatever reason (e.g. broken debug information, or
corrupted stack contents in the inferior), when GDB tries to unwind
"normal_frame", it ends up getting back effectively the same frame,
thus the call stack looks like this to GDB:
  .-> normal_frame -> inline_frame
  |     |
  '-----'
Given such a situation we would expect GDB to terminate the stack with
an error like this:
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
However, the inline_frame causes a problem, and here's why:
When unwinding we start from the sentinel frame and call
get_prev_frame. We eventually end up in get_prev_frame_if_no_cycle,
in here we create a raw frame, and as this is frame #0 we immediately
return.
However, eventually we will try to unwind the stack further. When we
do this we inevitably need to know the frame_id for frame #0, and
so, eventually, we end up in compute_frame_id.
In compute_frame_id we first find the right unwinder for this frame,
in our case (i.e. for inline_frame) the $pc is within the function
normal_frame, but also within a block associated with the inlined
function inline_frame, as such the inline frame unwinder claims this
frame.
Back in compute_frame_id we next compute the frame_id, for our
inline_frame this means a call to inline_frame_this_id.
The ID of an inline frame is based on the id of the previous frame, so
from inline_frame_this_id we call get_prev_frame_always, this
eventually calls get_prev_frame_if_no_cycle again, which creates
another raw frame and calls compute_frame_id (for frames other than
frame 0 we immediately compute the frame_id).
In compute_frame_id we again identify the correct unwinder for this
frame. Our $pc is unchanged, however, the fact that the next frame is
of type INLINE_FRAME prevents the inline frame unwinder from claiming
this frame again, and so, the standard DWARF frame unwinder claims
normal_frame.
We return to compute_frame_id and call the standard DWARF function to
build the frame_id for normal_frame.
With the frame_id of normal_frame figured out we return to
compute_frame_id, and then to get_prev_frame_if_no_cycle, where we add
the ID for normal_frame into the frame_id cache, and return the frame
back to inline_frame_this_id.
From inline_frame_this_id we build a frame_id for inline_frame and
return to compute_frame_id, and then to get_prev_frame_if_no_cycle,
which adds the frame_id for inline_frame into the frame_id cache.
So far, so good.
However, as we are trying to unwind the complete stack, we eventually
ask for the previous frame of normal_frame. Remember, at this point
GDB doesn't know the stack is corrupted (with a cycle); GDB still
needs to figure that out.
So, we eventually end up in get_prev_frame_if_no_cycle where we create
a raw frame and call compute_frame_id, remember, this is for the frame
before normal_frame.
The first task for compute_frame_id is to find the unwinder for this
frame, so all of the frame sniffers are tried in order, this includes
the inline frame sniffer.
The inline frame sniffer asks for the $pc, this request is sent up the
stack to normal_frame, which, due to its cyclic behaviour, tells GDB
that the $pc in the previous frame was the same as the $pc in
normal_frame.
GDB spots that this $pc corresponds to both the function normal_frame
and also the inline function inline_frame. As the next frame is not
an INLINE_FRAME, GDB figures that we have not yet built a frame to
cover inline_frame, and so the inline sniffer claims this new frame.
Our stack is now looking like this:
inline_frame -> normal_frame -> inline_frame
But, we have not yet computed the frame id for the outermost (on the
left) inline_frame. After the frame sniffer has claimed the inline
frame GDB returns to compute_frame_id and calls inline_frame_this_id.
In here GDB calls get_prev_frame_always, which eventually ends up
in get_prev_frame_if_no_cycle again, where we create a raw frame and
call compute_frame_id.
Just like before, compute_frame_id tries to find an unwinder for this
new frame, it sees that the $pc is within both normal_frame and
inline_frame, but the next frame is, again, an INLINE_FRAME, so, just
like before the standard DWARF unwinder claims this frame. Back in
compute_frame_id we again call the standard DWARF function to build
the frame_id for this new copy of normal_frame.
At this point the stack looks like this:
normal_frame -> inline_frame -> normal_frame -> inline_frame
After compute_frame_id we return to get_prev_frame_if_no_cycle, where
we try to add the frame_id for the new normal_frame into the frame_id
cache. However, unlike before, we fail to add this frame_id as it is
a duplicate of the previous normal_frame frame_id. Having found a
duplicate, get_prev_frame_if_no_cycle unlinks the new frame from the
stack and returns nullptr. The stack now looks like this:
inline_frame -> normal_frame -> inline_frame
The nullptr result from get_prev_frame_if_no_cycle is fed back to
inline_frame_this_id, which forwards this to get_frame_id, which
immediately returns null_frame_id. As null_frame_id is not considered
a valid frame_id, this is what triggers the assertion.
In summary then:
- inline_frame_this_id currently assumes that as the inline frame
exists, we will always get a valid frame back from
get_prev_frame_always,
- get_prev_frame_if_no_cycle currently assumes that it is safe to
return nullptr when it sees a cycle.
Notice that in frame.c:compute_frame_id, this code:
fi->this_id.value = outer_frame_id;
fi->unwind->this_id (fi, &fi->prologue_cache, &fi->this_id.value);
gdb_assert (frame_id_p (fi->this_id.value));
The assertion makes it clear that the this_id function must always
return a valid frame_id (e.g. null_frame_id is not a valid return
value), and similarly in inline-frame.c:inline_frame_this_id this
code:
*this_id = get_frame_id (get_prev_frame_always (this_frame));
/* snip comment */
gdb_assert (frame_id_p (*this_id));
Makes it clear that every inline frame expects to be able to get a
previous frame, which will have a valid frame_id.
As I have discussed above, these assumptions don't currently hold in
all cases.
One possibility would be to move the call to get_prev_frame_always
forward from inline_frame_this_id to inline_frame_sniffer, however,
this falls foul of (in frame.c:frame_cleanup_after_sniffer) this
assertion:
/* No sniffer should extend the frame chain; sniff based on what is
already certain. */
gdb_assert (!frame->prev_p);
This assert prohibits any sniffer from trying to get the previous
frame. As getting the previous frame is likely to depend on the next
frame, I can understand why this assertion is a good thing, and I'm in
no rush to alter this rule.
The solution proposed here takes on board feedback from both Pedro and
Simon (see the links below). The get_prev_frame_if_no_cycle function
is renamed to get_prev_frame_maybe_check_cycle, and will now not do
cycle detection for inline frames; even when we spot a duplicate frame,
it is still returned. This is fine, as, if the normal frame has a
duplicate frame-id then the inline frame will also have a duplicate
frame-id. And so, when we reject the inline frame, the duplicate
normal frame, which is previous to the inline frame, will also be
rejected.
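To make the argument above concrete, here is a toy model of the new
behaviour. It is only a sketch: the toy_* names, the integer ids and the
std::set used as a frame-id stash are all assumptions made for the
example, not GDB's real data structures.

```cpp
#include <cassert>
#include <set>

// Toy stand-ins; none of these are GDB's real types.
enum class toy_frame_type { normal, inlined };

struct toy_frame
{
  toy_frame_type type;
  int id;  // Stand-in for a frame_id; equal ids mean a duplicate frame.
};

// Toy version of the renamed function.  PREV_FRAME is the freshly
// unwound frame previous to THIS_FRAME, STASH holds the ids seen so
// far.  Returns nullptr only when a cycle is detected.
const toy_frame *
toy_get_prev_frame_maybe_check_cycle (const toy_frame &this_frame,
                                      const toy_frame &prev_frame,
                                      std::set<int> &stash)
{
  bool newly_stashed = stash.insert (prev_frame.id).second;

  // Skip cycle detection when THIS_FRAME is an inline frame.
  bool check_cycle = this_frame.type != toy_frame_type::inlined;

  if (check_cycle && !newly_stashed)
    return nullptr;
  return &prev_frame;
}

// Replay the cyclic stack from the description above.
void
toy_replay_cyclic_stack ()
{
  toy_frame normal {toy_frame_type::normal, 1};
  toy_frame inlined {toy_frame_type::inlined, 2};
  toy_frame dup_normal = normal;
  toy_frame dup_inlined = inlined;
  std::set<int> stash {inlined.id};  // Frame #0 is already known.

  // Previous of inline_frame is normal_frame: a new id, accepted.
  assert (toy_get_prev_frame_maybe_check_cycle (inlined, normal, stash)
          == &normal);
  // Duplicate normal_frame previous to the new inline frame: no cycle
  // check is done, so it is still returned.
  assert (toy_get_prev_frame_maybe_check_cycle (dup_inlined, dup_normal,
                                                stash)
          == &dup_normal);
  // The new inline frame is itself a duplicate, and THIS_FRAME is a
  // normal frame, so the cycle is detected here.
  assert (toy_get_prev_frame_maybe_check_cycle (normal, dup_inlined, stash)
          == nullptr);
}
```

In this model the inline frame always gets a usable previous frame, and
the cycle is still caught one step later, when the duplicate inline frame
is rejected.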
In inline-frame.c the call to get_prev_frame_always is no longer
nested inside the call to get_frame_id. There are reasons why
get_prev_frame_always can return nullptr, for example, if there is a
memory error while trying to get the previous frame, if this should
happen then we now give a more informative error message.
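The un-nested call can be pictured roughly as follows. This is a sketch
under stated assumptions: toy_unwound_frame, the integer ids, the way the
id is derived and the error text are all made up for the example and are
not what GDB actually uses.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for a frame; the id stands in for a frame_id.
struct toy_unwound_frame
{
  int id;
};

// Toy version of an inline frame's this_id function.  PREV_FRAME is
// the result of asking for the previous frame, which may be nullptr
// (for example after a memory error).
int
toy_inline_frame_this_id (const toy_unwound_frame *prev_frame)
{
  // Check the previous frame before using it, and report a specific
  // error (hypothetical wording) rather than hitting an assertion.
  if (prev_frame == nullptr)
    throw std::runtime_error
      ("toy: failed to find previous frame for inline frame");

  // Derive this frame's id from the previous frame's id (toy rule).
  return prev_frame->id + 1;
}

// Small self-check covering both paths.
void
toy_inline_frame_this_id_demo ()
{
  toy_unwound_frame prev {7};
  assert (toy_inline_frame_this_id (&prev) == 8);

  bool threw = false;
  try
    {
      toy_inline_frame_this_id (nullptr);
    }
  catch (const std::runtime_error &)
    {
      threw = true;
    }
  assert (threw);
}
```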
Historical Links:
Patch v2: https://sourceware.org/pipermail/gdb-patches/2021-June/180208.html
Feedback: https://sourceware.org/pipermail/gdb-patches/2021-July/180651.html
          https://sourceware.org/pipermail/gdb-patches/2021-July/180663.html
Patch v3: https://sourceware.org/pipermail/gdb-patches/2021-July/181029.html
Feedback: https://sourceware.org/pipermail/gdb-patches/2021-July/181035.html
Additional input: https://sourceware.org/pipermail/gdb-patches/2021-September/182040.html
When running test-case gdb.base/dcache-flush.exp on ubuntu 18.04.5, I run into:
...
(gdb) PASS: gdb.base/dcache-flush.exp: p var2
info dcache^M
Dcache 4096 lines of 64 bytes each.^M
Contains data for Thread 0x7ffff7fc6b80 (LWP 3551)^M
Line 0: address 0x7fffffffd4c0 [47 hits]^M
Line 1: address 0x7fffffffd500 [31 hits]^M
Line 2: address 0x7fffffffd5c0 [7 hits]^M
Cache state: 3 active lines, 85 hits^M
(gdb) FAIL: gdb.base/dcache-flush.exp: check dcache before flushing
...
The regexp expects "Contains data for process $decimal".
This is another case of thread_db_target::pid_to_str being used.
Fix this by updating the regexp.
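For illustration only, a pattern accepting both forms might look like the
sketch below. The test-case itself is written in Tcl, and the pattern
actually used by the fix is not reproduced here; this C++ std::regex
version and the function name are assumptions.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if LINE is a dcache owner line in either the
// "process N" form or the "Thread 0x... (LWP N)" form.
bool
toy_matches_dcache_owner (const std::string &line)
{
  static const std::regex re
    ("Contains data for "
     "(process [0-9]+|Thread 0x[0-9a-f]+ \\(LWP [0-9]+\\))");
  return std::regex_match (line, re);
}

// Small self-check using the line from the log above.
void
toy_matches_dcache_owner_demo ()
{
  assert (toy_matches_dcache_owner
          ("Contains data for Thread 0x7ffff7fc6b80 (LWP 3551)"));
  assert (toy_matches_dcache_owner ("Contains data for process 3551"));
  assert (!toy_matches_dcache_owner ("Contains data for nothing"));
}
```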
Tested on x86_64-linux.
The test-case gdb.threads/process-dies-while-detaching.exp takes about 20s
when using hw watchpoints, but when forcing sw watchpoints (using the patch
mentioned in PR28375#c0), the test-case instead takes 3m14s.
Also, it shows a FAIL:
...
(gdb) continue^M
Continuing.^M
Cannot find user-level thread for LWP 10324: generic error^M
(gdb) FAIL: gdb.threads/process-dies-while-detaching.exp: single-process:
continue: watchpoint: continue
...
for which PR28375 was filed.
Modify the test-case to:
- add the hw/sw axis to the watchpoint testing, to ensure that we
observe the sw watchpoint behaviour also on can-use-hw-watchpoints
architectures.
- skip the hw breakpoint testing if not supported
- set the sw watchpoint later to avoid making the test
too slow. This still triggers the same PR, but now takes just 24s.
This patch adds a KFAIL for PR28375.
Tested on x86_64-linux.
Minimize gdb restarts, applying the following rules:
- don't use prepare_for_testing unless necessary
- don't use clean_restart unless necessary
Also, if possible, replace build_for_executable + clean_restart
with prepare_for_testing for brevity.
Touches 68 test-cases.
Tested on x86_64-linux.
This started out as changing thread_info::name to a unique_xmalloc_ptr.
That showed that almost all users of that field had the same logic to
get a thread's name: use thread_info::name if non-nullptr, else ask the
target. Factor out this logic in a new thread_name free function. Make
the field private (rename to m_name) and add some accessors.
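The factored-out logic is roughly the following. This is a sketch only:
toy_thread_info, toy_target_thread_name and the use of
std::unique_ptr<std::string> in place of a unique_xmalloc_ptr are
assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Toy stand-in for thread_info with a private name and accessors.
class toy_thread_info
{
public:
  // Returns the user-specified name, or nullptr if none was set.
  const char *name () const
  { return m_name != nullptr ? m_name->c_str () : nullptr; }

  void set_name (const std::string &name)
  { m_name.reset (new std::string (name)); }

private:
  std::unique_ptr<std::string> m_name;
};

// Hypothetical stand-in for asking the target for a thread's name.
const char *
toy_target_thread_name (const toy_thread_info &)
{
  return "name-from-target";
}

// The shared logic: prefer the user-specified name, else ask the target.
const char *
toy_thread_name (const toy_thread_info &thr)
{
  if (const char *name = thr.name ())
    return name;
  return toy_target_thread_name (thr);
}

// Small self-check covering both branches.
void
toy_thread_name_demo ()
{
  toy_thread_info thr;
  assert (std::strcmp (toy_thread_name (thr), "name-from-target") == 0);
  thr.set_name ("worker");
  assert (std::strcmp (toy_thread_name (thr), "worker") == 0);
}
```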
Change-Id: Iebdd95f4cd21fbefc505249bd1d05befc466a2fc
I noticed that value_true is declared in language.h and defined in
language.c. However, as part of the value API, I think it would be
better in one of the value files. And, because it is very short, I
changed it to be an inline function in value.h. I've also removed a
comment from the implementation, on the basis that it seems obsolete
-- if the change it suggests was needed, it probably would have been
done by now; and if it is needed in the future, odds are it would be
done differently anyway.
Finally, this patch also changes value_true and value_logical_not to
return a bool, and updates some uses.
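The resulting shape is roughly as sketched below, assuming value_true is
simply the negation of value_logical_not. The toy_value type and the
zero test are made up for the example; the real functions operate on
struct value.

```cpp
#include <cassert>

// Toy stand-in for struct value, holding a single integer.
struct toy_value
{
  long contents;
};

// Toy version of value_logical_not, now returning bool.
bool
toy_value_logical_not (const toy_value *val)
{
  return val->contents == 0;
}

// Short enough to live in the header as an inline function.
static inline bool
toy_value_true (const toy_value *val)
{
  return !toy_value_logical_not (val);
}

// Small self-check.
void
toy_value_true_demo ()
{
  toy_value zero {0};
  toy_value five {5};
  assert (!toy_value_true (&zero));
  assert (toy_value_true (&five));
}
```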
By inspection, I noticed that this code in dcache.c is not
multi-target-aware:
/* If this is a different inferior from what we've recorded,
flush the cache. */
if (inferior_ptid != dcache->ptid)
This doesn't take into account that threads of different targets may
have the same ptid.
Fixed by also storing/comparing the process_stratum_target.
Tested on x86-64-linux-gnu, native and gdbserver.
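The idea of the fix can be sketched as follows. The toy_* types and field
names are assumptions for the example; in particular toy_target merely
stands in for process_stratum_target.

```cpp
#include <cassert>

// Toy stand-ins; not GDB's real types.
struct toy_target {};

struct toy_ptid
{
  int pid;
  long lwp;
};

bool
operator== (const toy_ptid &a, const toy_ptid &b)
{
  return a.pid == b.pid && a.lwp == b.lwp;
}

struct toy_dcache
{
  toy_target *proc_target;  // Target the cached data belongs to.
  toy_ptid ptid;            // Thread the cached data belongs to.
  int cached_lines;
};

// Flush the cache if it was filled for a different target or ptid.
// Returns true if a flush happened.
bool
toy_dcache_flush_if_stale (toy_dcache &dc, toy_target *cur_target,
                           toy_ptid cur_ptid)
{
  // Comparing the ptid alone is not enough: two targets may both
  // have a thread with the same ptid.
  if (cur_target == dc.proc_target && cur_ptid == dc.ptid)
    return false;

  dc.cached_lines = 0;
  dc.proc_target = cur_target;
  dc.ptid = cur_ptid;
  return true;
}

// Small self-check: same ptid, different target, must flush.
void
toy_dcache_demo ()
{
  toy_target native, remote;
  toy_ptid ptid {100, 100};
  toy_dcache dc {&native, ptid, 3};

  assert (!toy_dcache_flush_if_stale (dc, &native, ptid));
  assert (dc.cached_lines == 3);
  assert (toy_dcache_flush_if_stale (dc, &remote, ptid));
  assert (dc.cached_lines == 0);
}
```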
Change-Id: I4d9d74052c696b72d28cb1c77b697b911725c8d3
We currently have one FAIL while running "make check-perf":
PerfTest::assemble, run ...
python Disassemble().run()
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/pedro/rocm/gdb/src/gdb/testsuite/gdb.perf/lib/perftest/perftest.py", line 64, in run
self.warm_up()
File "<string>", line 25, in warm_up
gdb.error: No symbol "ada_evaluate_subexp" in current context.
Error while executing Python code.
(gdb) FAIL: gdb.perf/disassemble.exp: python Disassemble().run()
...
The gdb.perf/disassemble.exp testcase debugs GDB with itself, runs to
main, and then disassembles a few GDB functions. The problem is that
most(!) functions it is trying to disassemble are now gone...
This commit fixes the issue by simply picking some other functions to
disassemble.
It would perhaps be better to come up with some test program to
disassemble, one that would stay the same throughout the years,
instead of disassembling GDB itself. I don't know why that wasn't
done to begin with. I'll have to leave that for another rainy day,
though.
gdb/testsuite/
yyyy-mm-dd Pedro Alves <pedro@palves.net>
* gdb.perf/disassemble.py (Disassemble::warm_up): Disassemble
evaluate_subexp_do_call instead of ada_evaluate_subexp.
(Disassemble::warm_up): Disassemble "captured_main",
"run_inferior_call" and "update_global_location_list" instead of
"evaluate_subexp_standard" and "c_parse_internal".
Change-Id: I89d1cca89ce2e495dea5096e439685739cc0d3df