2021-05-18 04:16:06 +08:00
|
|
|
/* DWARF CU data structure
|
|
|
|
|
2023-01-01 20:49:04 +08:00
|
|
|
Copyright (C) 2021-2023 Free Software Foundation, Inc.
|
2021-05-18 04:16:06 +08:00
|
|
|
|
|
|
|
This file is part of GDB.
|
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
|
|
it under the terms of the GNU General Public License as published by
|
|
|
|
the Free Software Foundation; either version 3 of the License, or
|
|
|
|
(at your option) any later version.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program. If not, see <http://www.gnu.org/licenses/>. */
|
|
|
|
|
|
|
|
#include "defs.h"
|
|
|
|
#include "dwarf2/cu.h"
|
|
|
|
#include "dwarf2/read.h"
|
2021-05-30 22:00:19 +08:00
|
|
|
#include "objfiles.h"
|
gdb: add "id" fields to identify symtabs and subfiles
Printing macros defined in the main source file doesn't work reliably
using various toolchains, especially when DWARF 5 is used. For example,
using the binaries produced by either of these commands:
$ gcc --version
gcc (GCC) 11.2.0
$ ld --version
GNU ld (GNU Binutils) 2.38
$ gcc test.c -g3 -gdwarf-5
$ clang --version
clang version 13.0.1
$ clang test.c -gdwarf-5 -fdebug-macro
I get:
$ ./gdb -nx -q --data-directory=data-directory a.out
(gdb) start
Temporary breakpoint 1 at 0x111d: file test.c, line 6.
Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out
Temporary breakpoint 1, main () at test.c:6
6 return ZERO;
(gdb) p ZERO
No symbol "ZERO" in current context.
When starting to investigate this (taking the gcc-compiled binary as an
example), we see that GDB fails to look up the appropriate macro scope
when evaluating the expression. While stopped in
macro_lookup_inclusion:
(top-gdb) p name
$1 = 0x62100011a980 "test.c"
(top-gdb) p source.filename
$2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c"
`source` is the macro_source_file that we would expect GDB to find.
`name` comes from the symtab::filename field of the symtab we are
stopped in. GDB doesn't find the appropriate macro_source_file because
the name of the macro_source_file doesn't match exactly the name of the
symtab.
The name of the main symtab comes from the compilation unit's
DW_AT_name, passed to the buildsym_compunit's constructor:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630
The contents of DW_AT_name, in this case, is "test.c". It is typically
(what I witnessed all compilers do) the same string that was passed to
the compiler on the command-line.
The name of the macro_source_file comes from the line number program
header's file table, from the call to the line_header::file_file_name
method:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65
line_header::file_file_name prepends the directory path that the file
entry refers to, in the file table (if the file name is not already
absolute). In this case, the file name is "test.c", appended to the
directory "/home/simark/build/binutils-gdb-one-target/gdb".
Because the symtab's name is not created the same way as the
macro_source_file's name is created, we get this mismatch. GDB fails to
find the appropriate macro scope for the symtab, and we can't print
macros when stopped in that symtab.
To make this work, we must ensure that paths produced in these two ways
end up identical. This can be tricky because of the different ways a
path can be passed to the compiler by the user.
Another thing to consider is that while the main symtab's name (or
subfile, before it becomes a symtab) is created using DW_AT_name, the
main symtab is also referred to using its entry in the line table
header's file table, when processing the line table. We must therefore
ensure that the same name is produced in both cases, so that a call to
"start_subfile" for the main subfile will correctly find the
already-created subfile, created by buildsym_compunit's constructor. If
we fail to do that, things still often work, because of a fallback: the
watch_main_source_file_lossage method. This method determines that if
the main subfile has no symbols but there exists another subfile with
the same basename (e.g. "test.c") that does have symbols, it's probably
because there was some filename mismatch. So it replaces the main
subfile with that other subfile. I think that heuristic is useful as a
last effort to work around any bug or bad debug info, but I don't think
we should design things such as to rely on it. It's a heuristic, it can
get things wrong. So in my search for a fix, it is important that given
some good debug info, we don't end up relying on that for things to
work.
A first attempt at fixing this was to try to prepend the compilation
directory here or not prepend it there. In practice, because of all the
possible combinations of debug info the compilers produce, it was not
possible to get something that would produce reliable, consistent paths.
Another attempt at fixing this was to make both macro_source_file
objects and symtab objects use the most complete form of path possible.
That means to prepend directories at least until we get an absolute
path. In theory, we should end up with the same path in all cases.
This generally worked, but because it changed the symtab names, it
resulted in user-visible changes (for example, paths to source files in
Breakpoint hit messages becoming always absolute). I didn't find this
very good, first because there is a "set filename-display" setting that
lets the user control how they want the paths to be displayed, and that
would suddenly make this setting completely ineffective (although even
today, it is a bit dependent on the debug info). Second, it would
require a good amount of testsuite tweaks to make tests accept these
suddenly absolute paths.
This new patch is a slight variation of that: it adds a new field called
"filename_for_id" in struct symtab and struct subfile, next to the
existing filename field. The goal is to separate the internal ids used
for finding objects from the names used for presentation. This field is
used for identifying subfiles, symtabs and macro_source_files
internally. For DWARF symtabs, this new field is meant to contain the
"most complete possible" path, as discussed above. So for a given file,
it must always be in the same form, everywhere. The existing
symtab::filename field remains the one used for printing to the user, so
there shouldn't be any change in how paths are printed.
Changes in the core symtab files are:
- Add "name_for_id" and "filename_for_id" fields to "struct subfile"
and "struct symtab", next to existing "name" and "filename" fields.
- Make buildsym_compunit::buildsym_compunit and
buildsym_compunit::start_subfile accept a "name_for_id" parameter
next to the existing "name" ones.
- Make buildsym_compunit::start_subfile use "name_for_id" for looking
up existing subfiles. This is the key thing for making calls
to start_subfile for the main source file look up the existing
subfile successfully, and avoid relying on
watch_main_source_file_lossage.
- Make sal_macro_scope pass "filename_for_id", rather than "filename",
to macro_lookup_inclusion. This is the key thing to making the
lookup work and macro printing work.
Changes in the DWARF files are:
- Make line_header::file_file_name return the "most complete possible"
name. The only pre-existing user of this method is the macro code,
to give the macro_source_file objects their name. And we now want
them to have this "most complete possible" name, which will match the
corresponding symtab's "filename_for_id".
- Make dwarf2_cu::start_compunit_symtab pass the "most complete
possible" name for the main symtab's "filename_for_id". In this
context, where the info comes from the compilation unit's DW_AT_name
/ DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if
DW_AT_name is not already absolute.
- Change dwarf2_start_subfile to build a name_for_id for the subfile
being started. The simplest way is to re-use
line_header::file_file_name, since the callers always have a
file_entry handy. This ensures that it will get the exact same path
representation as the macro code does, for the same file (since it
also uses line_header::file_file_name).
- Update calls to allocate_symtab to pass the "name_for_id" from the
subfile.
Tests exercising all this are added by the following patch.
Of all the cases I tried, the only one I found that ends up relying on
watch_main_source_file_lossage is the following one:
$ clang --version
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang ./test.c -g3 -O0 -gdwarf-4
$ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1" a.out
...
[symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c)
[symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile
As we can see, there are two forms used for "test.c", one with a "." and
one without. This comes from the fact that the compilation unit DIE
contains:
DW_AT_name ("test.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
without a ".", and the line table for that file contains:
include_directories[ 1] = "."
file_names[ 1]:
name: "test.c"
dir_index: 1
When assembling the filename from that entry, we get a ".".
It is a bit unexpected that the main filename resulting from the line
table header does not match exactly the name in the compilation unit.
For instance, gcc uses "./test.c" for the DW_AT_name, which gives
identical paths in the compilation unit and in the line table header.
Similarly, with DWARF 5:
$ clang ./test.c -g3 -O0 -gdwarf-5
clang create two entries that refer to the same file but are of in a different
form.
include_directories[ 0] = "/home/simark/build/binutils-gdb-one-target/gdb"
include_directories[ 1] = "."
file_names[ 0]:
name: "test.c"
dir_index: 0
file_names[ 1]:
name: "test.c"
dir_index: 1
The first file name produces a path without a "." while the second does.
This is not caught by watch_main_source_file_lossage, because of
dwarf_decode_lines that creates a symtab for each file entry in the line
table. It therefore appears as "non-empty" to
watch_main_source_file_lossage. This results in two symtabs:
(gdb) maintenance info symtabs
{ objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00)
{ ((struct compunit_symtab *) 0x62100011aca0)
debugformat DWARF 5
producer clang version 13.0.1
name test.c
dirname /home/simark/build/binutils-gdb-one-target/gdb
blockvector ((struct blockvector *) 0x621000129ec0)
user ((struct compunit_symtab *) (null))
{ symtab test.c ((struct symtab *) 0x62100011ad20)
fullname (null)
linetable ((struct linetable *) 0x0)
}
{ symtab ./test.c ((struct symtab *) 0x62100011ad60)
fullname (null)
linetable ((struct linetable *) 0x621000129ef0)
}
}
}
I am not sure what is the consequence of this, but this is also what
happens before my patch, so I think its acceptable to leave it as-is.
To handle these two cases nicely, I think we will need a function that
removes the unnecessary "." from path names, something that can be done
later.
Finally, I made a change in find_file_and_directory is necessary to
avoid breaking test
gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42
Without that change, we would get:
(gdb) info source
Current source file is /dir/d/dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
whereas the expected result is:
(gdb) info source
Current source file is dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
This test was added here:
https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html
Long story short, GCC <= 4.2 apparently had a bug where it would
generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S")
and no DW_AT_comp_dir. The line table has one entry with filename
"dw2-compdir-oldgcc42.S", which refers to directory 0. Directory 0
normally refers to the compilation unit's comp dir, but it is
non-existent in this case.
This caused some symtab lookup problems, and to work around them, some
workaround was added, which today reads as:
if (res.get_comp_dir () == nullptr
&& producer_is_gcc_lt_4_3 (cu)
&& res.get_name () != nullptr
&& IS_ABSOLUTE_PATH (res.get_name ()))
res.set_comp_dir (ldirname (res.get_name ()));
Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432
It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no
DW_AT_comp_dir and DW_AT_name is absolute.
Prior to my patch, a subfile would get created with filename
"/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get
created with filename "dw2-compdir-oldgcc42.S" from the line table's
file table. Then watch_main_source_file_lossage would kick in and merge
them, keeping only the "dw2-compdir-oldgcc42.S" one:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S)
[symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile
And so "info source" would show "dw2-compdir-oldgcc42.S" as the
filename.
With my patch applied, but without the change in
find_file_and_directory, both DW_AT_name and the line table would try to
start a subfile with the same filename_for_id, and there was no need for
watch_main_source_file_lossage - which is what we want:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming
from DW_AT_name, gets created first, it wins, and the symtab ends up
with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows
"/dir/d/dw2-compdir-oldgcc42.S" and the test breaks.
This is not wrong per-se, after all DW_AT_name is
"/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the
current source file as "/dir/d/dw2-compdir-oldgcc42.S". If you compile
a file passing "/an/absolute/path.c", DW_AT_name typically contains (at
least with GCC) "/an/absolute/path.c" and GDB tells you that the source
file is "/an/absolute/path.c". But we can also keep the existing
behavior fairly easily with a little change in find_file_and_directory.
When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now
modify the name to just keep the file part. The result is coherent with
what compilers do when you compile a file by just passing its filename
("gcc path.c -g"):
DW_AT_name ("path.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
With this change, filename_for_id is still the full name,
"/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile /
symtab (what ends up shown by "info source") is just
"dw2-compdir-oldgcc42.S", and that makes the test happy.
Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307
2022-07-29 00:34:47 +08:00
|
|
|
#include "filenames.h"
|
|
|
|
#include "gdbsupport/pathstuff.h"
|
2021-05-18 04:16:06 +08:00
|
|
|
|
|
|
|
/* Initialize dwarf2_cu to read PER_CU, in the context of PER_OBJFILE. */
|
|
|
|
|
|
|
|
dwarf2_cu::dwarf2_cu (dwarf2_per_cu_data *per_cu,
|
|
|
|
dwarf2_per_objfile *per_objfile)
|
|
|
|
: per_cu (per_cu),
|
|
|
|
per_objfile (per_objfile),
|
2021-05-18 04:16:06 +08:00
|
|
|
m_mark (false),
|
2021-05-18 04:16:06 +08:00
|
|
|
has_loclist (false),
|
|
|
|
checked_producer (false),
|
|
|
|
producer_is_gxx_lt_4_6 (false),
|
|
|
|
producer_is_gcc_lt_4_3 (false),
|
gdb: work around negative DW_AT_data_member_location GCC 11 bug
g++ 11.1.0 has a bug where it will emit a negative
DW_AT_data_member_location in some cases:
$ cat test.cpp
#include <memory>
int
main()
{
std::unique_ptr<int> ptr;
}
$ g++ -g test.cpp
$ llvm-dwarfdump -F a.out
...
0x00000964: DW_TAG_member
DW_AT_name [DW_FORM_strp] ("_M_head_impl")
DW_AT_decl_file [DW_FORM_data1] ("/usr/include/c++/11.1.0/tuple")
DW_AT_decl_line [DW_FORM_data1] (125)
DW_AT_decl_column [DW_FORM_data1] (0x27)
DW_AT_type [DW_FORM_ref4] (0x0000067a "default_delete<int>")
DW_AT_data_member_location [DW_FORM_sdata] (-1)
...
This leads to a GDB crash (when built with ASan, otherwise probably
garbage results), since it tries to read just before (to the left, in
ASan speak) of the value's buffer:
==888645==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000c52af at pc 0x7f711b239f4b bp 0x7fff356bd470 sp 0x7fff356bcc18
READ of size 1 at 0x6020000c52af thread T0
#0 0x7f711b239f4a in __interceptor_memcpy /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x555c4977efa1 in value_contents_copy_raw /home/simark/src/binutils-gdb/gdb/value.c:1347
#2 0x555c497909cd in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3126
#3 0x555c478f2eaa in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:333
#4 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#5 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#6 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#10 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#11 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#12 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#13 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
#14 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
#15 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
#16 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#17 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#18 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#19 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#20 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#21 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
#22 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
#23 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
#24 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#25 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#26 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#27 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
#28 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
#29 0x555c4760f04c in c_value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:587
#30 0x555c483ff954 in language_defn::value_print(value*, ui_file*, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:614
#31 0x555c49759f61 in value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1189
#32 0x555c48950f70 in print_formatted /home/simark/src/binutils-gdb/gdb/printcmd.c:337
#33 0x555c48958eda in print_value(value*, value_print_options const&) /home/simark/src/binutils-gdb/gdb/printcmd.c:1258
#34 0x555c48959891 in print_command_1 /home/simark/src/binutils-gdb/gdb/printcmd.c:1367
#35 0x555c4895a3df in print_command /home/simark/src/binutils-gdb/gdb/printcmd.c:1458
#36 0x555c4767f974 in do_simple_func /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:97
#37 0x555c47692e25 in cmd_func(cmd_list_element*, char const*, int) /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2475
#38 0x555c4936107e in execute_command(char const*, int) /home/simark/src/binutils-gdb/gdb/top.c:670
#39 0x555c485f1bff in catch_command_errors /home/simark/src/binutils-gdb/gdb/main.c:523
#40 0x555c485f249c in execute_cmdargs /home/simark/src/binutils-gdb/gdb/main.c:618
#41 0x555c485f6677 in captured_main_1 /home/simark/src/binutils-gdb/gdb/main.c:1317
#42 0x555c485f6c83 in captured_main /home/simark/src/binutils-gdb/gdb/main.c:1338
#43 0x555c485f6d65 in gdb_main(captured_main_args*) /home/simark/src/binutils-gdb/gdb/main.c:1363
#44 0x555c46e41ba8 in main /home/simark/src/binutils-gdb/gdb/gdb.c:32
#45 0x7f71198bcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
#46 0x555c46e4197d in _start (/home/simark/build/binutils-gdb-one-target/gdb/gdb+0x77f197d)
0x6020000c52af is located 1 bytes to the left of 8-byte region [0x6020000c52b0,0x6020000c52b8)
allocated by thread T0 here:
#0 0x7f711b2b7459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x555c470acdc9 in xcalloc /home/simark/src/binutils-gdb/gdb/alloc.c:100
#2 0x555c49b775cd in xzalloc(unsigned long) /home/simark/src/binutils-gdb/gdbsupport/common-utils.cc:29
#3 0x555c4977bdeb in allocate_value_contents /home/simark/src/binutils-gdb/gdb/value.c:1029
#4 0x555c4977be25 in allocate_value(type*) /home/simark/src/binutils-gdb/gdb/value.c:1040
#5 0x555c4979030d in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3092
#6 0x555c478f6280 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:501
#7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#10 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#11 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#12 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#13 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#14 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#15 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
#16 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
#17 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
#18 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513
#19 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161
#20 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#21 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#22 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#23 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
#24 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151
#25 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335
#26 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383
#27 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438
#28 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632
#29 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048
Since there are some binaries with this in the wild, I think it would be
useful for GDB to work around this. I did the obvious simple thing, if
the DW_AT_data_member_location's value is -1, replace it with 0. I
added a producer check to only apply this fixup for GCC 11. The idea is
that if some other compiler ever uses a DW_AT_data_member_location value
of -1 by mistake, we don't know (before analyzing the bug at least) if
they did mean 0 or some other value. So I wouldn't want to apply the
fixup in that case.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28063
Change-Id: Ieef3459b0b9bbce8bdad838ba83b4b64e7269d42
2022-01-28 06:35:26 +08:00
|
|
|
producer_is_gcc_11 (false),
|
2021-05-18 04:16:06 +08:00
|
|
|
producer_is_icc (false),
|
|
|
|
producer_is_icc_lt_14 (false),
|
|
|
|
producer_is_codewarrior (false),
|
Fix jump on uninit producer_is_clang bit of cu.h dwarf2_cu struct.
Valgrind reports a "jump on unitialised bit error" when running
e.g. gdb.base/macro-source-path.exp (see details below).
Fix this by initializing producer_is_clang member variable of dwarf2_cu.
Tested on amd64/debian11 and re-running gdb.base/macro-source-path.exp
under valgrind.
==2140965== Conditional jump or move depends on uninitialised value(s)
==2140965== at 0x5211F7: dwarf_decode_macro_bytes(dwarf2_per_objfile*, buildsym_compunit*, bfd*, unsigned char const*, unsigned char const*, macro_source_file*, line_header const*, dwarf2_section_info const*, int, int, unsigned int, dwarf2_section_info*, dwarf2_section_info*, gdb::optional<unsigned long>, htab*, dwarf2_cu*) (macro.c:676)
==2140965== by 0x52158A: dwarf_decode_macros(dwarf2_per_objfile*, buildsym_compunit*, dwarf2_section_info const*, line_header const*, unsigned int, unsigned int, dwarf2_section_info*, dwarf2_section_info*, gdb::optional<unsigned long>, int, dwarf2_cu*) (macro.c:967)
==2140965== by 0x523BC4: dwarf_decode_macros(dwarf2_cu*, unsigned int, int) (read.c:23379)
==2140965== by 0x552AB5: read_file_scope(die_info*, dwarf2_cu*) (read.c:9687)
==2140965== by 0x54F7B2: process_die(die_info*, dwarf2_cu*) (read.c:8660)
==2140965== by 0x5569C7: process_full_comp_unit (read.c:8429)
==2140965== by 0x5569C7: process_queue (read.c:7675)
==2140965== by 0x5569C7: dw2_do_instantiate_symtab (read.c:2063)
==2140965== by 0x5569C7: dw2_instantiate_symtab(dwarf2_per_cu_data*, dwarf2_per_objfile*, bool) (read.c:2085)
==2140965== by 0x55700B: dw2_expand_symtabs_matching_one(dwarf2_per_cu_data*, dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>, gdb::function_view<bool (compunit_symtab*)>) (read.c:3984)
==2140965== by 0x557EA3: cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) (read.c:18781)
==2140965== by 0x778977: objfile::lookup_symbol(block_enum, char const*, domain_enum) (symfile-debug.c:276)
....
==2140965== Uninitialised value was created by a heap allocation
==2140965== at 0x4839F01: operator new(unsigned long) (vg_replace_malloc.c:434)
==2140965== by 0x533A64: cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) (read.c:6264)
==2140965== by 0x5340C2: load_full_comp_unit(dwarf2_per_cu_data*, dwarf2_per_objfile*, dwarf2_cu*, bool, language) (read.c:7729)
==2140965== by 0x548338: load_cu(dwarf2_per_cu_data*, dwarf2_per_objfile*, bool) (read.c:2021)
==2140965== by 0x55634C: dw2_do_instantiate_symtab (read.c:2048)
==2140965== by 0x55634C: dw2_instantiate_symtab(dwarf2_per_cu_data*, dwarf2_per_objfile*, bool) (read.c:2085)
==2140965== by 0x55700B: dw2_expand_symtabs_matching_one(dwarf2_per_cu_data*, dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>, gdb::function_view<bool (compunit_symtab*)>) (read.c:3984)
==2140965== by 0x557EA3: cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) (read.c:18781)
==2140965== by 0x778977: objfile::lookup_symbol(block_enum, char const*, domain_enum) (symfile-debug.c:276)
....
2022-11-26 19:43:58 +08:00
|
|
|
producer_is_clang (false),
|
[gdb/symtab] Work around gas PR28629
When running test-case gdb.tui/tui-layout-asm-short-prog.exp on AlmaLinux 9.2
ppc64le, I run into:
...
FAIL: gdb.tui/tui-layout-asm-short-prog.exp: check asm box contents
...
The problem is that we get:
...
7 [ No Assembly Available ]
...
because tui_get_begin_asm_address doesn't succeed.
In more detail, tui_get_begin_asm_address calls:
...
find_line_pc (sal.symtab, sal.line, &addr);
...
with:
...
(gdb) p *sal.symtab
$5 = {next = 0x130393c0, m_compunit = 0x130392f0, m_linetable = 0x0,
filename = "tui-layout-asm-short-prog.S",
filename_for_id = "$gdb/build/gdb/testsuite/tui-layout-asm-short-prog.S",
m_language = language_asm, fullname = 0x0}
(gdb) p sal.line
$6 = 1
...
The problem is the filename_for_id which is the source file prefixed with the
compilation dir rather than the source dir.
This is due to faulty debug info generated by gas, PR28629:
...
<1a> DW_AT_name : tui-layout-asm-short-prog.S
<1e> DW_AT_comp_dir : $gdb/build/gdb/testsuite
<22> DW_AT_producer : GNU AS 2.35.2
...
The DW_AT_name is relative, and it's relative to the DW_AT_comp_dir entry,
making the effective name $gdb/build/gdb/testsuite/tui-layout-asm-short-prog.S.
The bug is fixed starting version 2.38, where we get instead:
...
<1a> DW_AT_name :
$gdb/src/gdb/testsuite/gdb.tui/tui-layout-asm-short-prog.S
<1e> DW_AT_comp_dir : $gdb/build/gdb/testsuite
<22> DW_AT_producer : GNU AS 2.38
...
Work around the faulty debug info by constructing the filename_for_id using
the second directory from the directory table in the .debug_line header:
...
The Directory Table (offset 0x22, lines 2, columns 1):
Entry Name
0 $gdb/build/gdb/testsuite
1 $gdb/src/gdb/testsuite/gdb.tui
...
Note that the used gas contains a backport of commit 3417bfca676 ("GAS:
DWARF-5: Ensure that the 0'th entry in the directory table contains the
current working directory."), because directory 0 is correct. With the
unpatched 2.35.2 release the directory 0 entry is incorrect: it's a copy of
entry 1.
Add a dwarf assembly test-case that reflects the debug info as generated by
unpatched gas 2.35.2.
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
2023-11-01 07:33:12 +08:00
|
|
|
producer_is_gas_lt_2_38 (false),
|
[gdb/symtab] Work around PR gas/29517
When using glibc debuginfo generated with gas 2.39, we run into PR gas/29517:
...
$ gdb -q -batch a.out -ex start -ex "p (char *)strstr (\"haha\", \"ah\")"
Temporary breakpoint 1 at 0x40051b: file hello.c, line 6.
Temporary breakpoint 1, main () at hello.c:6
6 printf ("hello\n");
Invalid cast.
...
while without glibc debuginfo installed we get the expected result:
...
$n = 0x7ffff7daa1b1 "aha"
...
and likewise with glibc debuginfo generated with gas 2.40.
The strstr ifunc resolves to __strstr_sse2_unaligned. The problem is that gas
generates dwarf that states that the return type is void:
...
<1><3e1e58>: Abbrev Number: 2 (DW_TAG_subprogram)
<3e1e59> DW_AT_name : __strstr_sse2_unaligned
<3e1e5d> DW_AT_external : 1
<3e1e5e> DW_AT_low_pc : 0xbbd2e
<3e1e66> DW_AT_high_pc : 0xbc1c3
...
while the return type should be a DW_TAG_unspecified_type, as is the case
with gas 2.40.
We can still use the workaround of casting to another function type for both
__strstr_sse2_unaligned:
...
(gdb) p ((char * (*) (const char *, const char *))__strstr_sse2_unaligned) \
("haha", "ah")
$n = 0x7ffff7daa211 "aha"
...
and strstr (which requires using *strstr to dereference the ifunc before we
cast):
...
gdb) p ((char * (*) (const char *, const char *))*strstr) ("haha", "ah")
$n = 0x7ffff7daa251 "aha"
...
but that's a bit cumbersome to use.
Work around this in the dwarf reader, such that we have instead:
...
(gdb) p (char *)strstr ("haha", "ah")
$n = 0x7ffff7daa1b1 "aha"
...
This also requires fixing producer_is_gcc to stop returning true for
producer "GNU AS 2.39.0".
Tested on x86_64-linux.
Approved-By: Andrew Burgess <aburgess@redhat.com>
PR symtab/30911
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30911
2023-10-16 22:32:28 +08:00
|
|
|
producer_is_gas_2_39 (false),
|
[gdb/symtab] Fix Dwarf Error: cannot find DIE
When loading the debug info package
libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we
run into a dwarf error:
...
$ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug
Dwarf Error: Cannot not find DIE at 0x18a936e7 \
[from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug]
...
The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52.
No error message is printed when using -readnow.
What happens is the following:
- a dwarf2_per_cu_data P is created for the CU.
- a dwarf2_cu A is created for the same CU.
- another dwarf2_cu B is created for the same CU.
- the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that
per_objfile->get_cu (P) returns B.
- P->load_all_dies is set to 1.
- all dies are read into the A->partial_dies htab
- dwarf2_cu A is destroyed.
- we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies.
We can't find it, but do not try to load all dies, because P->load_all_dies
is already set to 1.
- an error message is generated.
The question is why we're creating dwarf2_cu A and B for the same CU.
The dwarf2_cu A is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7028
...
And the dwarf2_cu B is created here:
...
(gdb) bt
#0 dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30,
per_objfile=0x1ad01b0) at dwarf2/cu.c:38
#1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50,
this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
#2 0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436
#3 0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054),
offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391
#4 0x000000000069755b in partial_die_info::fixup (this=0x9096900,
cu=0xa6a85f0) at dwarf2/read.c:19512
#5 0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0,
cu=0xa6a85f0) at dwarf2/read.c:19516
#6 0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7563
#7 0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0,
lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
at dwarf2/read.c:7580
#8 0x0000000000676b82 in process_psymtab_comp_unit_reader
(reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0,
pretend_language=language_minimal) at dwarf2/read.c:6954
#9 0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30,
per_objfile=0x1ad01b0, want_partial_unit=false,
pretend_language=language_minimal) at dwarf2/read.c:7057
...
So in frame #9, a cutu_reader is created with dwarf2_cu A. Then a fixup takes
us to the following CU @ 0x18aa33d6, in frame #5. And a similar fixup in
frame #4 takes us back to CU @ 0x18a23e52. At that point, there's no
information available that we're already trying to read that CU, and we end up
creating another cutu_reader with dwarf2_cu B.
It seems that there are two related problems:
- creating two dwarf2_cu's is not optimal
- the unoptimal case is not handled correctly
This patch addresses the last problem, by moving the load_all_dies flag from
dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies
field, which ensures that the two can be kept in sync.
Tested on x86_64-linux.
gdb/ChangeLog:
2021-05-27 Tom de Vries <tdevries@suse.de>
PR symtab/27898
* dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init.
* dwarf2/cu.h (dwarf2_cu): Add load_all_dies field.
* dwarf2/read.c (load_partial_dies, find_partial_die): Update.
* dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove
load_all_dies init.
(dwarf2_per_cu_data): Remove load_all_dies field.
2021-05-27 21:22:38 +08:00
|
|
|
processing_has_namespace_info (false),
|
|
|
|
load_all_dies (false)
|
2021-05-18 04:16:06 +08:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
/* See cu.h. */
|
|
|
|
|
|
|
|
struct type *
|
|
|
|
dwarf2_cu::addr_sized_int_type (bool unsigned_p) const
|
|
|
|
{
|
|
|
|
int addr_size = this->per_cu->addr_size ();
|
2021-10-19 02:15:21 +08:00
|
|
|
return objfile_int_type (this->per_objfile->objfile, addr_size, unsigned_p);
|
2021-05-18 04:16:06 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Start a symtab for DWARF. NAME, COMP_DIR, LOW_PC are passed to the
|
|
|
|
buildsym_compunit constructor. */
|
|
|
|
|
|
|
|
struct compunit_symtab *
|
2022-03-29 06:17:17 +08:00
|
|
|
dwarf2_cu::start_compunit_symtab (const char *name, const char *comp_dir,
|
|
|
|
CORE_ADDR low_pc)
|
2021-05-18 04:16:06 +08:00
|
|
|
{
|
|
|
|
gdb_assert (m_builder == nullptr);
|
|
|
|
|
gdb: add "id" fields to identify symtabs and subfiles
Printing macros defined in the main source file doesn't work reliably
using various toolchains, especially when DWARF 5 is used. For example,
using the binaries produced by either of these commands:
$ gcc --version
gcc (GCC) 11.2.0
$ ld --version
GNU ld (GNU Binutils) 2.38
$ gcc test.c -g3 -gdwarf-5
$ clang --version
clang version 13.0.1
$ clang test.c -gdwarf-5 -fdebug-macro
I get:
$ ./gdb -nx -q --data-directory=data-directory a.out
(gdb) start
Temporary breakpoint 1 at 0x111d: file test.c, line 6.
Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out
Temporary breakpoint 1, main () at test.c:6
6 return ZERO;
(gdb) p ZERO
No symbol "ZERO" in current context.
When starting to investigate this (taking the gcc-compiled binary as an
example), we see that GDB fails to look up the appropriate macro scope
when evaluating the expression. While stopped in
macro_lookup_inclusion:
(top-gdb) p name
$1 = 0x62100011a980 "test.c"
(top-gdb) p source.filename
$2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c"
`source` is the macro_source_file that we would expect GDB to find.
`name` comes from the symtab::filename field of the symtab we are
stopped in. GDB doesn't find the appropriate macro_source_file because
the name of the macro_source_file doesn't match exactly the name of the
symtab.
The name of the main symtab comes from the compilation unit's
DW_AT_name, passed to the buildsym_compunit's constructor:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630
The contents of DW_AT_name, in this case, is "test.c". It is typically
(what I witnessed all compilers do) the same string that was passed to
the compiler on the command-line.
The name of the macro_source_file comes from the line number program
header's file table, from the call to the line_header::file_file_name
method:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65
line_header::file_file_name prepends the directory path that the file
entry refers to, in the file table (if the file name is not already
absolute). In this case, the file name is "test.c", appended to the
directory "/home/simark/build/binutils-gdb-one-target/gdb".
Because the symtab's name is not created the same way as the
macro_source_file's name is created, we get this mismatch. GDB fails to
find the appropriate macro scope for the symtab, and we can't print
macros when stopped in that symtab.
To make this work, we must ensure that paths produced in these two ways
end up identical. This can be tricky because of the different ways a
path can be passed to the compiler by the user.
Another thing to consider is that while the main symtab's name (or
subfile, before it becomes a symtab) is created using DW_AT_name, the
main symtab is also referred to using its entry in the line table
header's file table, when processing the line table. We must therefore
ensure that the same name is produced in both cases, so that a call to
"start_subfile" for the main subfile will correctly find the
already-created subfile, created by buildsym_compunit's constructor. If
we fail to do that, things still often work, because of a fallback: the
watch_main_source_file_lossage method. This method determines that if
the main subfile has no symbols but there exists another subfile with
the same basename (e.g. "test.c") that does have symbols, it's probably
because there was some filename mismatch. So it replaces the main
subfile with that other subfile. I think that heuristic is useful as a
last effort to work around any bug or bad debug info, but I don't think
we should design things such as to rely on it. It's a heuristic, it can
get things wrong. So in my search for a fix, it is important that given
some good debug info, we don't end up relying on that for things to
work.
A first attempt at fixing this was to try to prepend the compilation
directory here or not prepend it there. In practice, because of all the
possible combinations of debug info the compilers produce, it was not
possible to get something that would produce reliable, consistent paths.
Another attempt at fixing this was to make both macro_source_file
objects and symtab objects use the most complete form of path possible.
That means to prepend directories at least until we get an absolute
path. In theory, we should end up with the same path in all cases.
This generally worked, but because it changed the symtab names, it
resulted in user-visible changes (for example, paths to source files in
Breakpoint hit messages becoming always absolute). I didn't find this
very good, first because there is a "set filename-display" setting that
lets the user control how they want the paths to be displayed, and that
would suddenly make this setting completely ineffective (although even
today, it is a bit dependent on the debug info). Second, it would
require a good amount of testsuite tweaks to make tests accept these
suddenly absolute paths.
This new patch is a slight variation of that: it adds a new field called
"filename_for_id" in struct symtab and struct subfile, next to the
existing filename field. The goal is to separate the internal ids used
for finding objects from the names used for presentation. This field is
used for identifying subfiles, symtabs and macro_source_files
internally. For DWARF symtabs, this new field is meant to contain the
"most complete possible" path, as discussed above. So for a given file,
it must always be in the same form, everywhere. The existing
symtab::filename field remains the one used for printing to the user, so
there shouldn't be any change in how paths are printed.
Changes in the core symtab files are:
- Add "name_for_id" and "filename_for_id" fields to "struct subfile"
and "struct symtab", next to existing "name" and "filename" fields.
- Make buildsym_compunit::buildsym_compunit and
buildsym_compunit::start_subfile accept a "name_for_id" parameter
next to the existing "name" ones.
- Make buildsym_compunit::start_subfile use "name_for_id" for looking
up existing subfiles. This is the key thing for making calls
to start_subfile for the main source file look up the existing
subfile successfully, and avoid relying on
watch_main_source_file_lossage.
- Make sal_macro_scope pass "filename_for_id", rather than "filename",
to macro_lookup_inclusion. This is the key thing to making the
lookup work and macro printing work.
Changes in the DWARF files are:
- Make line_header::file_file_name return the "most complete possible"
name. The only pre-existing user of this method is the macro code,
to give the macro_source_file objects their name. And we now want
them to have this "most complete possible" name, which will match the
corresponding symtab's "filename_for_id".
- Make dwarf2_cu::start_compunit_symtab pass the "most complete
possible" name for the main symtab's "filename_for_id". In this
context, where the info comes from the compilation unit's DW_AT_name
/ DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if
DW_AT_name is not already absolute.
- Change dwarf2_start_subfile to build a name_for_id for the subfile
being started. The simplest way is to re-use
line_header::file_file_name, since the callers always have a
file_entry handy. This ensures that it will get the exact same path
representation as the macro code does, for the same file (since it
also uses line_header::file_file_name).
- Update calls to allocate_symtab to pass the "name_for_id" from the
subfile.
Tests exercising all this are added by the following patch.
Of all the cases I tried, the only one I found that ends up relying on
watch_main_source_file_lossage is the following one:
$ clang --version
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang ./test.c -g3 -O0 -gdwarf-4
$ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1" a.out
...
[symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c)
[symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile
As we can see, there are two forms used for "test.c", one with a "." and
one without. This comes from the fact that the compilation unit DIE
contains:
DW_AT_name ("test.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
without a ".", and the line table for that file contains:
include_directories[ 1] = "."
file_names[ 1]:
name: "test.c"
dir_index: 1
When assembling the filename from that entry, we get a ".".
It is a bit unexpected that the main filename resulting from the line
table header does not match exactly the name in the compilation unit.
For instance, gcc uses "./test.c" for the DW_AT_name, which gives
identical paths in the compilation unit and in the line table header.
Similarly, with DWARF 5:
$ clang ./test.c -g3 -O0 -gdwarf-5
clang create two entries that refer to the same file but are of in a different
form.
include_directories[ 0] = "/home/simark/build/binutils-gdb-one-target/gdb"
include_directories[ 1] = "."
file_names[ 0]:
name: "test.c"
dir_index: 0
file_names[ 1]:
name: "test.c"
dir_index: 1
The first file name produces a path without a "." while the second does.
This is not caught by watch_main_source_file_lossage, because of
dwarf_decode_lines that creates a symtab for each file entry in the line
table. It therefore appears as "non-empty" to
watch_main_source_file_lossage. This results in two symtabs:
(gdb) maintenance info symtabs
{ objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00)
{ ((struct compunit_symtab *) 0x62100011aca0)
debugformat DWARF 5
producer clang version 13.0.1
name test.c
dirname /home/simark/build/binutils-gdb-one-target/gdb
blockvector ((struct blockvector *) 0x621000129ec0)
user ((struct compunit_symtab *) (null))
{ symtab test.c ((struct symtab *) 0x62100011ad20)
fullname (null)
linetable ((struct linetable *) 0x0)
}
{ symtab ./test.c ((struct symtab *) 0x62100011ad60)
fullname (null)
linetable ((struct linetable *) 0x621000129ef0)
}
}
}
I am not sure what is the consequence of this, but this is also what
happens before my patch, so I think its acceptable to leave it as-is.
To handle these two cases nicely, I think we will need a function that
removes the unnecessary "." from path names, something that can be done
later.
Finally, I made a change in find_file_and_directory is necessary to
avoid breaking test
gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42
Without that change, we would get:
(gdb) info source
Current source file is /dir/d/dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
whereas the expected result is:
(gdb) info source
Current source file is dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
This test was added here:
https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html
Long story short, GCC <= 4.2 apparently had a bug where it would
generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S")
and no DW_AT_comp_dir. The line table has one entry with filename
"dw2-compdir-oldgcc42.S", which refers to directory 0. Directory 0
normally refers to the compilation unit's comp dir, but it is
non-existent in this case.
This caused some symtab lookup problems, and to work around them, some
workaround was added, which today reads as:
if (res.get_comp_dir () == nullptr
&& producer_is_gcc_lt_4_3 (cu)
&& res.get_name () != nullptr
&& IS_ABSOLUTE_PATH (res.get_name ()))
res.set_comp_dir (ldirname (res.get_name ()));
Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432
It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no
DW_AT_comp_dir and DW_AT_name is absolute.
Prior to my patch, a subfile would get created with filename
"/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get
created with filename "dw2-compdir-oldgcc42.S" from the line table's
file table. Then watch_main_source_file_lossage would kick in and merge
them, keeping only the "dw2-compdir-oldgcc42.S" one:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S)
[symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile
And so "info source" would show "dw2-compdir-oldgcc42.S" as the
filename.
With my patch applied, but without the change in
find_file_and_directory, both DW_AT_name and the line table would try to
start a subfile with the same filename_for_id, and there was no need for
watch_main_source_file_lossage - which is what we want:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming
from DW_AT_name, gets created first, it wins, and the symtab ends up
with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows
"/dir/d/dw2-compdir-oldgcc42.S" and the test breaks.
This is not wrong per-se, after all DW_AT_name is
"/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the
current source file as "/dir/d/dw2-compdir-oldgcc42.S". If you compile
a file passing "/an/absolute/path.c", DW_AT_name typically contains (at
least with GCC) "/an/absolute/path.c" and GDB tells you that the source
file is "/an/absolute/path.c". But we can also keep the existing
behavior fairly easily with a little change in find_file_and_directory.
When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now
modify the name to just keep the file part. The result is coherent with
what compilers do when you compile a file by just passing its filename
("gcc path.c -g"):
DW_AT_name ("path.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
With this change, filename_for_id is still the full name,
"/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile /
symtab (what ends up shown by "info source") is just
"dw2-compdir-oldgcc42.S", and that makes the test happy.
Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307
2022-07-29 00:34:47 +08:00
|
|
|
std::string name_for_id_holder;
|
|
|
|
const char *name_for_id = name;
|
|
|
|
|
|
|
|
/* Prepend the compilation directory to the filename if needed (if not
|
|
|
|
absolute already) to get the "name for id" for our main symtab. The name
|
|
|
|
for the main file coming from the line table header will be generated using
|
|
|
|
the same logic, so will hopefully match what we pass here. */
|
|
|
|
if (!IS_ABSOLUTE_PATH (name) && comp_dir != nullptr)
|
|
|
|
{
|
|
|
|
name_for_id_holder = path_join (comp_dir, name);
|
|
|
|
name_for_id = name_for_id_holder.c_str ();
|
|
|
|
}
|
|
|
|
|
2021-05-18 04:16:06 +08:00
|
|
|
m_builder.reset (new struct buildsym_compunit
|
|
|
|
(this->per_objfile->objfile,
|
gdb: add "id" fields to identify symtabs and subfiles
Printing macros defined in the main source file doesn't work reliably
using various toolchains, especially when DWARF 5 is used. For example,
using the binaries produced by either of these commands:
$ gcc --version
gcc (GCC) 11.2.0
$ ld --version
GNU ld (GNU Binutils) 2.38
$ gcc test.c -g3 -gdwarf-5
$ clang --version
clang version 13.0.1
$ clang test.c -gdwarf-5 -fdebug-macro
I get:
$ ./gdb -nx -q --data-directory=data-directory a.out
(gdb) start
Temporary breakpoint 1 at 0x111d: file test.c, line 6.
Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out
Temporary breakpoint 1, main () at test.c:6
6 return ZERO;
(gdb) p ZERO
No symbol "ZERO" in current context.
When starting to investigate this (taking the gcc-compiled binary as an
example), we see that GDB fails to look up the appropriate macro scope
when evaluating the expression. While stopped in
macro_lookup_inclusion:
(top-gdb) p name
$1 = 0x62100011a980 "test.c"
(top-gdb) p source.filename
$2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c"
`source` is the macro_source_file that we would expect GDB to find.
`name` comes from the symtab::filename field of the symtab we are
stopped in. GDB doesn't find the appropriate macro_source_file because
the name of the macro_source_file doesn't match exactly the name of the
symtab.
The name of the main symtab comes from the compilation unit's
DW_AT_name, passed to the buildsym_compunit's constructor:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630
The contents of DW_AT_name, in this case, is "test.c". It is typically
(what I witnessed all compilers do) the same string that was passed to
the compiler on the command-line.
The name of the macro_source_file comes from the line number program
header's file table, from the call to the line_header::file_file_name
method:
https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65
line_header::file_file_name prepends the directory path that the file
entry refers to, in the file table (if the file name is not already
absolute). In this case, the file name is "test.c", appended to the
directory "/home/simark/build/binutils-gdb-one-target/gdb".
Because the symtab's name is not created the same way as the
macro_source_file's name is created, we get this mismatch. GDB fails to
find the appropriate macro scope for the symtab, and we can't print
macros when stopped in that symtab.
To make this work, we must ensure that paths produced in these two ways
end up identical. This can be tricky because of the different ways a
path can be passed to the compiler by the user.
Another thing to consider is that while the main symtab's name (or
subfile, before it becomes a symtab) is created using DW_AT_name, the
main symtab is also referred to using its entry in the line table
header's file table, when processing the line table. We must therefore
ensure that the same name is produced in both cases, so that a call to
"start_subfile" for the main subfile will correctly find the
already-created subfile, created by buildsym_compunit's constructor. If
we fail to do that, things still often work, because of a fallback: the
watch_main_source_file_lossage method. This method determines that if
the main subfile has no symbols but there exists another subfile with
the same basename (e.g. "test.c") that does have symbols, it's probably
because there was some filename mismatch. So it replaces the main
subfile with that other subfile. I think that heuristic is useful as a
last effort to work around any bug or bad debug info, but I don't think
we should design things such as to rely on it. It's a heuristic, it can
get things wrong. So in my search for a fix, it is important that given
some good debug info, we don't end up relying on that for things to
work.
A first attempt at fixing this was to try to prepend the compilation
directory here or not prepend it there. In practice, because of all the
possible combinations of debug info the compilers produce, it was not
possible to get something that would produce reliable, consistent paths.
Another attempt at fixing this was to make both macro_source_file
objects and symtab objects use the most complete form of path possible.
That means to prepend directories at least until we get an absolute
path. In theory, we should end up with the same path in all cases.
This generally worked, but because it changed the symtab names, it
resulted in user-visible changes (for example, paths to source files in
Breakpoint hit messages becoming always absolute). I didn't find this
very good, first because there is a "set filename-display" setting that
lets the user control how they want the paths to be displayed, and that
would suddenly make this setting completely ineffective (although even
today, it is a bit dependent on the debug info). Second, it would
require a good amount of testsuite tweaks to make tests accept these
suddenly absolute paths.
This new patch is a slight variation of that: it adds a new field called
"filename_for_id" in struct symtab and struct subfile, next to the
existing filename field. The goal is to separate the internal ids used
for finding objects from the names used for presentation. This field is
used for identifying subfiles, symtabs and macro_source_files
internally. For DWARF symtabs, this new field is meant to contain the
"most complete possible" path, as discussed above. So for a given file,
it must always be in the same form, everywhere. The existing
symtab::filename field remains the one used for printing to the user, so
there shouldn't be any change in how paths are printed.
Changes in the core symtab files are:
- Add "name_for_id" and "filename_for_id" fields to "struct subfile"
and "struct symtab", next to existing "name" and "filename" fields.
- Make buildsym_compunit::buildsym_compunit and
buildsym_compunit::start_subfile accept a "name_for_id" parameter
next to the existing "name" ones.
- Make buildsym_compunit::start_subfile use "name_for_id" for looking
up existing subfiles. This is the key thing for making calls
to start_subfile for the main source file look up the existing
subfile successfully, and avoid relying on
watch_main_source_file_lossage.
- Make sal_macro_scope pass "filename_for_id", rather than "filename",
to macro_lookup_inclusion. This is the key thing to making the
lookup work and macro printing work.
Changes in the DWARF files are:
- Make line_header::file_file_name return the "most complete possible"
name. The only pre-existing user of this method is the macro code,
to give the macro_source_file objects their name. And we now want
them to have this "most complete possible" name, which will match the
corresponding symtab's "filename_for_id".
- Make dwarf2_cu::start_compunit_symtab pass the "most complete
possible" name for the main symtab's "filename_for_id". In this
context, where the info comes from the compilation unit's DW_AT_name
/ DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if
DW_AT_name is not already absolute.
- Change dwarf2_start_subfile to build a name_for_id for the subfile
being started. The simplest way is to re-use
line_header::file_file_name, since the callers always have a
file_entry handy. This ensures that it will get the exact same path
representation as the macro code does, for the same file (since it
also uses line_header::file_file_name).
- Update calls to allocate_symtab to pass the "name_for_id" from the
subfile.
Tests exercising all this are added by the following patch.
Of all the cases I tried, the only one I found that ends up relying on
watch_main_source_file_lossage is the following one:
$ clang --version
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang ./test.c -g3 -O0 -gdwarf-4
$ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1" a.out
...
[symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
[symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c)
[symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile
As we can see, there are two forms used for "test.c", one with a "." and
one without. This comes from the fact that the compilation unit DIE
contains:
DW_AT_name ("test.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
without a ".", and the line table for that file contains:
include_directories[ 1] = "."
file_names[ 1]:
name: "test.c"
dir_index: 1
When assembling the filename from that entry, we get a ".".
It is a bit unexpected that the main filename resulting from the line
table header does not match exactly the name in the compilation unit.
For instance, gcc uses "./test.c" for the DW_AT_name, which gives
identical paths in the compilation unit and in the line table header.
Similarly, with DWARF 5:
$ clang ./test.c -g3 -O0 -gdwarf-5
clang create two entries that refer to the same file but are of in a different
form.
include_directories[ 0] = "/home/simark/build/binutils-gdb-one-target/gdb"
include_directories[ 1] = "."
file_names[ 0]:
name: "test.c"
dir_index: 0
file_names[ 1]:
name: "test.c"
dir_index: 1
The first file name produces a path without a "." while the second does.
This is not caught by watch_main_source_file_lossage, because of
dwarf_decode_lines that creates a symtab for each file entry in the line
table. It therefore appears as "non-empty" to
watch_main_source_file_lossage. This results in two symtabs:
(gdb) maintenance info symtabs
{ objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00)
{ ((struct compunit_symtab *) 0x62100011aca0)
debugformat DWARF 5
producer clang version 13.0.1
name test.c
dirname /home/simark/build/binutils-gdb-one-target/gdb
blockvector ((struct blockvector *) 0x621000129ec0)
user ((struct compunit_symtab *) (null))
{ symtab test.c ((struct symtab *) 0x62100011ad20)
fullname (null)
linetable ((struct linetable *) 0x0)
}
{ symtab ./test.c ((struct symtab *) 0x62100011ad60)
fullname (null)
linetable ((struct linetable *) 0x621000129ef0)
}
}
}
I am not sure what is the consequence of this, but this is also what
happens before my patch, so I think its acceptable to leave it as-is.
To handle these two cases nicely, I think we will need a function that
removes the unnecessary "." from path names, something that can be done
later.
Finally, I made a change in find_file_and_directory is necessary to
avoid breaking test
gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42
Without that change, we would get:
(gdb) info source
Current source file is /dir/d/dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
whereas the expected result is:
(gdb) info source
Current source file is dw2-compdir-oldgcc42.S
Compilation directory is /dir/d
This test was added here:
https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html
Long story short, GCC <= 4.2 apparently had a bug where it would
generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S")
and no DW_AT_comp_dir. The line table has one entry with filename
"dw2-compdir-oldgcc42.S", which refers to directory 0. Directory 0
normally refers to the compilation unit's comp dir, but it is
non-existent in this case.
This caused some symtab lookup problems, and to work around them, some
workaround was added, which today reads as:
if (res.get_comp_dir () == nullptr
&& producer_is_gcc_lt_4_3 (cu)
&& res.get_name () != nullptr
&& IS_ABSOLUTE_PATH (res.get_name ()))
res.set_comp_dir (ldirname (res.get_name ()));
Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432
It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no
DW_AT_comp_dir and DW_AT_name is absolute.
Prior to my patch, a subfile would get created with filename
"/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get
created with filename "dw2-compdir-oldgcc42.S" from the line table's
file table. Then watch_main_source_file_lossage would kick in and merge
them, keeping only the "dw2-compdir-oldgcc42.S" one:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S)
[symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile
And so "info source" would show "dw2-compdir-oldgcc42.S" as the
filename.
With my patch applied, but without the change in
find_file_and_directory, both DW_AT_name and the line table would try to
start a subfile with the same filename_for_id, and there was no need for
watch_main_source_file_lossage - which is what we want:
[symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
[symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
[symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming
from DW_AT_name, gets created first, it wins, and the symtab ends up
with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows
"/dir/d/dw2-compdir-oldgcc42.S" and the test breaks.
This is not wrong per-se, after all DW_AT_name is
"/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the
current source file as "/dir/d/dw2-compdir-oldgcc42.S". If you compile
a file passing "/an/absolute/path.c", DW_AT_name typically contains (at
least with GCC) "/an/absolute/path.c" and GDB tells you that the source
file is "/an/absolute/path.c". But we can also keep the existing
behavior fairly easily with a little change in find_file_and_directory.
When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now
modify the name to just keep the file part. The result is coherent with
what compilers do when you compile a file by just passing its filename
("gcc path.c -g"):
DW_AT_name ("path.c")
DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
With this change, filename_for_id is still the full name,
"/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile /
symtab (what ends up shown by "info source") is just
"dw2-compdir-oldgcc42.S", and that makes the test happy.
Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307
2022-07-29 00:34:47 +08:00
|
|
|
name, comp_dir, name_for_id, lang (), low_pc));
|
2021-05-18 04:16:06 +08:00
|
|
|
|
|
|
|
list_in_scope = get_builder ()->get_file_symbols ();
|
|
|
|
|
2021-11-23 09:57:42 +08:00
|
|
|
/* DWARF versions are restricted to [2, 5], thanks to the check in
|
|
|
|
read_comp_unit_head. */
|
|
|
|
gdb_assert (this->header.version >= 2 && this->header.version <= 5);
|
|
|
|
static const char *debugformat_strings[] = {
|
|
|
|
"DWARF 2",
|
|
|
|
"DWARF 3",
|
|
|
|
"DWARF 4",
|
|
|
|
"DWARF 5",
|
|
|
|
};
|
|
|
|
const char *debugformat = debugformat_strings[this->header.version - 2];
|
|
|
|
|
|
|
|
get_builder ()->record_debugformat (debugformat);
|
2021-05-18 04:16:06 +08:00
|
|
|
get_builder ()->record_producer (producer);
|
|
|
|
|
|
|
|
processing_has_namespace_info = false;
|
|
|
|
|
|
|
|
return get_builder ()->get_compunit_symtab ();
|
|
|
|
}
|
|
|
|
|
|
|
|
/* See read.h. */
|
|
|
|
|
|
|
|
struct type *
|
|
|
|
dwarf2_cu::addr_type () const
|
|
|
|
{
|
|
|
|
struct objfile *objfile = this->per_objfile->objfile;
|
2023-03-12 00:39:58 +08:00
|
|
|
struct type *void_type = builtin_type (objfile)->builtin_void;
|
2021-05-18 04:16:06 +08:00
|
|
|
struct type *addr_type = lookup_pointer_type (void_type);
|
|
|
|
int addr_size = this->per_cu->addr_size ();
|
|
|
|
|
2022-09-21 23:05:21 +08:00
|
|
|
if (addr_type->length () == addr_size)
|
2021-05-18 04:16:06 +08:00
|
|
|
return addr_type;
|
|
|
|
|
|
|
|
addr_type = addr_sized_int_type (addr_type->is_unsigned ());
|
|
|
|
return addr_type;
|
|
|
|
}
|
2021-05-18 04:16:06 +08:00
|
|
|
|
|
|
|
/* A hashtab traversal function that marks the dependent CUs. */
|
|
|
|
|
|
|
|
static int
|
|
|
|
dwarf2_mark_helper (void **slot, void *data)
|
|
|
|
{
|
|
|
|
dwarf2_per_cu_data *per_cu = (dwarf2_per_cu_data *) *slot;
|
|
|
|
dwarf2_per_objfile *per_objfile = (dwarf2_per_objfile *) data;
|
|
|
|
dwarf2_cu *cu = per_objfile->get_cu (per_cu);
|
|
|
|
|
|
|
|
/* cu->m_dependencies references may not yet have been ever read if
|
|
|
|
QUIT aborts reading of the chain. As such dependencies remain
|
|
|
|
valid it is not much useful to track and undo them during QUIT
|
|
|
|
cleanups. */
|
|
|
|
if (cu != nullptr)
|
|
|
|
cu->mark ();
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* See dwarf2/cu.h. */
|
|
|
|
|
|
|
|
void
|
|
|
|
dwarf2_cu::mark ()
|
|
|
|
{
|
|
|
|
if (!m_mark)
|
|
|
|
{
|
|
|
|
m_mark = true;
|
|
|
|
if (m_dependencies != nullptr)
|
|
|
|
htab_traverse (m_dependencies, dwarf2_mark_helper, per_objfile);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* See dwarf2/cu.h. */
|
|
|
|
|
|
|
|
void
|
|
|
|
dwarf2_cu::add_dependence (struct dwarf2_per_cu_data *ref_per_cu)
|
|
|
|
{
|
|
|
|
void **slot;
|
|
|
|
|
|
|
|
if (m_dependencies == nullptr)
|
|
|
|
m_dependencies
|
|
|
|
= htab_create_alloc_ex (5, htab_hash_pointer, htab_eq_pointer,
|
|
|
|
NULL, &comp_unit_obstack,
|
|
|
|
hashtab_obstack_allocate,
|
|
|
|
dummy_obstack_deallocate);
|
|
|
|
|
|
|
|
slot = htab_find_slot (m_dependencies, ref_per_cu, INSERT);
|
|
|
|
if (*slot == nullptr)
|
|
|
|
*slot = ref_per_cu;
|
|
|
|
}
|
[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again
This is another attempt at fixing the problem described in commit 4cf88725da1
"[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder()", which was
reverted in commit 3db19b2d724.
First off, some context.
A DWARF CU can be viewed as a symbol table: toplevel children of a CU DIE
represent symbol table entries for that CU. Furthermore, there is a
hierarchy: a symbol table entry such as a function itself has a symbol table
containing parameters and local variables.
The dwarf reader maintains a notion of current symbol table (that is: the
symbol table a new symbol needs to be entered into) in dwarf2_cu member
list_in_scope.
A problem then presents itself when reading inter-CU references:
- a new symbol read from a CU B needs to be entered into the symbol table of
another CU A.
- the notion of current symbol table is tracked on a per-CU basis.
This is addressed in inherit_abstract_dies by temporarily overwriting the
list_in_scope for CU B with the one for CU A.
The current symbol table is one aspect of the current dwarf reader context
that is tracked, but there are more, f.i. ones that are tracked via the
dwarf2_cu member m_builder, f.i. m_builder->m_local_using_directives.
A similar problem exists in relation to inter-CU references, but a different
solution was chosen:
- to keep track of an ancestor field in dwarf2_cu, which is updated
when traversing inter-CU references, and
- to use the ancestor field in dwarf2_cu::get_builder to return the m_builder
in scope.
There is no actual concept of a CU having an ancestor, it just marks the most
recent CU from which a CU was inter-CU-referenced. Consequently, when
following inter-CU references from a CU A to another CU B and back to CU A,
the ancestors form a cycle, which causes dwarf2_cu::get_builder to hang or
segfault, as reported in PR26327.
ISTM that the ancestor implementation is confusing and fragile, and should
go. Furthermore, it seems that keeping track of the m_builder in scope can be
handled simply with a per-objfile variable.
Fix the hang / segfault by:
- keeping track of the m_builder in scope using a new variable
per_obj->sym_cu, and
- using it in dwarf2_cu::get_builder.
Tested on x86_64-linux (openSUSE Leap 15.2), no regressions for config:
- using default gcc version 7.5.0
(with 5 unexpected FAILs)
- gcc 10.3.0 and target board
unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects
(with 1000 unexpected FAILs)
gdb/ChangeLog:
2021-06-16 Tom de Vries <tdevries@suse.de>
PR symtab/26327
* dwarf2/cu.h (dwarf2_cu::ancestor): Remove.
(dwarf2_cu::get_builder): Declare and move ...
* dwarf2/cu.c (dwarf2_cu::get_builder): ... here. Use sym_cu instead
of ancestor. Assert return value is non-null.
* dwarf2/read.c (read_file_scope): Set per_objfile->sym_cu.
(follow_die_offset, follow_die_sig_1): Remove setting of ancestor.
(dwarf2_per_objfile): Add sym_cu field.
2021-06-16 18:44:30 +08:00
|
|
|
|
|
|
|
/* See dwarf2/cu.h. */
|
|
|
|
|
|
|
|
buildsym_compunit *
|
|
|
|
dwarf2_cu::get_builder ()
|
|
|
|
{
|
|
|
|
/* If this CU has a builder associated with it, use that. */
|
|
|
|
if (m_builder != nullptr)
|
|
|
|
return m_builder.get ();
|
|
|
|
|
|
|
|
if (per_objfile->sym_cu != nullptr)
|
|
|
|
return per_objfile->sym_cu->m_builder.get ();
|
|
|
|
|
|
|
|
gdb_assert_not_reached ("");
|
|
|
|
}
|