dwarf2/read.h includes cooked-index.h, but it doesn't need to. This
patch removes the inclusion from this header, and adds one to
index-write.c to make up for the absence.
I noticed a few more style issues in commit 8b9c08edda ("[gdb/symtab] Add
name_of_main and language_of_main to the DWARF index"), after checking it
with gcc's check_GNU_style.{sh,py}.
Fix these.
Build on x86_64-linux.
The recent change to record the DWARF language in the per-CU data
yielded a race warning in my testing:
ThreadSanitizer: data race ../../binutils-gdb/gdb/dwarf2/read.c:21779 in prepare_one_comp_unit
This patch fixes the bug by applying the same style of fix that was
done for the ordinary (gdb) language.
I wonder if this code could be improved. Requiring an atomic for the
language in particular seems unfortunate, as it is often consulted
during index finalization. However, I haven't investigated this.
Regression tested on x86-64 Fedora 38.
Reviewed-by: Tom de Vries <tdevries@suse.de>
Post-commit review pointed out a few style issues in commit 8b9c08edda
("[gdb/symtab] Add name_of_main and language_of_main to the DWARF index").
Fix these.
Tested on x86_64-linux.
Reported-By: Tom Tromey <tom@tromey.com>
Approved-By: Tom Tromey <tom@tromey.com>
This patch adds a new section to the DWARF index containing the name
and the language of the main function symbol, gathered from
`cooked_index::get_main`, if available. Currently, for lack of a better name,
this section is called the "shortcut table". The way this name is both saved and
applied upon an index being loaded in mirrors how it is done in
`cooked_index_functions`, more specifically, the full name of the main function
symbol is saved and `set_objfile_main_name` is used to apply it after it is
loaded.
The main use case for this patch is in improving startup times when dealing with
large binaries. Currently, when an index is used, GDB has to expand symtabs
until it finds out what the language of the main function symbol is. For some
large executables, this may take a considerable amount of time to complete,
slowing down startup. This patch bypasses that operation by having both the name
and language of the main function symbol be provided ahead of time by the index.
In my testing (a binary with about 1.8GB worth of DWARF data) this change brings
startup time down from about 34 seconds to about 1.5 seconds.
When testing the patch with target board cc-with-gdb-index, test-case
gdb.fortran/nested-funcs-2.exp starts failing, but this is due to a
pre-existing issue, filed as PR symtab/30946.
Tested on x86_64-linux, with target board unix and cc-with-gdb-index.
PR symtab/24549
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24549
Approved-By: Tom de Vries <tdevries@suse.de>
There are two methods to factor out type information in a dwarf4 executable:
- use -fdebug-info-types to generate type units in a .debug_types section, and
- use dwz to create partial units.
The dwz method has an extra benefit: it also allows to factor out information
between executables into a newly created .dwz file, pointed to by a
.gnu_debugaltlink section.
There is nothing prohibiting a .gnu_debugaltlink file to contain a
.debug_types section.
It's just not generated by dwz or any other tool atm, and consequently gdb has
no support for it. Enhancement PR symtab/30838 is open about the lack of
support.
Make the current situation explicit by emitting a dwarf error:
...
(gdb) file struct-with-sig-2^M
Reading symbols from struct-with-sig-2...^M
Dwarf Error: .debug_types section not supported in dwz file^M
...
and add an assert in write_gdbindex:
...
+ /* See enhancement PR symtab/30838. */
+ gdb_assert (!(per_cu->is_dwz && per_cu->is_debug_types));
...
to clarify why we can use:
...
data_buf &cu_list = (per_cu->is_debug_types
? types_cu_list
: per_cu->is_dwz ? dwz_cu_list : objfile_cu_list);
...
The test-case is a modified copy from gdb.dwarf2/struct-with-sig.exp, so it
keeps the copyright years range.
Tested on x86_64-linux.
Tested-By: Guinevere Larsen <blarsen@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30838
Add a unit test which checks that write_gdb_index_1 will throw
an error when the size of the file would exceed the maximum value
capable of being represented by 'offset_type'.
The unit test fails on 32-bit systems due to wrapping overflow. Fix this by
changing the type of total_len in write_gdbindex_1 from size_t to uint64_t.
Tested on x86_64-linux.
Co-Authored-By: Kevin Buettner <kevinb@redhat.com>
Approved-by: Kevin Buettner <kevinb@redhat.com>
The header in a .gdb_index section uses 32-bit unsigned offsets to
refer to other areas of the section. Thus, there is a size limit of
2^32-1 which is currently unaccounted for by GDB's code for outputting
these sections.
At the moment, when GDB creates an overly large section, it will exit
abnormally due to an internal error, which is caused by a failed
assert in assert_file_size, which in turn is called from
write_gdbindex_1, both of which are in gdb/dwarf2/index-write.c.
This is what happens when that assert fails:
$ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/"
Reading symbols from ./libgraph_tool_inference.so...
No executable file now.
Discard symbol table from `libgraph_tool_inference.so'? (y or n) n
Not confirmed.
../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x55fddb4d78b0 gdb_internal_backtrace_1
../../gdb/bt-utils.c:122
0x55fddb4d78b0 _Z22gdb_internal_backtracev
../../gdb/bt-utils.c:168
0x55fddb98b5d4 internal_vproblem
../../gdb/utils.c:396
0x55fddb98b8de _Z15internal_verrorPKciS0_P13__va_list_tag
../../gdb/utils.c:476
0x55fddbb71654 _Z18internal_error_locPKciS0_z
../../gdbsupport/errors.cc:58
0x55fddb5a0f23 assert_file_size
../../gdb/dwarf2/index-write.c:1069
0x55fddb5a1ee0 assert_file_size
/usr/include/c++/13/bits/stl_iterator.h:1158
0x55fddb5a1ee0 write_gdbindex_1
../../gdb/dwarf2/index-write.c:1119
0x55fddb5a51be write_gdbindex
../../gdb/dwarf2/index-write.c:1273
[...]
---------------------
../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed.
This problem was encountered while building the python-graph-tool
package on Fedora. The Fedora bugzilla bug can be found here:
https://bugzilla.redhat.com/show_bug.cgi?id=1773651
This commit prevents the internal error from occurring by calling error()
when the file size exceeds 2^32-1.
Using a gdb built with this commit, I now see this behavior instead:
$ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/"
Reading symbols from ./libgraph_tool_inference.so...
No executable file now.
Discard symbol table from `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so'? (y or n) n
Not confirmed.
Error while writing index for `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so': gdb-index maximum file size of 4294967295 exceeded
(gdb)
I wish I could provide a test case, but due to the sizes of both the
input and output files, I think that testing resources would be
strained or exceeded in many environments.
My testing on Fedora 38 shows no regressions.
Approved-by: Tom Tromey <tom@tromey.com>
With test-case gdb.ada/same_enum.exp and target board dwarf4-gdb-index we run
into:
...
(gdb) print red^M
No definition of "red" in current context.^M
(gdb) FAIL: gdb.ada/same_enum.exp: print red
...
[ This is a regression since commit 844a72efbc ("Simplify gdb_index writing"),
so this is broken in gdb 12 and 13. ]
The easiest way to see what's going wrong is with readelf. We have in section
.gdb_index:
...
[7194] pck__red:
2 [static, variable]
3 [static, variable]
...
which points to the CUs 2 and 3 in the CU list (shown using "2" and "3"), but
should be pointing to the TUs 2 and 3 in the TU list (shown using "T2" and
"T3").
Fix this by removing the counter / types_counter distinction in
write_gdbindex, such that we get the expected:
...
[7194] pck__red:
T2 [static, variable]
T3 [static, variable]
...
[ While reading write_gdbindex I noticed a few oddities related to dwz
handling, I've filed PR30829 about this. ]
Tested on x86_64-linux.
Approved-By: Tom Tromey <tom@tromey.com>
PR symtab/30827
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30827
When running test-case gdb.python/py-symbol.exp with target board
cc-with-gdb-index, we run into:
...
(gdb) python print (len (gdb.lookup_static_symbols ('rr')))^M
1^M
(gdb) FAIL: gdb.python/py-symbol.exp: print (len (gdb.lookup_static_symbols ('rr')))
...
[ Note that the test-case contains rr in both py-symtab.c:
...
static int __attribute__ ((used)) rr = 42; /* line of rr */
...
and py-symtab-2.c:
...
static int __attribute__ ((used)) rr = 99; /* line of other rr */
... ]
This passes with gdb-12-branch, and fails with gdb-13-branch.
AFAIU the current code in symtab_index_entry::minimize makes the assumption
that it's fine to store only one copy of rr in the gdb-index, because
"print rr" will only ever print one, and always the same.
But that fails to recognize that gdb supports gdb.lookup_static_symbols, which
returns a list of variables rather than the first one.
In other words, the current approach breaks feature parity between cooked
index and gdb-index.
Note btw that also debug-names has both instances:
...
[ 5] #00597969 rr:
<4> DW_TAG_variable DW_IDX_compile_unit=3 DW_IDX_GNU_internal=1
<4> DW_TAG_variable DW_IDX_compile_unit=4 DW_IDX_GNU_internal=1
...
Fix this in symtab_index_entry::minimize, by not deduplicating variables.
Tested on x86_64-linux, with target boards unix and cc-with-gdb-index.
Reviewed-by: Kevin Buettner <kevinb@redhat.com>
PR symtab/30720
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30720
When running test-case gdb.dwarf2/pr13961.exp with target-board
cc-with-debug-names, I run into:
...
Running gdb.dwarf2/pr13961.exp ...
gdb compile failed, gdb/dwarf2/index-write.c:1305: internal-error: \
write_debug_names: Assertion `counter == per_bfd->all_units.size ()' failed.
...
This is a regression since commit 542a33e348 ("Only use the per-BFD object to
write a DWARF index"), which did:
...
- gdb_assert (counter == per_objfile->per_bfd->all_comp_units.size ());
+ gdb_assert (counter == per_bfd->all_units.size ());
...
Fix this by reverting to using all_comp_units:
...
gdb_assert (counter == per_bfd->all_comp_units.size ());
...
Tested on x86_64-linux, using target boards unix and cc-with-debug-names.
PR symtab/30741
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30741
I filed PR29179 about the following FAIL in test-case
gdb.ada/O2_float_param.exp with target board cc-with-gdb-index:
...
(gdb) break increment^M
Function "increment" not defined.^M
Make breakpoint pending on future shared library load? (y or [n]) n^M
(gdb) FAIL: gdb.ada/O2_float_param.exp: scenario=all: gdb_breakpoint: \
set breakpoint at increment
...
The FAIL was a regression since commit 2cf349be0e ("Do not put linkage names
into .gdb_index").
Before that commit we had:
...
$ readelf -w foo > READELF
$ grep callee.*increment READELF
[1568] callee__increment: 5 [global, function]
[3115] callee.increment: 5 [global, function]
...
but after only:
...
$ grep callee.*increment READELF
[3115] callee.increment: 5 [global, function]
...
The regression was fixed by commit 67e83a0dee ("Fix regression in
c-linkage-name.exp with gdb index"), which got us again:
...
$ grep callee.*increment READELF
[1568] callee__increment: 5 [global, function]
[3115] callee.increment: 5 [global, function]
...
The commit however did not claim that particular PR. A subsequent commit,
commit 5fea979432 ("Improve Ada support in .gdb_index") did claim to fix it,
together with commit dd05fc7071 ("Change .gdb_index de-duplication
implementation").
The commit 5fea979432 contained the following addition in write_cooked_index:
...
+ if (entry->per_cu->lang () == language_ada)
+ {
+ /* We want to ensure that the Ada main function's name
+ appears verbatim in the index. However, this name will
+ be of the form "_ada_mumble", and will be rewritten by
+ ada_decode. So, recognize it specially here and add it
+ to the index by hand. */
+ if (entry->tag == DW_TAG_subprogram
+ && strcmp (main_for_ada, name) == 0)
+ {
+ /* Leave it alone. */
+ }
+ else
+ {
+ /* In order for the index to work when read back into
+ gdb, it has to use the encoded name, with any
+ suffixes stripped. */
+ std::string encoded = ada_encode (name, false);
+ name = obstack_strdup (&symtab->m_string_obstack,
+ encoded.c_str ());
+ }
+ }
...
The code contains some special handling related to the Ada main function, so
let's look at that one: foo. Before commit 67e83a0dee we have:
...
$ grep foo.*function READELF
[3733] foo: 7 [global, function]
...
and after:
...
$ grep foo.*function READELF
[2738] _ada_foo: 7 [global, function]
[3733] foo: 7 [global, function]
...
so that looks identical to the callee.increment case.
At commit 5fea979432, we have slightly different index numbers:
...
$ grep foo.*function READELF
[1658] foo: 7 [global, function]
[2738] _ada_foo: 7 [global, function]
...
but otherwise the same result.
If we disable the special handling of the Ada main function like so:
...
- if (entry->tag == DW_TAG_subprogram
+ if (false && entry->tag == DW_TAG_subprogram
...
we still have the exact same result because:
...
(gdb) p main_for_ada
$1 = 0x352e6a0 "_ada_foo"
...
and ada_encode ("_ada_foo", false) == "_ada_foo".
The comment seems to be copied from debug_names::insert, which does indeed use
ada_decode, while the code in write_cooked_index uses ada_encode instead.
Remove the superfluous special handling of Ada main in write_cooked_index.
Tested on x86_64-linux, with target boards unix and cc-with-gdb-index.
Approved-By: Tom Tromey <tom@tromey.com>
The DWARF index does not need access to the objfile or per-objfile
objects when writing -- it's entirely based on the objfile-independent
per-BFD data.
This patch implements this idea by changing the entire API to only be
passed the per-BFD object. This simplifies some lifetime reasoning
for the next patch.
This patch removes some code that ensures that the BFD came from a
file. It seems to me that checking for the existence of a build-id is
good enough for the index cache.
Users of addrmap::find and addrmap::foreach that have a const addrmap
should ideally receive const pointers to objects, to indicate they
should not be modified. However, users that have a non-const addrmap
should still receive a non-const pointer.
To achieve this, without adding more virtual methods, make the existing
find and foreach virtual methods private and prefix them with "do_".
Add small non-const and const wrappers for find and foreach.
Obviously, the const can be cast away, but if using static_cast
instead of C-style casts, then the compiler won't let you cast
the const away. I changed all the callers of addrmap::find and
addrmap::foreach I could find to make them use static_cast.
Change-Id: Ia8e69d022564f80d961413658fe6068edc71a094
This commit is the result of running the gdb/copyright.py script,
which automated the update of the copyright year range for all
source files managed by the GDB project to be updated to include
year 2023.
PR symtab/29694 points out a regression caused by the new DWARF
scanner when the cc-with-gdb-index target board is used.
What happens here is that an older version of gdb will make an index
describing the "A" type as:
[737] A: 1 [global, type]
whereas the new gdb says:
[1008] A: 0 [global, type]
Here the old one is correct because the A in CU 0 is just a
declaration without a size:
<1><45>: Abbrev Number: 10 (DW_TAG_structure_type)
<46> DW_AT_name : A
<48> DW_AT_declaration : 1
<48> DW_AT_sibling : <0x6d>
This patch fixes the problem by introducing the idea of a "type
declaration". I think gdb still needs to recurse into these types,
searching for methods, but by marking the type itself as a
declaration, gdb can skip this type during lookups and when writing
the index.
Regression tested on x86-64 using the cc-with-gdb-index board.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29694
While investigating PR symtab/29179, I found that one Ada test failed
because, although a certain symbol was present in the index, with the
new DWARF reader it pointed to a different CU than was chosen by
earlier versions of gdb.
This patch changes how symbol de-duplication is done, deferring the
process until the entire symbol table has been constructed. This way,
it's possible to always choose the lower-numbered CU among duplicates,
which is how gdb (implicitly) previously worked.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29179
The cooked index work changed how .gdb_index is constructed, and in
the process broke .gdb_index support. This is PR symtab/29179.
This patch partially fixes the problem. It arranges for Ada names to
be encoded in the form expected by the index code. In particular,
linkage names for Ada are emitted, including the "main" name; names
are Ada-encoded; and names are no longer case-folded, something that
prevented operator names from round-tripping correctly.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29179
c-linkage-name.exp started failing with the gdb-index target board due
to an earlier patch. The problem here is that some linkage names must
be in the index -- but, based on inspection, not C++ linkage names.
This patch updates the code to exclude only these.
The class debug_names has two 'insert' overloads, but only one of them
is ever called externally, and it simply forwards to the other
implementation. It seems cleaner to me to have a single method, so
this patch merges the two.
Add all_comp_units/all_type_units views on all_units.
Having the views allows us to:
- easily get the number of CUs or TUs in all_units, and
- easily access the nth CU or TU.
This minimizes the use of tu_stats.nr_tus.
Tested on x86_64-linux.
This changes struct objfile to use a gdb_bfd_ref_ptr. In addition to
removing some manual memory management, this fixes a use-after-free
that was introduced by the registry rewrite series. The issue there
was that, in some cases, registry shutdown could refer to memory that
had already been freed. This help fix the bug by delaying the
destruction of the BFD reference (and thus the per-bfd object) until
after the registry has been shut down.
With gdb build with -fsanitize=thread and test-case
gdb.dwarf2/dw4-sig-types.exp and target board cc-with-dwz-m we run into a data
race between:
...
Write of size 4 at 0x7b2800002268 by thread T4:^M
#0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6236 \
(gdb+0x82f525)^M
...
and this read:
...
Previous read of size 4 at 0x7b2800002268 by thread T1:^M
#0 dwarf2_find_containing_comp_unit gdb/dwarf2/read.c:23444 \
(gdb+0x86e22e)^M
...
In other words, between this write:
...
this_cu->length = cu->header.get_length ();
...
and this read:
...
&& mid_cu->sect_off + mid_cu->length > sect_off))
...
of per_cu->length.
Fix this similar to the per_cu->dwarf_version case, by only setting it if
needed, and otherwise verifying that the same value is used. [ Note that the
same code is already present in the other cutu_reader constructor. ]
Move this logic into into a member function set_length to make sure it's used
consistenly, and make the field private in order to enforce access through the
member functions, and rename it to m_length.
This exposes (running test-case gdb.dwarf2/fission-reread.exp) that in
fill_in_sig_entry_from_dwo_entry, the m_length field is overwritten.
For now, allow for that exception.
While we're at it, make sure that the length is set before read.
Tested on x86_64-linux.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29344
The dwarf2_per_cu_data fields lang and unit_type both have a dont-know
initial value (respectively language_unknown and (dwarf_unit_type)0), which
allows us to add certain checks, f.i. checking that that a field is not read
before written.
Add get/set member functions for the two fields as a convenient location to
add such checks, make the fields private to enforce using the member
functions, and add the m_ prefix.
Tested on x86_64-linux.
My patches yesterday to unify the DWARF index base classes had a bug
-- namely, I did the wholesale dynamic_cast-to-static_cast too hastily
and introduced a crash. This can be seen by trying to add an index to
a file that has an index, or by running a test like gdb-index-cxx.exp
using the cc-with-debug-names.exp target board.
This patch fixes the crash by introducing a new virtual method and
removing some of the static casts.
This de-duplicates variables and types in .gdb_index, making the new
index closer to what gdb generated before the new DWARF scanner
series. Spot-checking the resulting index for gdb itself, it seems
that the new scanner picks up some extra symbols not detected by the
old one. I tested both the new and old versions of gdb on both new
and old versions of the index, and startup time in all cases is
roughly the same (it's worth noting that, for gdb itself, the index no
longer provides any benefit over the DWARF scanner). So, I think this
fixes the size issue with the new index writer.
Regression tested on x86-64 Fedora 34.
At AdaCore, we run the internal gdb test suite in several modes,
including one using the .debug_names index. This caught a regression
caused by the new DWARF indexer.
First, the psymtabs-based .debug_names generator was completely wrong.
However, to avoid making the rewrite series even bigger (fixing the
writer will also require rewriting the .debug_names reader), it
attempted to preserve the weirdness.
However, this was not done properly. For example the old writer did
this:
- case STRUCT_DOMAIN:
- return DW_TAG_structure_type;
The new code, instead, simply preserves the actual DWARF tag -- but
this makes future lookups fail, because the .debug_names reader only
looks for DW_TAG_structure_type.
This patch attempts to revert to the old behavior in the writer.
The DWARF index code currently uses 'stat' to see if an objfile
represents a real file. However, I think it's more correct to check
OBJF_NOT_FILENAME instead.
Regression tested on x86-64 Fedora 34.
The dwarf2_per_bfd object has a separate field for each possible kind
of index. Until an earlier patch in this series, two of these were
even derived from a common base class, but still had separate slots.
This patch unifies all the index fields using the common base class
that was introduced earlier in this series. This makes it more
obvious that only a single index can be active at a time, and also
removes some code from dwarf2_initialize_objfile.
This updates the .gdb_index writer to work with the new DWARF scanner.
The .debug_names writer is deferred to another patch, to make review
simpler.
This introduces a small hack to psyms_seen_size, but is
inconsequential because this function will be deleted in a subsequent
patch.
This updates the DWARF index writing code to make the addrmap-writing
a bit more generic. Now, it can handle multiple maps, and it can work
using the maps generated by the new indexer.
Note that the new addrmap_index_data::using_index field will be
deleted in a future patch, when the rest of the DWARF psymtab code is
removed.
To support the removal of partial symtabs from the DWARF index writer,
this makes a small change to have write_address_map accept the address
map as a parameter, rather than assuming it always comes from the
per-BFD object.
In order to change the DWARF index writer to avoid partial symtabs,
this patch changes the key type in psym_index_map (and renames that
type as well). Using the dwarf2_per_cu_data as the key makes it
simpler to reuse this code with the new indexer.
This commit brings all the changes made by running gdb/copyright.py
as per GDB's Start of New Year Procedure.
For the avoidance of doubt, all changes in this commits were
performed by the script.
With current trunk and target board cc-with-debug-names we have:
...
(gdb) file dw2-ranges-psym^M
Reading symbols from dw2-ranges-psym...^M
warning: Section .debug_names in dw2-ranges-psym has abbreviation_table of \
size 1 vs. written as 28, ignoring .debug_names.^M
(gdb) set complaints 0^M
(gdb) FAIL: gdb.dwarf2/dw2-ranges-psym.exp: No complaints
...
The executable has 8 compilation units:
...
$ readelf -wi dw2-ranges-psym | grep @
Compilation Unit @ offset 0x0:
Compilation Unit @ offset 0x2e:
Compilation Unit @ offset 0xa5:
Compilation Unit @ offset 0xc7:
Compilation Unit @ offset 0xd2:
Compilation Unit @ offset 0x145:
Compilation Unit @ offset 0x150:
Compilation Unit @ offset 0x308:
...
of which the ones at 0xc7 and 0x145 are dummy CUs (that is, they do not
contain a single DIE), which were added by recent commit 5ef670d81f
"[gdb/testsuite] Add dummy start and end CUs in dwarf assembly".
The .debug_names section contains this CU table:
...
[ 0] 0x0
[ 1] 0x2e
[ 2] 0xa5
[ 3] 0xd2
[ 4] 0x150
[ 5] 0x308
[ 6] 0x1
[ 7] 0x0
...
The last two entries are incorrect, and the entries for the dummy CUs are
missing.
The last two entries are incorrect because here in write_debug_names we write
the dimension of the CU list as 8:
...
/* comp_unit_count - The number of CUs in the CU list. */
header.append_uint (4, dwarf5_byte_order,
per_objfile->per_bfd->all_comp_units.size ()
- per_objfile->per_bfd->tu_stats.nr_tus);
...
while the actual dimension of the CU list is 6.
The discrepancy is caused by this code which skips the dummy CUs:
...
for (int i = 0; i < per_objfile->per_bfd->all_comp_units.size (); ++i)
{
...
/* CU of a shared file from 'dwz -m' may be unused by this main
file. It may be referenced from a local scope but in such
case it does not need to be present in .debug_names. */
if (psymtab == NULL)
continue;
...
because they have a null partial symtab.
We can fix this by writing the actual dimension of the CU list, but that still
leaves the dummy CUs out of the CU list. The purpose of having these is to
delimit the end of preceding CUs.
So, fix this by:
- removing the code that skips the dummy CUs (note that the same change
was done for .gdb_index in commit efba5c2319 '[gdb/symtab] Handle PU
without import in "save gdb-index"'.
- verifying that all units are represented in the CU/TU lists
- using the actual CU list size when writing the dimension of the CU list
(and likewise for the TU list).
Tested on x86_64-linux with native and target board cc-with-debug-names.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28261
When comparing the sizes of the index files generated for shlib
outputs/gdb.dwarf2/dw2-zero-range/shr1.sl, I noticed a large difference
between .debug_names:
...
$ gdb -q -batch $shlib -ex "save gdb-index -dwarf-5 ."
$ du -b -h shr1.sl.debug_names shr1.sl.debug_str
61 shr1.sl.debug_names
0 shr1.sl.debug_str
...
and .gdb_index:
...
$ gdb -q -batch $shlib -ex "save gdb-index ."
$ du -b -h shr1.sl.gdb-index
8.2K shr1.sl.gdb-index
...
The problem is that the .gdb_index contains a non-empty symbol table with only
empty entries.
Fix this by making the symbol table empty, such that we have instead:
...
$ du -b -h shr1.sl.gdb-index
184 shr1.sl.gdb-index
...
Tested on x86_64-linux.
This changes the .debug_names writer to find the TU indices in the
main loop over all CUs and TUs. (An earlier patch applied this same
treatment to the .gdb_index writer.)
write_gdbindex writes the CUs first, then walks the signatured type
hash table to write out the TUs. However, now that CUs and TUs are
unified in the DWARF reader, it's simpler to handle both of these in
the same loop.
My recent patch to unify CUs and TUs introduced an oddity in
write_gdbindex. Here, we pass 'i' to recursively_write_psymbols, but
we must instead pass 'counter', to handle the situation where a TU is
mixed in with the CUs.
I am not sure a test case for this is possible. I think it can only
happen when using DWARF 5, where a TU appears in .debug_info.
However, this situation is already not handled correctly by
.gdb_index. I filed a bug about this.