Commit Graph

72 Commits

Author SHA1 Message Date
Torbjörn SVENSSON
998a4f589d libctf: Sanitize error types for PR 30836
Made sure there is no implicit conversion between signed and unsigned
return value for functions setting the ctf_errno value.
An example of the problem is that in ctf_member_next, the "offset" value
is either 0L or (ctf_id_t)-1L, but it should have been 0L or -1L.
The issue was discovered while building a 64 bit ld binary to be
executed on the Windows platform.
Example object file that demonstrates the issue is attached in the PR.

libctf/
	Affected functions adjusted.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
2023-10-17 17:31:20 +02:00
Nick Alcock
d7474051e8 libctf: propagate errors from parents correctly
CTF dicts have per-dict errno values: as with other errno values these
are set on error and left unchanged on success.  This means that all
errors *must* set the CTF errno: if a call leaves it unchanged, the
caller is apt to find a previous, lingering error and misinterpret
it as the real error.

There are many places in libctf where we carry out operations on parent
dicts as a result of carrying out other user-requested operations on
child dicts (e.g. looking up information on a pointer to a type will
look up the type as well: the pointer might well be in a child and the
type it's a pointer to in the parent).  Those operations on the parent
might fail; if they do, the error must be correctly reflected on the
child that the user-visible operation was carried out on.  In many
places this was not happening.

So, audit and fix all those places.  Add tests for as many of those
cases as possible so they don't regress.

libctf/
	* ctf-create.c (ctf_add_slice): Use the original dict.
	* ctf-lookup.c (ctf_lookup_variable): Propagate errors.
	(ctf_lookup_symbol_idx): Likewise.
	* ctf-types.c (ctf_member_next): Likewise.
	(ctf_type_resolve_unsliced): Likewise.
	(ctf_type_aname): Likewise.
	(ctf_member_info): Likewise.
	(ctf_type_rvisit): Likewise.
	(ctf_func_type_info): Set the error on the right dict.
	(ctf_type_encoding): Use the original dict.
	* testsuite/libctf-writable/error-propagation.*: New test.
2023-04-08 16:07:17 +01:00
Alan Modra
d87bef3a7b Update year range in copyright notice of binutils files
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
2023-01-01 21:50:11 +10:30
Alan Modra
a2c5833233 Update year range in copyright notice of binutils files
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.

The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
2022-01-02 12:04:28 +10:30
Nick Alcock
49da556c65 libctf, include: support an alternative encoding for nonrepresentable types
Before now, types that could not be encoded in CTF were represented as
references to type ID 0, which does not itself appear in the
dictionary. This choice is annoying in several ways, principally that it
forces generators and consumers of CTF to grow special cases for types
that are referenced in valid dicts but don't appear.

Allow an alternative representation (which will become the only
representation in format v4) whereby nonrepresentable types are encoded
as actual types with kind CTF_K_UNKNOWN (an already-existing kind
theoretically but not in practice used for padding, with value 0).
This is backward-compatible, because CTF_K_UNKNOWN was not used anywhere
before now: it was used in old-format function symtypetabs, but these
were never emitted by any compiler and the code to handle them in libctf
likely never worked and was removed last year, in favour of new-format
symtypetabs that contain only type IDs, not type kinds.

In order to link this type, we need an API addition to let us add types
of unknown kind to the dict: we let them optionally have names so that
GCC can emit many different unknown types and those types with identical
names will be deduplicated together.  There are also small tweaks to the
deduplicator to actually dedup such types, to let opening of dicts with
unknown types with names work, to return the ECTF_NONREPRESENTABLE error
on resolution of such types (like ID 0), and to print their names as
something useful but not a valid C identifier, mostly for the sake of
the dumper.

Tests added in the next commit.

include/ChangeLog
2021-05-06  Nick Alcock  <nick.alcock@oracle.com>

	* ctf.h (CTF_K_UNKNOWN): Document that it can be used for
	nonrepresentable types, not just padding.
	* ctf-api.h (ctf_add_unknown): New.

libctf/ChangeLog
2021-05-06  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-open.c (init_types): Unknown types may have names.
	* ctf-types.c (ctf_type_resolve): CTF_K_UNKNOWN is as
	non-representable as type ID 0.
	(ctf_type_aname): Print unknown types.
	* ctf-dedup.c (ctf_dedup_hash_type): Do not early-exit for
	CTF_K_UNKNOWN types: they have real hash values now.
	(ctf_dedup_rwalk_one_output_mapping): Treat CTF_K_UNKNOWN types
	like other types with no referents: call the callback and do not
	skip them.
	(ctf_dedup_emit_type): Emit via...
	* ctf-create.c (ctf_add_unknown): ... this new function.
	* libctf.ver (LIBCTF_1.2): Add it.
2021-05-06 09:30:59 +01:00
Nick Alcock
08c428aff4 libctf: eliminate dtd_u, part 5: structs / unions
Eliminate the dynamic member storage for structs and unions as we have
for other dynamic types.  This is much like the previous enum
elimination, except that structs and unions are the only types for which
a full-sized ctf_type_t might be needed.  Up to now, this decision has
been made in the individual ctf_add_{struct,union}_sized functions and
duplicated in ctf_add_member_offset.  The vlen machinery lets us
simplify this, always allocating a ctf_lmember_t and setting the
dtd_data's ctt_size to CTF_LSIZE_SENT: we figure out whether this is
really justified and (almost always) repack things down into a
ctf_stype_t at ctf_serialize time.

This allows us to eliminate the dynamic member paths from the iterators and
query functions in ctf-types.c in favour of always using the large-structure
vlen stuff for dynamic types (the diff is ugly but that's just because of the
volume of reindentation this calls for).  This also means the large-structure
vlen stuff gets more heavily tested, which is nice because it was an almost
totally unused code path before now (it only kicked in for structures of size
>4GiB, and how often do you see those?)

The only extra complexity here is ctf_add_type.  Back in the days of the
nondeduplicating linker this was called a ridiculous number of times for
countless identical copies of structures: eschewing the repeated lookups of the
dtd in ctf_add_member_offset and adding the members directly saved an amazing
amount of time.  Now the nondeduplicating linker is gone, this is extreme
overoptimization: we can rip out the direct addition and use ctf_member_next and
ctf_add_member_offset, just like ctf_dedup_emit does.

We augment a ctf_add_type test to try adding a self-referential struct, the only
thing the ctf_add_type part of this change really perturbs.

This completes the elimination of dtd_u.

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dtdef_t) <dtu_members>: Remove.
	<dtd_u>: Likewise.
	(ctf_dmdef_t): Remove.
	(struct ctf_next) <u.ctn_dmd>: Remove.
	* ctf-create.c (INITIAL_VLEN): New, more-or-less arbitrary initial
	vlen size.
	(ctf_add_enum): Use it.
	(ctf_dtd_delete): Do not free the (removed) dmd; remove string
	refs from the vlen on struct deletion.
	(ctf_add_struct_sized): Populate the vlen: do it by hand if
	promoting forwards.  Always populate the full-size
	lsizehi/lsizelo members.
	(ctf_add_union_sized): Likewise.
	(ctf_add_member_offset): Set up the vlen rather than the dmd.
	Expand it as needed, repointing string refs via
	ctf_str_move_pending. Add the member names as pending strings.
	Always populate the full-size lsizehi/lsizelo members.
	(membadd): Remove, folding back into...
	(ctf_add_type_internal): ... here, adding via an ordinary
	ctf_add_struct_sized and _next iteration rather than doing
	everything by hand.
	* ctf-serialize.c (ctf_copy_smembers): Remove this...
	(ctf_copy_lmembers): ... and this...
	(ctf_emit_type_sect): ... folding into here. Figure out if a
	ctf_stype_t is needed here, not in ctf_add_*_sized.
	(ctf_type_sect_size): Figure out the ctf_stype_t stuff the same
	way here.
	* ctf-types.c (ctf_member_next): Remove the dmd path and always
	use the vlen.  Force large-structure usage for dynamic types.
	(ctf_type_align): Likewise.
	(ctf_member_info): Likewise.
	(ctf_type_rvisit): Likewise.
	* testsuite/libctf-regression/type-add-unnamed-struct-ctf.c: Add a
	self-referential type to this test.
	* testsuite/libctf-regression/type-add-unnamed-struct.c: Adjusted
	accordingly.
	* testsuite/libctf-regression/type-add-unnamed-struct.lk: Likewise.
2021-03-18 12:40:40 +00:00
Nick Alcock
77d724a7ec libctf: eliminate dtd_u, part 4: enums
This is the first tricky one, the first complex multi-entry vlen
containing strings.  To handle this in vlen form, we have to handle
pending refs moving around on realloc.

We grow vlen regions using a new ctf_grow_vlen function, and iterate
through the existing enums every time a grow happens, telling the string
machinery the distance between the old and new vlen region and letting
it adjust the pending refs accordingly.  (This avoids traversing all
outstanding refs to find the refs that need adjusting, at the cost of
having to traverse one enum: an obvious major performance win.)

Addition of enums themselves (and also structs/unions later) is a bit
trickier than earlier forms, because the type might be being promoted
from a forward, and forwards have no vlen: so we have to spot that and
create it if needed.

Serialization of enums simplifies down to just telling the string
machinery about the string refs; all the enum type-lookup code loses all
its dynamic member lookup complexity entirely.

A new test is added that iterates over (and gets values of) an enum with
enough members to force a round of vlen growth.

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dtdef_t) <dtd_vlen_alloc>: New.
	(ctf_str_move_pending): Declare.
	* ctf-string.c (ctf_str_add_ref_internal): Fix error return.
	(ctf_str_move_pending): New.
	* ctf-create.c (ctf_grow_vlen): New.
	(ctf_dtd_delete): Zero out the vlen_alloc after free.  Free the
	vlen later: iterate over it and free enum name refs first.
	(ctf_add_generic): Populate dtd_vlen_alloc from vlen.
	(ctf_add_enum): populate the vlen; do it by hand if promoting
	forwards.
	(ctf_add_enumerator): Set up the vlen rather than the dmd.  Expand
	it as needed, repointing string refs via ctf_str_move_pending. Add
	the enumerand names as pending strings.
	* ctf-serialize.c (ctf_copy_emembers): Remove.
	(ctf_emit_type_sect): Copy the vlen into place and ref the
	strings.
	* ctf-types.c (ctf_enum_next): The dynamic portion now uses
	the same code as the non-dynamic.
	(ctf_enum_name): Likewise.
	(ctf_enum_value): Likewise.
	* testsuite/libctf-lookup/enum-many-ctf.c: New test.
	* testsuite/libctf-lookup/enum-many.lk: New test.
2021-03-18 12:40:40 +00:00
Nick Alcock
986e9e3aa0 libctf: do not corrupt strings across ctf_serialize
The preceding change revealed a new bug: the string table is sorted for
better compression, so repeated serialization with type (or member)
additions in the middle can move strings around.  But every
serialization flushes the set of refs (the memory locations that are
automatically updated with a final string offset when the strtab is
updated), so if we are not to have string offsets go stale, we must do
all ref additions within the serialization code (which walks the
complete set of types and symbols anyway). Unfortunately, we were adding
one ref in another place: the type name in the dynamic type definitions,
which has a ref added to it by ctf_add_generic.

So adding a type, serializing (via, say, one of the ctf_write
functions), adding another type with a name that sorts earlier, and
serializing again will corrupt the name of the first type because it no
longer had a ref pointing to its dtd entry's name when its string offset
was shifted later in the strtab to mae way for the other type.

To ensure that we don't miss strings, we also maintain a set of *pending
refs* that will be added later (during serialization), and remove
entries from that set when the ref is finally added.  We always use
ctf_str_add_pending outside ctf-serialize.c, ensure that ctf_serialize
adds all strtab offsets as refs (even those in the dtds) on every
serialization, and mandate that no refs are live on entry to
ctf_serialize and that all pending refs are gone before strtab
finalization.  (Of necessity ctf_serialize has to traverse all strtab
offsets in the dtds in order to serialize them, so adding them as refs
at the same time is easy.)

(Note that we still can't erase unused atoms when we roll back, though
we can erase unused refs: members and enums are still not removed by
rollbacks and might reference strings added after the snapshot.)

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-hash.c (ctf_dynset_elements): New.
	* ctf-impl.h (ctf_dynset_elements): Declare it.
	(ctf_str_add_pending): Likewise.
	(ctf_dict_t) <ctf_str_pending_ref>: New, set of refs that must be
	added during serialization.
	* ctf-string.c (ctf_str_create_atoms): Initialize it.
	(CTF_STR_ADD_REF): New flag.
	(CTF_STR_MAKE_PROVISIONAL): Likewise.
	(CTF_STR_PENDING_REF): Likewise.
	(ctf_str_add_ref_internal): Take a flags word rather than int
	params.  Populate, and clear out, ctf_str_pending_ref.
	(ctf_str_add): Adjust accordingly.
	(ctf_str_add_external): Likewise.
	(ctf_str_add_pending): New.
	(ctf_str_remove_ref): Also remove the potential ref if it is a
	pending ref.
	* ctf-serialize.c (ctf_serialize): Prohibit addition of strings
	with ctf_str_add_ref before serialization.  Ensure that the
	ctf_str_pending_ref set is empty before strtab finalization.
	(ctf_emit_type_sect): Add a ref to the ctt_name.
	* ctf-create.c (ctf_add_generic): Add the ctt_name as a pending
	ref.
	* testsuite/libctf-writable/reserialize-strtab-corruption.*: New test.
2021-03-18 12:40:40 +00:00
Nick Alcock
81982d20fa libctf: eliminate dtd_u, part 3: functions
One more member vanishes from the dtd_u, leaving only the member for
struct/union/enum members.

There's not much to do here, since as of commit afd78bd6f0 we use
the same representation (type sizes, etc) in the dtu_argv as we will
use in the final vlen, with one exception: the vlen has alignment
padding, and the dtu_argv did not.  Simplify things by adding suitable
padding in both cases.

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_argv>: Remove.
	* ctf-create.c (ctf_dtd_delete): No longer free it.
	(ctf_add_function): Use the dtd_vlen, not dtu_argv.  Properly align.
	* ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen.
	* ctf-types.c (ctf_func_type_info): Just use the vlen.
	(ctf_func_type_args): Likewise.
2021-03-18 12:40:40 +00:00
Nick Alcock
534444b1ee libctf: eliminate dtd_u, part 2: arrays
This is even simpler than ints, floats and slices, with the only extra
complication being the need to manually transfer the array parameter in
the rarely-used function ctf_set_array.  (Arrays are unique in libctf in
that they can be modified post facto, not just created and appended to.
I'm not sure why they got this exemption, but it's easy to maintain.)

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_arr>: Remove.
	* ctf-create.c (ctf_add_array): Use the dtd_vlen, not dtu_arr.
	(ctf_set_array): Likewise.
	* ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen.
	* ctf-types.c (ctf_array_info): Just use the vlen.
2021-03-18 12:40:40 +00:00
Nick Alcock
7879dd88ef libctf: eliminate dtd_u, part 1: int/float/slice
This series eliminates a lot of special-case code to handle dynamic
types (types added to writable dicts and not yet serialized).

Historically, when such types have variable-length data in their final
CTF representations, libctf has always worked by adding such types to a
special union (ctf_dtdef_t.dtd_u) in the dynamic type definition
structure, then picking the members out of this structure at
serialization time and packing them into their final form.

This has the advantage that the ctf_add_* code doesn't need to know
anything about the final CTF representation, but the significant
disadvantage that all code that looks up types in any way needs two code
paths, one for dynamic types, one for all others.  Historically libctf
"handled" this by not supporting most type lookups on dynamic types at
all until ctf_update was called to do a complete reserialization of the
entire dict (it didn't emit an error, it just emitted wrong results).
Since commit 676c3ecbad, which eliminated ctf_update in favour of
the internal-only ctf_serialize function, all the type-lookup paths
grew an extra branch to handle dynamic types.

We can eliminate this branch again by dropping the dtd_u stuff and
simply writing out the vlen in (close to) its final form at ctf_add_*
time: type lookup for types using this approach is then identical for
types in writable dicts and types that are in read-only ones, and
serialization is also simplified (we just need to write out the vlen
we already created).

The only complexity lies in type kinds for which multiple
vlen representations are valid depending on properties of the type,
e.g. structures.  But we can start simple, adjusting ints, floats,
and slices to work this way, and leaving everything else as is.

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_enc>: Remove.
	<dtd_u.dtu_slice>: Likewise.
	<dtd_vlen>: New.
	* ctf-create.c (ctf_add_generic): Perhaps allocate it.  All
	callers adjusted.
	(ctf_dtd_delete): Free it.
	(ctf_add_slice): Use the dtd_vlen, not dtu_enc.
	(ctf_add_encoded): Likewise.  Assert that this must be an int or
	float.
	* ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen.
	* ctf-dedup.c (ctf_dedup_rhash_type): Use the dtd_vlen, not
	dtu_slice.
	* ctf-types.c (ctf_type_reference): Likewise.
	(ctf_type_encoding): Remove most dynamic-type-specific code: just
	get the vlen from the right place.  Report failure to look up the
	underlying type's encoding.
2021-03-18 12:40:36 +00:00
Nick Alcock
bf4c3185a5 libctf: split serialization and file writeout into its own file
The code to serialize CTF dicts just gets bigger and bigger as the
dictionary's complexity grows: adding symtypetabs almost doubled it on
its own.  It's long past time to split this out into its own source
file, accompanied by the functions that do the actual writeout.

This leaves ctf-create.c populated exclusively by functions related to
actual writable dict creation (ctf_add_*, ctf_create etc), and leaves
both files a much more reasonable size.

libctf/ChangeLog
2021-03-18  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (symtypetab_delete_nonstatic_vars): Move
	into ctf-serialize.c.
	(ctf_symtab_skippable): Likewise.
	(CTF_SYMTYPETAB_EMIT_FUNCTION): Likewise.
	(CTF_SYMTYPETAB_EMIT_PAD): Likewise.
	(CTF_SYMTYPETAB_FORCE_INDEXED): Likewise.
	(symtypetab_density): Likewise.
	(emit_symtypetab): Likewise.
	(emit_symtypetab_index): Likewise.
	(ctf_copy_smembers): Likewise.
	(ctf_copy_lmembers): Likewise.
	(ctf_copy_emembers): Likewise.
	(ctf_sort_var): Likewise.
	(ctf_serialize): Likewise.
	(ctf_gzwrite): Likewise.
	(ctf_compress_write): Likewise.
	(ctf_write_mem): Likewise.
	(ctf_write): Likewise.
	* ctf-serialize.c: New file.
	* Makefile.am (libctf_nobfd_la_SOURCES): Add it.
	* Makefile.in: Regenerate.
2021-03-18 12:37:53 +00:00
Nick Alcock
211bcd0133 bfd, ld, libctf: skip zero-refcount strings in CTF string reporting
This is a tricky one.  BFD, on the linker's behalf, reports symbols to
libctf via the ctf_new_symbol and ctf_new_dynsym callbacks, which
ultimately call ctf_link_add_linker_symbol.  But while this happens
after strtab offsets are finalized, it happens before the .dynstr is
actually laid out, so we can't iterate over it at this stage and
it is not clear what the reported symbols are actually called.  So
a second callback, examine_strtab, is called after the .dynstr is
finalized, which calls ctf_link_add_strtab and ultimately leads
to ldelf_ctf_strtab_iter_cb being called back repeatedly until the
offsets of every string in the .dynstr is passed to libctf.

libctf can then use this to get symbol names out of the input (which
usually stores symbol types in the form of a name -> type mapping at
this stage) and extract the types of those symbols, feeding them back
into their final form as a 1:1 association with the real symtab's
STT_OBJ and STT_FUNC symbols (with a few skipped, see
ctf_symtab_skippable).

This representation is compact, but has one problem: if libctf somehow
gets confused about the st_type of a symbol, it'll stick an entry into
the function symtypetab when it should put it into the object
symtypetab, or vice versa, and *every symbol from that one on* will have
the wrong CTF type because it's actually looking up the type for a
different symbol.

And we have just such a bug.  ctf_link_add_strtab was not taking the
refcounts of strings into consideration, so even strings that had been
eliminated from the strtab by virtue of being in objects eliminated via
--as-needed etc were being reported.  This is harmful because it can
lead to multiple strings with the same apparent offset, and if the last
duplicate to be reported relates to an eliminated symbol, we look up the
wrong symbol from the input and gets its type wrong: if it's unlucky and
the eliminated symbol is also of the wrong st_type, we will end up with
a corrupted symtypetab.

Thankfully the wrong-st_type case is already diagnosed by a
this-can-never-happen paranoid warning:

  CTF warning: Symbol 61a added to CTF as a function but is of type 1

or the converse

 * CTF warning: Symbol a3 added to CTF as a data object but is of type 2

so at least we can tell when the corruption has spread to more than one
symbol's type.

Skipping zero-refcounted strings is easy: teach _bfd_elf_strtab_str to
skip them, and ldelf_ctf_strtab_iter_cb to loop over skipped strings
until it falls off the end or finds one that isn't skipped.

bfd/ChangeLog
2021-03-02  Nick Alcock  <nick.alcock@oracle.com>

	* elf-strtab.c (_bfd_elf_strtab_str): Skip strings with zero refcount.

ld/ChangeLog
2021-03-02  Nick Alcock  <nick.alcock@oracle.com>

	* ldelfgen.c (ldelf_ctf_strtab_iter_cb): Skip zero-refcount strings.

libctf/ChangeLog
2021-03-02  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (symtypetab_density): Report the symbol name as
	well as index in the name != object error; note the likely
	consequences.
	* ctf-link.c (ctf_link_shuffle_syms): Report the symbol index
	as well as name.
2021-03-02 15:10:10 +00:00
Nick Alcock
f5060e5633 libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs.  The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_emission_hashes is held.  However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function.  To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.

This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.

Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.

This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.

libctf/ChangeLog
2021-03-02  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
	by the nondeduplicating linker.
	(ctf_add_type_mapping): Removed, now static.
	(ctf_type_mapping): Likewise.
	(ctf_dedup_type_mapping): New.
	(ctf_dedup_t) <cd_input_nums>: New.
	* ctf-dedup.c (ctf_dedup_init): Populate it.
	(ctf_dedup_fini): Free it again.  Emphasise that this has to be
	the last thing called.
	(ctf_dedup): Populate it.
	(ctf_dedup_populate_type_mapping): Removed.
	(ctf_dedup_populate_type_mappings): Likewise.
	(ctf_dedup_emit): No longer call it.  No longer call
	ctf_dedup_fini either.
	(ctf_dedup_type_mapping): New.
	* ctf-link.c (ctf_unnamed_cuname): New.
	(ctf_create_per_cu): Arguments must be non-null now.
	(ctf_in_member_cb_arg): Removed.
	(ctf_link): No longer populate it.  No longer discard the
	mapping table.
	(ctf_link_deduplicating_one_symtypetab): Use
	ctf_dedup_type_mapping, not ctf_type_mapping.  Use
	ctf_unnamed_cuname.
	(ctf_link_one_variable): Likewise.  Pass in args individually: no
	longer a ctf_variable_iter callback.
	(empty_link_type_mapping): Removed.
	(ctf_link_deduplicating_variables): Use ctf_variable_next, not
	ctf_variable_iter.  No longer pack arguments to
	ctf_link_one_variable into a struct.
	(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
	all link phases are done.
	(ctf_link_deduplicating): Likewise.
	(ctf_link_intern_extern_string): Improve comment.
	(ctf_add_type_mapping): Migrate...
	(ctf_type_mapping): ... these functions...
	* ctf-create.c (ctf_add_type_mapping): ... here...
	(ctf_type_mapping): ... and make static, for the sole use of
	ctf_add_type.
2021-03-02 15:10:07 +00:00
Nick Alcock
5dacd11ddc libctf: fix uninitialized variable in symbol serialization error handling
We declare a variable to hold errors at two scopes, and then initialize
the inner one and jump to a scope where only the outer one is in scope.

The consequences are minor: only the version of the error message
printed in the debugging stream is impacted.

libctf/ChangeLog
2021-01-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (ctf_serialize): Fix shadowing.
2021-02-04 16:01:53 +00:00
Nick Alcock
caa170493e libctf: prohibit nameless ints, floats, typedefs and forwards
Now that "anonymous typedef nodes" have been extirpated, we can mandate
that things that have names in C must have names in CTF too.  (Unlike
the no-forwards embarrassment, the deduplicator does nothing special
with names: types that have names in C will have the same name in CTF.
So we can assume that the CTF rules and the C rules are the same.)

include/ChangeLog
2021-01-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ECTF_NONAME): New.
	(ECTF_NERR): Adjust.

libctf/ChangeLog
2021-01-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (ctf_add_encoded): Add check for non-empty name.
	(ctf_add_forward): Likewise.
	(ctf_add_typedef): Likewise.
2021-02-04 16:01:53 +00:00
Nick Alcock
78f28b89e8 libctf: rip out dead code handling typedefs with no name
There is special code in libctf to handle typedefs with no name, which
the code calls "anonymous typedef nodes".

These monsters are obviously not something C programs can include: the
whole point of a ttypedef is to introduce a new name.  Looking back at
the history of DWARF in GCC, the only thing (outside C++ anonymous
namespaces) which can generate a DW_TAG_typedef without a DW_AT_name is
obsolete code to handle the long-removed -feliminate-dwarf2-dups option.
Looking at OpenSolaris, typedef nodes with no name couldn't be generated
by the DWARF->CTF converter at all (and its deduplicator barfed on
them): the only reason for the existence of this code is a special case
working around a peculiarity of stabs whereby types could sometimes be
referenced before they were introduced.

We don't need to carry code in libctf to handle special cases in an
obsolete OpenSolaris converter (that yields a format that isn't readable
by libctf anyway).  So drop it.

libctf/ChangeLog
2021-01-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-open.c (init_types): Rip out code to check anonymous typedef
	nodes.
	* ctf-create.c (ctf_add_reftype): Likewise.
	* ctf-lookup.c (refresh_pptrtab): Likewise.
2021-02-04 16:01:53 +00:00
Nick Alcock
35a01a0454 libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations).  We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.

Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.

Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported).  But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.

We clearly need to rethink this.  Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us.  So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported: use the presence or absence of this flag to
determine whether to emit unindexed symtabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.

(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)

There's another wrinkle, though.  It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation.  Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present.  So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed).  ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case.  (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)

Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.

Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)

include/ChangeLog
2021-01-26  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.

ld/ChangeLog
2021-01-26  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
	when appropriate.

libctf/ChangeLog
2021-01-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.c (_libctf_nonnull_): Add parameters.
	(LCTF_LINKING): New flag.
	(ctf_dict_t) <ctf_link_flags>: Mention it.
	* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
	(ctf_write): Likewise, including in child dictionaries.
	(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
	are no reported symbols.
	* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
	the variable has been reported as a symbol by the linker.
	(symtypetab_skippable): Mention relationship between SYMFP and the
	flags.
	(symtypetab_density): Adjust nonnullity.  Exit early if no symbols
	were reported and force-indexing is off (i.e., we are doing a
	final link).
	(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
	indexed, sorted symtypetab (and allow SYMFP to be NULL in this
	case).  Turn sorting off if this is a non-final link.  Only delete
	nonstatic vars if we are filtering symbols and the linker has
	reported some.
	* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
	New test of variable and symtypetab section population when
	ld -r is used.
	* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
	Likewise, when ld of an executable is used.
	* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
	Likewise, when ld -shared alone is used.
	* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
	Lookup programs for the above.
	* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
	test, testing survival of symbols across ctf_write paths.
	* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
	nonshared, suppressing linking of the SOURCE with -shared.
2021-02-04 16:01:53 +00:00
Nick Alcock
26503e2f5e libctf, create: fix ctf_type_add of structs with unnamed members
Our recent commit to support unnamed structure members better ditched
the old ctf_member_iter iterator body in favour of ctf_member_next.
However, these functions treat unnamed structure members differently:
ctf_member_iter just returned whatever the internal representation
contained, while ctf_member_next took care to always return "" rather
than sometimes returning "" and sometimes NULL depending on whether the
dict was dynamic (a product of ctf_create) or not (a product of
ctf_open).  After this commit, ctf_member_iter did the same.

It was always a bug for external callers not to treat a "" return from
these functions as if it were NULL, so only buggy callers could be
affected -- but one of those buggy callers was ctf_add_type, which
assumed that it could just take whatever name was returned from
ctf_member_iter and slam it directly into the internal representation of
a dynamic dict -- which expects NULL for unnamed members, not "".  The
net effect of all of this is that taking a struct containing unnamed
members and ctf_add_type'ing it into a dynamic dict produced a dict
whose unnamed members were inaccessible to ctf_member_info (though if
you wrote that dict out and then ctf_open'ed it, they would magically
reappear again).

Compensate for this by suitably transforming a "" name into NULL in the
internal representation, as should have been done all along.

libctf/ChangeLog
2021-01-19  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (membadd): Transform ""-named members into
	NULL-named ones.
	* testsuite/libctf-regression/type-add-unnamed-struct*: New test.
2021-01-19 12:45:20 +00:00
Nick Alcock
abe4ca69a1 libctf: fix lookups of pointers by name in parent dicts
When you look up a type by name using ctf_lookup_by_name, in most cases
libctf can just strip off any qualifiers and look for the name, but for
pointer types this doesn't work, since the caller will want the pointer
type itself.  But pointer types are nameless, and while they cite the
types they point to, looking up a type by name requires a link going the
*other way*, from the type pointed to to the pointer type that points to
it.

libctf has always built this up at open time: ctf_ptrtab is an array of
type indexes pointing from the index of every type to the index of the
type that points to it.  But because it is built up at open time (and
because it uses type indexes and not type IDs) it is restricted to
working within a single dict and ignoring parent/child
relationships. This is normally invisible, unless you manage to get a
dict with a type in the parent but the only pointer to it in a child.
The ctf_ptrtab will not track this relationship, so lookups of this
pointer type by name will fail.  Since which type is in the parent and
which in the child is largely opaque to the user (which goes where is up
to the deduplicator, and it can and does reshuffle things to save
space), this leads to a very bad user experience, with an
obviously-visible pointer type which ctf_lookup_by_name claims doesn't
exist.

The fix is to have another array, ctf_pptrtab, which is populated in
child dicts: like the parent's ctf_ptrtab, it has one element per type
in the parent, but is all zeroes except for those types which are
pointed to by types in the child: so it maps parent dict indices to
child dict indices.  The array is grown, and new child types scanned,
whenever a lookup happens and new types have been added to the child
since the last time a lookup happened that might need the pptrtab.
(So for non-writable dicts, this only happens once, since new types
cannot be added to non-writable dicts at all.)

Since this introduces new complexity (involving updating only part of
the ctf_pptrtab) which is only seen when a writable dict is in use, we
introduce a new libctf-writable testsuite that contains lookup tests
with no corresponding CTF-containing .c files (which can thus be run
even on platforms with no .ctf-section support in the linker yet), and
add a test to check that creation of pointers in children to types in
parents and a following lookup by name works as expected.  The non-
writable case is tested in a new libctf-regression testsuite which is
used to track now-fixed outright bugs in libctf.

libctf/ChangeLog
2021-01-05  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_pptrtab>: New.
	<ctf_pptrtab_len>: New.
	<ctf_pptrtab_typemax>: New.
	* ctf-create.c (ctf_serialize): Update accordingly.
	(ctf_add_reftype): Note that we don't need to update pptrtab here,
	despite updating ptrtab.
	* ctf-open.c (ctf_dict_close): Destroy the pptrtab.
	(ctf_import): Likewise.
	(ctf_import_unref): Likewise.
	* ctf-lookup.c (grow_pptrtab): New.
	(refresh_pptrtab): New, update a pptrtab.
	(ctf_lookup_by_name): Turn into a wrapper around (and rename to)...
	(ctf_lookup_by_name_internal): ... this: construct the pptrtab, and
	use it in addition to the parent's ptrtab when parent dicts are
	searched.
	* testsuite/libctf-regression/regression.exp: New testsuite for
	regression tests.
	* testsuite/libctf-regression/pptrtab*: New test.
	* testsuite/libctf-writable/writable.exp: New testsuite for tests of
	writable CTF dicts.
	* testsuite/libctf-writable/pptrtab*: New test.
2021-01-05 14:53:40 +00:00
Nick Alcock
ffeece6ac2 libctf, ld: prohibit getting the size or alignment of forwards
C allows you to do only a very few things with entities of incomplete
type (as opposed to pointers to them): make pointers to them and give
them cv-quals, roughly. In particular you can't sizeof them and you
can't get their alignment.

We cannot impose all the requirements the standard imposes on CTF users,
because the deduplicator can transform any structure type into a forward
for the purposes of breaking cycles: so CTF type graphs can easily
contain things like arrays of forward type (if you want to figure out
their size or alignment, you need to chase down the types this forward
might be a forward to in child TU dicts: we will soon add API functions
to make doing this much easier).

Nonetheless, it is still meaningless to ask for the size or alignment of
forwards: but libctf didn't prohibit this and returned nonsense from
internal implementation details when you asked (it returned the kind of
the pointed-to type as both the size and alignment, because forwards
reuse ctt_type as a type kind, and ctt_type and ctt_size overlap).  So
introduce a new error, ECTF_INCOMPLETE, which is returned when you try
to get the size or alignment of forwards: we also return it when you try
to do things that require libctf itself to get the size or alignment of
a forward, notably using a forward as an array index type (which C
should never do in any case) or adding forwards to structures without
specifying their offset explicitly.

The dumper will not emit size or alignment info for forwards any more.

(This should not be an API break since ctf_type_size and ctf_type_align
could both return errors before now: any code that isn't expecting error
returns is already potentially broken.)

include/ChangeLog
2021-01-05  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ECTF_INCOMPLETE): New.
	(ECTF_NERR): Adjust.

ld/ChangeLog
2021-01-05  Nick Alcock  <nick.alcock@oracle.com>

	* testsuite/ld-ctf/conflicting-cycle-1.parent.d: Adjust for dumper
	changes.
	* testsuite/ld-ctf/cross-tu-cyclic-conflicting.d: Likewise.
	* testsuite/ld-ctf/forward.c: New test...
	* testsuite/ld-ctf/forward.d: ... and results.

libctf/ChangeLog
2021-01-05  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-types.c (ctf_type_resolve): Improve comment.
	(ctf_type_size): Yield ECTF_INCOMPLETE when applied to forwards.
	Emit errors into the right dict.
	(ctf_type_align): Likewise.
	* ctf-create.c (ctf_add_member_offset): Yield ECTF_INCOMPLETE
	when adding a member without explicit offset when this member, or
	the previous member, is incomplete.
	* ctf-dump.c (ctf_dump_format_type): Do not try to print the size of
	forwards.
	(ctf_dump_member): Do not try to print their alignment.
2021-01-05 14:53:39 +00:00
Alan Modra
250d07de5c Update year range in copyright notice of binutils files 2021-01-01 10:31:05 +10:30
Nick Alcock
53651de80f libctf, include: support foreign-endianness symtabs with CTF
The CTF symbol lookup machinery added recently has one deficit: it
assumes the symtab is in the machine's native endianness.  This is
always true when the linker is writing out symtabs (because cross
linkers byteswap symbols only after libctf has been called on them), but
may be untrue in the cross case when the linker or another tool
(objdump, etc) is reading them.

Unfortunately the easy way to model this to the caller, as an endianness
field in the ctf_sect_t, is precluded because doing so would change the
size of the ctf_sect_t, which would be an ABI break.  So, instead, allow
the endianness of the symtab to be set after open time, by calling one
of the two new API functions ctf_symsect_endianness (for ctf_dict_t's)
or ctf_arc_symsect_endianness (for entire ctf_archive_t's).  libctf
calls these functions automatically for objects opened via any of the
BFD-aware mechanisms (ctf_bfdopen, ctf_bfdopen_ctfsect, ctf_fdopen,
ctf_open, or ctf_arc_open), but the various mechanisms that just take
raw ctf_sect_t's will assume the symtab is in native endianness and need
a later call to ctf_*symsect_endianness to adjust it if needed.  (This
call is basically free if the endianness is actually native: it only
costs anything if the symtab endianness was previously guessed wrong,
and there is a symtab, and we are using it directly rather than using
symtab indexing.)

Obviously, calling ctf_lookup_by_symbol or ctf_symbol_next before the
symtab endianness is correctly set will probably give wrong answers --
but you can set it at any time as long as it is before then.

include/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h: Style nit: remove () on function names in comments.
	(ctf_sect_t): Mention endianness concerns.
	(ctf_symsect_endianness): New declaration.
	(ctf_arc_symsect_endianness): Likewise.

libctf/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_symtab_little_endian>: New.
	(struct ctf_archive_internal) <ctfi_symsect_little_endian>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust for new field.
	* ctf-open.c (init_symtab): Note the semantics of repeated calls.
	(ctf_symsect_endianness): New.
	(ctf_bufopen_internal): Set ctf_symtab_little_endian suitably for
	the native endianness.
	(_Static_assert): Moved...
	(swap_thing): ... with this...
	* swap.h: ... to here.
	* ctf-util.c (ctf_elf32_to_link_sym): Use it, byteswapping the
	Elf32_Sym if the ctf_symtab_little_endian demands it.
	(ctf_elf64_to_link_sym): Likewise swap the Elf64_Sym if needed.
	* ctf-archive.c (ctf_arc_symsect_endianness): New, set the
	endianness of the symtab used by the dicts in an archive.
	(ctf_archive_iter_internal): Initialize to unknown (assumed native,
	do not call ctf_symsect_endianness).
	(ctf_dict_open_by_offset): Call ctf_symsect_endianness if need be.
	(ctf_dict_open_internal): Propagate the endianness down.
	(ctf_dict_open_sections): Likewise.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Get the endianness from the
	struct bfd and pass it down to the archive.
	* libctf.ver: Add ctf_symsect_endianness and
	ctf_arc_symsect_endianness.
2020-11-25 19:11:35 +00:00
Nick Alcock
8f235c90a2 libctf: error-handling fixes
libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (ctf_dtd_insert): Set ENOMEM on the dict if out of memory.
	(ctf_dvd_insert): Likewise.
	(ctf_add_function): Report ECTF_RDONLY if this dict is not writable.
	* ctf-subr.c (ctf_err_warn): Only debug-dump passed-in warnings if
	the passed-in error code is nonzero: the error on the dict for
	warnings may relate to a previous error.
2020-11-20 13:34:12 +00:00
Nick Alcock
1136c37971 libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types.  The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).

With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)

(Compatible) file format change:

The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf.  But
conveniently the compiler has never emitted this!  Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump.  (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)

So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.

This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types.  (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)

We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use.  A sufficiently new compiler will
always set this flag.  New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set.  If the flag
is not set on a dict being read in, new libctf will disregard the
function info section.  Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).

New API:

Symbol addition:
  ctf_add_func_sym: Add a symbol with a given name and type.  The
                    type must be of kind CTF_K_FUNCTION (a function
                    pointer).  Internally this adds a name -> type
                    mapping to the ctf_funchash in the ctf_dict.
  ctf_add_objt_sym: Add a symbol with a given name and type.  The type
                    kind can be anything, including function pointers.
		    This adds to ctf_objthash.

These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict.  Repeated relinks can add more symbols.

Variables that are also exposed as symbols are removed from the variable
section at serialization time.

CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically.  (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).

The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.

Iteration:
  ctf_symbol_next: Iterator which returns the types and names of symbols
                   one by one, either for function or data symbols.

This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.

(Compatible) changes in API:
  ctf_lookup_by_symbol: can now be called for object and function
                        symbols: never returns ECTF_NOTDATA (which is
			now not thrown by anything, but is kept for
                        compatibility and because it is a plausible
                        error that we might start throwing again at some
                        later date).

Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out.  This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_symbol_next): New.
	(ctf_add_objt_sym): Likewise.
	(ctf_add_func_sym): Likewise.
	* ctf.h: Document new function info section format.
	(CTF_F_NEWFUNCINFO): New.
	(CTF_F_IDXSORTED): New.
	(CTF_F_MAX): Adjust accordingly.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
	(_libctf_nonnull_): Likewise.
	(ctf_in_flight_dynsym_t): New.
	(ctf_dict_t) <ctf_funcidx_names>: Likewise.
	<ctf_objtidx_names>: Likewise.
	<ctf_nfuncidx>: Likewise.
	<ctf_nobjtidx>: Likewise.
	<ctf_funcidx_sxlate>: Likewise.
	<ctf_objtidx_sxlate>: Likewise.
	<ctf_objthash>: Likewise.
	<ctf_funchash>: Likewise.
	<ctf_dynsyms>: Likewise.
	<ctf_dynsymidx>: Likewise.
	<ctf_dynsymmax>: Likewise.
	<ctf_in_flight_dynsym>: Likewise.
	(struct ctf_next) <u.ctn_next>: Likewise.
	(ctf_symtab_skippable): New prototype.
	(ctf_add_funcobjt_sym): Likewise.
	(ctf_dynhash_sort_by_name): Likewise.
	(ctf_sym_to_elf64): Rename to...
	(ctf_elf32_to_link_sym): ... this, and...
	(ctf_elf64_to_link_sym): ... this.
	* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
	flag, and presence of index sections.  Refactor out
	ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them.  Use
	ctf_link_sym_t, not Elf64_Sym.  Skip initializing objt or func
	sxlate sections if corresponding index section is present.  Adjust
	for new func info section format.
	(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
	handling.  Report incorrect-length index sections.  Always do an
	init_symtab, even if there is no symtab section (there may be index
	sections still).
	(flip_objts): Adjust comment: func and objt sections are actually
	identical in structure now, no need to caveat.
	(ctf_dict_close):  Free newly-added data structures.
	* ctf-create.c (ctf_create): Initialize them.
	(ctf_symtab_skippable): New, refactored out of
	init_symtab, with st_nameidx_set check added.
	(ctf_add_funcobjt_sym): New, add a function or object symbol to the
	ctf_objthash or ctf_funchash, by name.
	(ctf_add_objt_sym): Call it.
	(ctf_add_func_sym): Likewise.
	(symtypetab_delete_nonstatic_vars): New, delete vars also present as
	data objects.
	(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
	this is a function emission, not a data object emission.
	(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
	pads for symbols with no type (only set for unindexed sections).
	(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
	always emit indexed.
	(symtypetab_density): New, figure out section sizes.
	(emit_symtypetab): New, emit a symtypetab.
	(emit_symtypetab_index): New, emit a symtypetab index.
	(ctf_serialize): Call them, emitting suitably sorted symtypetab
	sections and indexes.  Set suitable header flags.  Copy over new
	fields.
	* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
	order on symtypetab index sections.
	* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
	relating to code that was never committed.
	(ctf_link_one_variable): Improve variable name.
	(check_sym): New, symtypetab analogue of check_variable.
	(ctf_link_deduplicating_one_symtypetab): New.
	(ctf_link_deduplicating_syms): Likewise.
	(ctf_link_deduplicating): Call them.
	(ctf_link_deduplicating_per_cu): Note that we don't call them in
	this case (yet).
	(ctf_link_add_strtab): Set the error on the fp correctly.
	(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
	a linker symbol to the in-flight list.
	(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
	in-flight list into a mapping we can use, now its names are
	resolvable in the external strtab.
	* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
	external strtab offsets.
	(ctf_str_rollback): Adjust comment.
	(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
	writeout time...
	(ctf_str_add_external): ... to string addition time.
	* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
	(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
	<clik_names>: New member, a name table.
	(ctf_lookup_var): Adjust accordingly.
	(ctf_lookup_variable): Likewise.
	(ctf_lookup_by_id): Shuffle further up in the file.
	(ctf_symidx_sort_arg_cb): New, callback for...
	(sort_symidx_by_name): ... this new function to sort a symidx
	found to be unsorted (likely originating from the compiler).
	(ctf_symidx_sort): New, sort a symidx.
	(ctf_lookup_symbol_name): Support dynamic symbols with indexes
	provided by the linker.  Use ctf_link_sym_t, not Elf64_Sym.
	Check the parent if a child lookup fails.
	(ctf_lookup_by_symbol): Likewise.  Work for function symbols too.
	(ctf_symbol_next): New, iterate over symbols with types (without
	sorting).
	(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
	(ctf_try_lookup_indexed): New, attempt an indexed lookup.
	(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
	(ctf_func_args): Likewise.
	(ctf_get_dict): Move...
	* ctf-types.c (ctf_get_dict): ... here.
	* ctf-util.c (ctf_sym_to_elf64): Re-express as...
	(ctf_elf64_to_link_sym): ... this.  Add new st_symidx field, and
	st_nameidx_set (always 0, so st_nameidx can be ignored).  Look in
	the ELF strtab for names.
	(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
	(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
	* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
	ctf_add_func_sym.
2020-11-20 13:34:08 +00:00
Nick Alcock
3d16b64e28 bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.

The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names.  So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying the CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized.  The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.

Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy.  The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.

Some bits might be contentious.  The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen.  This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym.  We rename the
'apply_strsym' hooks to 'acquire_strings' instead, becuse they no longer
have anything to do with symbols.

There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in.  ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).

Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)

Tests forthcoming in a later commit in this series.

bfd/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* elflink.c (elf_finalize_dynstr): Call examine_strtab after
	dynstr finalization.
	(elf_link_swap_symbols_out): Don't call it here.  Call
	ctf_new_symbol before swap_symbol_out.
	(elf_link_output_extsym): Call ctf_new_dynsym before
	swap_symbol_out.
	(bfd_elf_final_link): Likewise.
	* elf.c (swap_out_syms): Pass in bfd_link_info.  Call
	ctf_new_symbol before swap_symbol_out.
	(_bfd_elf_compute_section_file_positions): Adjust.

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
	.symtab and .strtab.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* bfdlink.h (struct elf_sym_strtab): Replace with...
	(struct elf_internal_sym): ... this.
	(struct bfd_link_callbacks) <examine_strtab>: Take only a
	symstrtab argument.
	<ctf_new_symbol>: New.
	<ctf_new_dynsym>: Likewise.
	* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
	<st_nameidx>: Likewise.
	<st_nameidx_set>: Likewise.
	(ctf_link_iter_symbol_f): Removed.
	(ctf_link_shuffle_syms): Remove most parameters, just takes a
	ctf_dict_t now.
	(ctf_link_add_linker_symbol): New, split from
	ctf_link_shuffle_syms.
	* ctf.h (CTF_F_DYNSTR): New.
	(CTF_F_MAX): Adjust.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
	(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
	<syms>: Remove.
	<symcount>: Remove.
	<symstrtab>: Rename to...
	<strtab>: ... this.
	(ldelf_ctf_strtab_iter_cb): Adjust.
	(ldelf_ctf_symbols_iter_cb): Remove.
	(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
	symbol.
	(ldelf_examine_strtab_for_ctf): Rename to...
	(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
	portion and not symbols.
	* ldelfgen.h: Adjust declarations accordingly.
	* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
	(ldemul_acquire_strings_for_ctf): ... this.
	(ldemul_new_dynsym_for_ctf): New.
	* ldemul.h: Adjust declarations accordingly.
	* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
	(ldlang_ctf_acquire_strings): ... this.
	(ldlang_ctf_new_dynsym): New.
	(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
	the actual symbol shuffle.
	* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
	* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-link.c (ctf_link_shuffle_syms): Adjust.
	(ctf_link_add_linker_symbol): New, unimplemented stub.
	* libctf.ver: Add it.
	* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
	dicts.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
	symtab/strtab if not present, dynsym/dynstr otherwise.
	* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
	some arbitrary member of a CTF archive.
	* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 13:34:07 +00:00
Nick Alcock
139633c307 libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file".  Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago.  So the term "CTF file"
refers to something that is never a file!  This is at best confusing.

The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.

So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead.  Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.

Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
	* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_section_as_ctf): Likewise.  Use ctf_dict_close, not
	ctf_file_close.

gdb/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
	(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_file_t): Rename to...
	(ctf_dict_t): ... this.  Keep ctf_file_t around for compatibility.
	(struct ctf_file): Likewise rename to...
	(struct ctf_dict): ... this.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ... this, keeping compatibility function.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this, keeping compatibility function.
	All callers adjusted.
	* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
	(struct ctf_archive) <ctfa_nfiles>: Rename to...
	<ctfa_ndicts>: ... this.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (ctf_output): This is a ctf_dict_t now.
	(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
	(ldlang_open_ctf): Adjust comment.
	(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
	* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
	ctf_dict_t.  Change opaque declaration accordingly.
	* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
	* ldemul.h (examine_strtab_for_ctf): Likewise.
	(ldemul_examine_strtab_for_ctf): Likewise.
	* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
	adjusted.
	(ctf_fileops): Rename to...
	(ctf_dictops): ... this.
	(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
	<cd_id_to_dict_t>: ... this.
	(ctf_file_t): Fix outdated comment.
	<ctf_fileops>: Rename to...
	<ctf_dictops>: ... this.
	(struct ctf_archive_internal) <ctfi_file>: Rename to...
	<ctfi_dict>: ... this.
	* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
	Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
	Rename ctf_file_close to ctf_dict_close.  All users adjusted.
	* ctf-create.c: Likewise.  Refer to CTF dicts, not CTF containers.
	(ctf_bundle_t) <ctb_file>: Rename to...
	<ctb_dict): ... this.
	* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
	* ctf-dedup.c: Likewise.  Rename ctf_file_close to
	ctf_dict_close. Refer to CTF dicts, not CTF containers.
	* ctf-dump.c: Likewise.
	* ctf-error.c: Likewise.
	* ctf-hash.c: Likewise.
	* ctf-inlines.h: Likewise.
	* ctf-labels.c: Likewise.
	* ctf-link.c: Likewise.
	* ctf-lookup.c: Likewise.
	* ctf-open-bfd.c: Likewise.
	* ctf-string.c: Likewise.
	* ctf-subr.c: Likewise.
	* ctf-types.c: Likewise.
	* ctf-util.c: Likewise.
	* ctf-open.c: Likewise.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ...this.
	(ctf_file_close): New trivial wrapper around ctf_dict_close, for
	compatibility.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this.
	(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
	compatibility.
	* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 13:34:04 +00:00
Nick Alcock
926c9e7665 libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.

This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.

We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.

We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors.  (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)

ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.

The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them.  We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):

CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'

We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!

errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.

Debug-stream messages are not translated.  If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.

binutils/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf_archive_member): Move error-
	reporting...
	(dump_ctf_errs): ... into this separate function.
	(dump_ctf): Call it on open errors.
	* readelf.c (dump_ctf_archive_member): Move error-
	reporting...
	(dump_ctf_errs): ... into this separate function.  Support
	calls with NULL fp. Adjust for new err parameter to
	ctf_errwarning_next.
	(dump_section_as_ctf): Call it on open errors.

include/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_errwarning_next): New err parameter.

ld/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
	Adjust for new err parameter to ctf_errwarning_next.  Only
	check for assertion failures when fp is non-NULL.
	(ldlang_open_ctf): Call it on open errors.
	* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
	breaking the diags tests.

libctf/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-subr.c (open_errors): New list.
	(ctf_err_warn): Calls with NULL fp append to open_errors.  Add err
	parameter, and use it to decorate the debug stream with errmsgs.
	(ctf_err_warn_to_open): Splice errors from a CTF dict into the
	open_errors.
	(ctf_errwarning_next): Calls with NULL fp report from open_errors.
	New err param to report iteration errors (including end-of-iteration)
	when fp is NULL.
	(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
	parameter: gettextize.
	* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
	(LCTF_VBYTES): Adjust.
	(ctf_err_warn_to_open): New.
	(ctf_err_warn): Adjust.
	(ctf_bundle): Used in only one place: move...
	* ctf-create.c: ... here.
	(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
	down as needed.  Don't emit the errmsg.  Gettextize.
	(membcmp): Likewise.
	(ctf_add_type_internal): Likewise.
	(ctf_write_mem): Likewise.
	(ctf_compress_write): Likewise.  Report errors writing the header or
	body.
	(ctf_write): Likewise.
	* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
	ctf_dprintf, and gettextize, as above.
	(ctf_arc_write): Likewise.
	(ctf_arc_bufopen): Likewise.
	(ctf_arc_open_internal): Likewise.
	* ctf-labels.c (ctf_label_iter): Likewise.
	* ctf-open-bfd.c (ctf_bfdclose): Likewise.
	(ctf_bfdopen): Likewise.
	(ctf_bfdopen_ctfsect): Likewise.
	(ctf_fdopen): Likewise.
	* ctf-string.c (ctf_str_write_strtab): Likewise.
	* ctf-types.c (ctf_type_resolve): Likewise.
	* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
	(get_vbytes_v1): Pass down the ctf dict.
	(get_vbytes_v2): Likewise.
	(flip_ctf): Likewise.
	(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
	gettextize, as above.
	(upgrade_types_v1): Adjust calls.
	(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
	(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
	emitted into individual dicts into the open errors if this turns
	out to be a failed open in the end.
	* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
	argument.  Gettextize.  Don't emit the errmsg.
	(ctf_dump_funcs): Likewise.  Collapse err label into its only case.
	(ctf_dump_type): Likewise.
	* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
	argument.  Gettextize.  Don't emit the errmsg.
	(ctf_link_one_type): Likewise.
	(ctf_link_lazy_open): Likewise.
	(ctf_link_one_input_archive): Likewise.
	(ctf_link_deduplicating_count_inputs): Likewise.
	(ctf_link_deduplicating_open_inputs): Likewise.
	(ctf_link_deduplicating_close_inputs): Likewise.
	(ctf_link_deduplicating): Likewise.
	(ctf_link): Likewise.
	(ctf_link_deduplicating_per_cu): Likewise. Add some missed
	ctf_set_errnos to obscure error cases.
	* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
	err argument.  Gettextize.  Don't emit the errmsg.
	(ctf_dedup_populate_mappings): Likewise.
	(ctf_dedup_detect_name_ambiguity): Likewise.
	(ctf_dedup_init): Likewise.
	(ctf_dedup_multiple_input_dicts): Likewise.
	(ctf_dedup_conflictify_unshared): Likewise.
	(ctf_dedup): Likewise.
	(ctf_dedup_rwalk_one_output_mapping): Likewise.
	(ctf_dedup_id_to_target): Likewise.
	(ctf_dedup_emit_type): Likewise.
	(ctf_dedup_emit_struct_members): Likewise.
	(ctf_dedup_populate_type_mapping): Likewise.
	(ctf_dedup_populate_type_mappings): Likewise.
	(ctf_dedup_emit): Likewise.
	(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
	status setting.
	(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
	unknown-type-kind messages (which signify file corruption).
2020-08-27 13:15:43 +01:00
Eli Zaretskii
555adca2e3 libctf: compilation failure on MinGW due to missing errno values
This commit fixes a compilation failure in a couple of libctf files
due to the use of EOVERFLOW and ENOTSUP, which are not defined
when compiling on MinGW.

libctf/ChangeLog:

	PR binutils/25155:
	* ctf-create.c (EOVERFLOW): If not defined by system header,
	redirect to ERANGE as a poor man's substitute.
	* ctf-subr.c (ENOTSUP): If not defined, use ENOSYS instead.

(cherry picked from commit 50500ecfef)
2020-07-26 16:11:36 -07:00
Nick Alcock
8c419a91d7 libctf: fixes for systems on which sizeof (void *) > sizeof (long)
Systems like mingw64 have pointers that can only be represented by 'long
long'.  Consistently cast integers stored in pointers through uintptr_t
to cater for this.

libctf/
	* ctf-create.c (ctf_dtd_insert): Add uintptr_t casts.
	(ctf_dtd_delete): Likewise.
	(ctf_dtd_lookup): Likewise.
	(ctf_rollback): Likewise.
	* ctf-hash.c (ctf_hash_lookup_type): Likewise.
	* ctf-types.c (ctf_lookup_by_rawhash): Likewise.
2020-07-22 18:05:32 +01:00
Nick Alcock
0f0c11f7fc libctf, dedup: add deduplicator
This adds the core deduplicator that the ctf_link machinery calls
(possibly repeatedly) to link the CTF sections: it takes an array
of input ctf_file_t's and another array that indicates which entries in
the input array are parents of which other entries, and returns an array
of outputs.  The first output is always the ctf_file_t on which
ctf_link/ctf_dedup/etc was called: the other outputs are child dicts
that have the first output as their parent.

include/
	* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): No longer unimplemented.
libctf/
	* ctf-impl.h (ctf_type_id_key): New, the key in the
	cd_id_to_file_t.
	(ctf_dedup): New, core deduplicator state.
	(ctf_file_t) <ctf_dedup>: New.
	<ctf_dedup_atoms>: New.
	<ctf_dedup_atoms_alloc>: New.
	(ctf_hash_type_id_key): New prototype.
	(ctf_hash_eq_type_id_key): Likewise.
	(ctf_dedup_atoms_init): Likewise.
	* ctf-hash.c (ctf_hash_eq_type_id_key): New.
	(ctf_dedup_atoms_init): Likewise.
	* ctf-create.c (ctf_serialize): Adjusted.
	(ctf_add_encoded): No longer static.
	(ctf_add_reftype): Likewise.
	* ctf-open.c (ctf_file_close): Destroy the
	ctf_dedup_atoms_alloc.
	* ctf-dedup.c: New file.
        * ctf-decls.h [!HAVE_DECL_STPCPY]: Add prototype.
	* configure.ac: Check for stpcpy.
	* Makefile.am: Add it.
	* Makefile.in: Regenerate.
        * config.h.in: Regenerate.
        * configure: Regenerate.
2020-07-22 18:02:19 +01:00
Nick Alcock
6dd2819ffc libctf, link: add the ability to filter out variables from the link
The CTF variables section (containing variables that have no
corresponding symtab entries) can cause the string table to get very
voluminous if the names of variables are long.  Some callers want to
filter out particular variables they know they won't need.

So add a "variable filter" callback that does that: it's passed the name
of the variable and a corresponding ctf_file_t / ctf_id_t pair, and
should return 1 to filter it out.

ld doesn't use this machinery yet, but we could easily add it later if
desired.  (But see later for a commit that turns off CTF variable-
section linking in ld entirely by default.)

include/
	* ctf-api.h (ctf_link_variable_filter_t): New.
	(ctf_link_set_variable_filter): Likewise.

libctf/
	* libctf.ver (ctf_link_set_variable_filter): Add.
	* ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New.
	<ctf_link_variable_filter_arg>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust.
	* ctf-link.c (ctf_link_set_variable_filter): New, set it.
	(ctf_link_one_variable): Call it if set.
2020-07-22 18:02:18 +01:00
Nick Alcock
5f54462c6a libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.

The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).

The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.

So rejig the ctf-link machinery to do that.  Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link.  This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it.  (Nobody else should use it.)

Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.

include/
	* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.

libctf/
	* ctf-impl.h (ctf_file_t): Improve comments.
	<ctf_link_cu_mapping>: Split into...
	<ctf_link_in_cu_mapping>: ... this...
	<ctf_link_out_cu_mapping>: ... and this.
	* ctf-create.c (ctf_serialize): Adjust.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-link.c (ctf_create_per_cu): Look things up in the
	in_cu_mapping instead of the cu_mapping.
	(ctf_link_add_cu_mapping): The deduplicating link will define
	what happens if many FROMs share a TO.
	(ctf_link_add_cu_mapping): Create in_cu_mapping and
	out_cu_mapping. Do not create ctf_link_outputs here any more, or
	create per-CU dicts here: they are already created when needed.
	(ctf_link_one_variable): Log a debug message if we skip a
	variable due to its type being concealed in a CU-mapped link.
	(This is probably too common a case to make into a warning.)
	(ctf_link): Create empty per-CU dicts if requested.
2020-07-22 18:02:18 +01:00
Nick Alcock
8d2229ad1e libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:

First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them

Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory.  This is not an API change because a null
CTF argument was prohibited before now.

Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd.  The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.

This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.

Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.

We drop two members:

- arcname, which is difficult to construct and then only used in error
  messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
  share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
  of it

We rename others whose existing names were fairly dreadful:

- done_main_member -> done_parent, using consistent terminology for .ctf
  as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise

We add one new member, cu_mapped.

Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.

include/
	* ctf-api.h (ECTF_NEEDSBFD): New.
	(ECTF_NERR): Adjust.
	(ctf_link): Rename share_mode arg to flags.
libctf/
	* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
	* Makefile.in: Regenerated.
	* ctf-impl.h (ctf_link_input_name): New.
	(ctf_file_t) <ctf_link_flags>: New.
	* ctf-create.c (ctf_serialize): Adjust accordingly.
	* ctf-link.c: Define ctf_open as weak when PIC.
	(ctf_arc_close_thunk): Remove unnecessary thunk.
	(ctf_file_close_thunk): Likewise.
	(ctf_link_input_name): New.
	(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
	(ctf_link_input_close): Adjust accordingly.
	(ctf_link_add_ctf_internal): New, split from...
	(ctf_link_add_ctf): ... here.  Return error if lazy loading of
	CTF is not possible.  Change to just call...
	(ctf_link_add): ... this new function.
	(ctf_link_add_cu_mapping): Transition to ctf_err_warn.  Drop the
	ctf_file_close_thunk.
	(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
	<in_file_name>: ... this.
	<arcname>: Drop.
	<share_mode>: Likewise (migrated to ctf_link_flags).
	<done_main_member>: Rename to...
	<done_parent>: ... this.
	<main_input_fp>: Rename to...
	<in_fp_parent>: ... this.
	<cu_mapped>: New.
	(ctf_link_one_type): Adjuwt accordingly.  Transition to
	ctf_err_warn, removing a TODO.
	(ctf_link_one_variable): Note a case too common to warn about.
	Report in the debug stream if a cu-mapped link prevents addition
	of a conflicting variable.
	(ctf_link_one_input_archive_member): Adjust.
	(ctf_link_lazy_open): New, open a CTF archive for linking when
	needed.
	(ctf_link_close_one_input_archive): New, close it again.
	(ctf_link_one_input_archive): Adjust for lazy opening, member
	renames, and ctf_err_warn transition.  Move the
	empty_link_type_mapping call to...
	(ctf_link): ... here.  Adjut for renamings and thunk removal.
	Don't spuriously fail if some input contains no CTF data.
	(ctf_link_write): ctf_err_warn transition.
	* libctf.ver: Remove not-yet-stable comment.
2020-07-22 18:02:18 +01:00
Nick Alcock
1fa7a0c24e libctf: sort out potential refcount loops
When you link TUs that contain conflicting types together, the resulting
CTF section is an archive containing many CTF dicts.  These dicts appear
in ctf_link_outputs of the shared dict, with each ctf_import'ing that
shared dict.  ctf_importing a dict bumps its refcount to stop it going
away while it's in use -- but if the shared dict (whose refcount is
bumped) has the child dict (doing the bumping) in its ctf_link_outputs,
we have a refcount loop, since the child dict only un-ctf_imports and
drops the parent's refcount when it is freed, but the child is only
freed when the parent's refcount falls to zero.

(In the future, this will be able to go wrong on the inputs too, when an
ld -r'ed deduplicated output with conflicts is relinked.  Right now this
cannot happen because we don't ctf_import such dicts at all.  This will
be fixed in a later commit in this series.)

Fix this by introducing an internal-use-only ctf_import_unref function
that imports a parent dict *witthout* bumping the parent's refcount, and
using it when we create per-CU outputs.  This function is only safe to
use if you know the parent cannot go away while the child exists: but if
the parent *owns* the child, as here, this is necessarily true.

Record in the ctf_file_t whether a parent was imported via ctf_import or
ctf_import_unref, so that if you do another ctf_import later on (or a
ctf_import_unref) it can decide whether to drop the refcount of the
existing parent being replaced depending on which function you used to
import that one.  Adjust ctf_serialize so that rather than doing a
ctf_import (which is wrong if the original import was
ctf_import_unref'fed), we just copy the parent field and refcount over
and forcibly flip the unref flag on on the old copy we are going to
discard.

ctf_file_close also needs a bit of tweaking to only close the parent if
it was not imported with ctf_import_unref: while we're at it, guard
against repeated closes with a refcount of zero and stop them causing
double-frees, even if destruction of things freed *inside*
ctf_file_close cause such recursion.

Verified no leaks or accesses to freed memory after all of this with
valgrind.  (It was leak-happy before.)

libctf/
	* ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New.
	(ctf_import_unref): New.
	* ctf-open.c (ctf_file_close) Drop the refcount all the way to
	zero.  Don't recurse back in if the refcount is already zero.
	(ctf_import): Check ctf_parent_unreffed before deciding whether
	to close a pre-existing parent.  Set it to zero.
	(ctf_import_unreffed): New, as above, setting
	ctf_parent_unreffed to 1.
	* ctf-create.c (ctf_serialize): Do not ctf_import into the new
	child: use direct assignment, and set unreffed on the new and
	old children.
	* ctf-link.c (ctf_create_per_cu): Import the parent using
	ctf_import_unreffed.
2020-07-22 18:02:18 +01:00
Nick Alcock
8b37e7b63e libctf, ld, binutils: add textual error/warning reporting for libctf
This commit adds a long-missing piece of infrastructure to libctf: the
ability to report errors and warnings using all the power of printf,
rather than being restricted to one errno value.  Internally, libctf
calls ctf_err_warn() to add errors and warnings to a list: a new
iterator ctf_errwarning_next() then consumes this list one by one and
hands it to the caller, which can free it.  New errors and warnings are
added until the list is consumed by the caller or the ctf_file_t is
closed, so you can dump them at intervals.  The caller can of course
choose to print only those warnings it wants.  (I am not sure whether we
want objdump, readelf or ld to print warnings or not: right now I'm
printing them, but maybe we only want to print errors?  This entirely
depends on whether warnings are voluminous things describing e.g. the
inability to emit single types because of name clashes or something.
There are no users of this infrastructure yet, so it's hard to say.)

There is no internationalization here yet, but this at least adds a
place where internationalization can be added, to one of
ctf_errwarning_next or ctf_err_warn.

We also provide a new ctf_assert() function which uses this
infrastructure to provide non-fatal assertion failures while emitting an
assert-like string to the caller: to save space and avoid needlessly
duplicating unchanging strings, the assertion test is inlined but the
print-things-out failure case is not.  All assertions in libctf will be
converted to use this machinery in future commits and propagate
assertion-failure errors up, so that the linker in particular cannot be
killed by libctf assertion failures when it could perfectly well just
print warnings and drop the CTF section.

include/
	* ctf-api.h (ECTF_INTERNAL): Adjust error text.
	(ctf_errwarning_next): New.
libctf/
	* ctf-impl.h (ctf_assert): New.
	(ctf_err_warning_t): Likewise.
	(ctf_file_t) <ctf_errs_warnings>: Likewise.
	(ctf_err_warn): New prototype.
	(ctf_assert_fail_internal): Likewise.
	* ctf-inlines.h (ctf_assert_internal): Likewise.
	* ctf-open.c (ctf_file_close): Free ctf_errs_warnings.
	* ctf-create.c (ctf_serialize): Copy it on serialization.
	* ctf-subr.c (ctf_err_warn): New, add an error/warning.
	(ctf_errwarning_next): New iterator, free and pass back
	errors/warnings in succession.
	* libctf.ver (ctf_errwarning_next): Add.
ld/
	* ldlang.c (lang_ctf_errs_warnings): New, print CTF errors
	and warnings.  Assert when libctf asserts.
	(lang_merge_ctf): Call it.
	(land_write_ctf): Likewise.
binutils/
	* objdump.c (ctf_archive_member): Print CTF errors and warnings.
	* readelf.c (dump_ctf_archive_member): Likewise.
2020-07-22 18:02:17 +01:00
Nick Alcock
9850ce4d7b libctf: add ctf_forwardable_kind
The internals of the deduplicator want to know if something is a type
that can have a forward to it fairly often, often enough that inlining
it brings a noticeable performance gain.  Convert the one place in
libctf that can already benefit, even though it doesn't bring any sort
of performance gain there.

libctf/
	* ctf-inlines.h (ctf_forwardable_kind): New.
	* ctf-create.c (ctf_add_forward): Use it.
2020-07-22 17:57:48 +01:00
Nick Alcock
502e838ed9 libctf, types: support slices of anything terminating in an int
It is perfectly valid C to say e.g.

typedef u64 int;
struct foo_t
  {
    const volatile u64 wibble:2;
  };

i.e. bitfields have to be integral types, but they can be cv-qualified
integral types or typedefs of same, etc.

This is easy to fix: do a ctf_type_resolve_unsliced() at creation time
to ensure the ultimate type is integral, and ctf_type_resolve() at
lookup time so that if you somehow have e.g. a slice of a typedef of a
slice of a cv-qualified int, we pull the encoding that the topmost slice
is based on out of the subsidiary slice (and then modify it), not out of
the underlying int.  (This last bit is rather academic right now, since
all slices override exactly the same properties of the underlying type,
but it's still the right thing to do.)

libctf/
	* ctf-create.c (ctf_add_slice): Support slices of any kind that
	resolves to an integral type.
	* ctf-types.c (ctf_type_encoding): Resolve the type before
	fishing its encoding out.
2020-07-22 17:57:31 +01:00
Nick Alcock
dd987f0043 libctf, create: empty dicts are dirty to start with
Without this, an empty dict that is written out immediately never gets
any content at all: even the header is left empty.

libctf/
	* ctf-create.c (ctf_create): Mark dirty.
2020-07-22 17:57:29 +01:00
Nick Alcock
f47ca31135 libctf, create: fix addition of anonymous struct/union members
A Solaris-era bug causes us to check the offsets of types with no names
against the first such type when ctf_add_type()ing members to a struct
or union.  Members with no names (i.e. anonymous struct/union members)
can appear as many times as you like in a struct/union, so this check
should be skipped in this case.

libctf/
	* ctf-create.c (membcmp)  Skip nameless members.
2020-07-22 17:57:28 +01:00
Nick Alcock
ab769488e7 libctf, create: member names of "" and NULL should be the same
This matters for the case of unnamed bitfields, whose names are the null
string.  These are special in that they are the only members whose
"names" are allowed to be duplicated in a single struct, but we were
only handling this for the case where name == NULL.  Translate "" to
NULL to help callers.

libctf/
	* ctf-create.c (ctf_add_member_offset): Support names of ""
	as if they were the null pointer.
2020-07-22 17:57:27 +01:00
Nick Alcock
9943fa3a73 libctf, create: add explicit casts for variables' and slices' types
This is technically unnecessary -- the compiler is quite capable of
doing the range reduction for us -- but it does mean that all
assignments of a ctf_id_t to its final uint32_t representation now have
appropriate explicit casts.

libctf/
	* ctf-create.c (ctf_serialize): Add cast.
	(ctf_add_slice): Likewise.
2020-07-22 17:57:23 +01:00
Nick Alcock
afd78bd6f0 libctf, create: do not corrupt function types' arglists at insertion time
ctf_add_function assumes that function types' arglists are of type
ctf_id_t.  Since they are CTF IDs, they are 32 bits wide, a uint32_t:
unfortunately ctf_id_t is a forward-compatible user-facing 64 bits wide,
and should never ever reach the CTF storage level.

All the CTF code other than ctf_add_function correctly assumes that
function arglists outside dynamic containers are 32 bits wide, so the
serialization machinery ends up cutting off half the arglist, corrupting
all args but the first (a good sign is a bunch of args of ID 0, the
unimplemented type, popping up).

Fix this by copying the arglist into place item by item, casting it
properly, at the same time as we validate the arg types.  Fix the type
of the dtu_argv in the dynamic container and drop the now-unnecessary
cast in the serializer.

libctf/
	* ctf-impl.h (ctf_dtdef_t) <dtu_argv>: Fix type.
	* ctf-create.c (ctf_add_function): Check for unimplemented type
	and populate at the same time.  Populate one-by-one, not via
	memcpy.
	(ctf_serialize): Remove unnecessary cast.
	* ctf-types.c (ctf_func_type_info): Likewise.
	(ctf_func_type_args): Likewise.  Fix comment typo.
2020-07-22 17:57:22 +01:00
Nick Alcock
2361f1c859 libctf, create: support addition of references to the unimplemented type
The deduplicating linker adds types from the linker inputs to the output
via the same API everyone else does, so it's important that we can emit
everything that the compiler wants us to.  Unfortunately, the compiler
may represent the unimplemented type (used for compiler constructs that
CTF cannot currently encode) as type zero or as a type of kind
CTF_K_UNKNOWN, and we don't allow the addition of types that cite the
former.

Adding this support adds a tiny bit of extra complexity: additions of
structure members immediately following a member of the unimplemented
type must be via ctf_add_member_offset or ctf_add_member_encoded, since
we have no idea how big members of the unimplemented type are.
(Attempts to do otherwise return -ECTF_NONREPRESENTABLE, like other
attempts to do forbidden things with the unimplemented type.)

Even slices of the unimplemented type are permitted: this is the only
case in which you can slice a type that terminates in a non-integral
type, on the grounds that it was likely integral in the source code,
it's just that we can't represent that sort of integral type properly
yet.

libctf/
	* ctf-create.c (ctf_add_reftype): Support refs to type zero.
	(ctf_add_array): Support array contents of type zero.
	(ctf_add_function): Support arguments and return types of
	type zero.
	(ctf_add_typedef): Support typedefs to type zero.
	(ctf_add_member_offset): Support members of type zero,
	unless added at unspecified (naturally-aligned) offset.
2020-07-22 17:57:21 +01:00
Nick Alcock
c1401ecc29 libctf: add some missing #includes.
Causes warnings on (at least) recent FreeBSD.

libctf/
	* ctf-create.c: Include <unistd.h>.
	* ctf-open-bfd.c: Likewise.
2020-06-26 15:56:39 +01:00
Nick Alcock
8ffcdf1823 libctf: create: forwards are always in the namespace of their referent
The C namespace a forward is located in is always the same as the
namespace of the corresponding complete type: 'struct foo' is in the
struct namespace and does not collide with, say, 'union foo'.

libctf allowed for this in many places, but inconsistently: in
particular, forward *addition* never allowed for this, and was interning
forwards in the default namespace, which is always wrong, since you can
only forward structs, unions and enums, all of which are in their own
namespaces in C.

Forward removal needs corresponding adjustment to remove the names form
the right namespace, as does ctf_rollback.

libctf/
	* ctf-create.c (ctf_add_forward): Intern in the right namespace.
	(ctf_dtd_delete): Remove correspondingly.
	(ctf_rollback): Likewise.
2020-06-26 15:56:39 +01:00
Nick Alcock
d04a47ac53 libctf: create: ctf_add_type should hand back already-added non-SoUs
When we add a type from a dictionary and then try to add it again, we
should hand it back unchanged unless it is a structure, union or enum
with a different number of members.  That's what the comment says we do.

Instead, we hand it back unchanged *only* if it is a structure, union or
enum with the same number of members: non-structs, unions and enums are
unconditionally added.  This causes extreme type bloating and (in
conjunction with the bug fixed by the next commit) can easily lead to
the same type being mistakenly added to a dictionary more than once
(which, for forwards, was not banned and led to dictionary corruption).

libctf/
	* ctf-create.c (ctf_add_type_internal): Hand back existing types
	unchanged.
2020-06-26 15:56:39 +01:00
Nick Alcock
6bbf9da892 libctf: create: don't add forwards if the type added already exists
This is what ctf_add_forward is documented to do, but it's not what it
actually does: the code is quite happy to add forwards that duplicate
existing structs, etc.

This is obviously wrong and breaks both the nondeduplicating linker
and the upcoming deduplicator, as well as allowing ordinary callers of
ctf_add_type to corrupt the dictionary by just adding the same root-
visible forward more than once.

libctf/
	* ctf-create.c (ctf_add_forward): Don't add forwards to
	types that already exist.
2020-06-26 15:56:39 +01:00
Nick Alcock
fe4c2d5563 libctf: create: non-root-visible types should not appear in name tables
We were accidentally interning newly-added and newly-opened
non-root-visible types into name tables, and removing names from name
tables when such types were removed.  This is very wrong: the whole
point of non-root-visible types is they do not go in name tables and
cannot be looked up by name.  This bug made non-root-visible types
basically identical to root-visible types, right back to the earliest
days of libctf in the Solaris era.

libctf/
	* ctf-open.c (init_types): Only intern root-visible types.
	* ctf-create.c (ctf_dtd_insert): Likewise.
	(ctf_dtd_delete): Only remove root-visible types.
	(ctf_rollback): Likewise.
	(ctf_add_generic): Adjust.
	(ctf_add_struct_sized): Adjust comment.
	(ctf_add_union_sized): Likewise.
	(ctf_add_enum): Likewise.
	* ctf-impl.h (ctf_dtd_insert): Adjust prototype.
2020-06-26 15:56:39 +01:00
Alan Modra
b3adc24a07 Update year range in copyright notice of binutils files 2020-01-01 18:42:54 +10:30