This makes the initializer for __table in __from_chars_alnum_to_val
dependent in an artificial way, which works around the reported modules
testsuite ICE by preventing the compiler from evaluating the initializer
at parse time.
Compared to the alternative workaround of using a non-local class type
for __table, this workaround has the advantage of slightly speeding up
compilation of <charconv>, since now the table won't get built (via
constexpr evaluation) until the integer std::from_chars overload is
instantiated.
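A minimal sketch of the idea, not the actual libstdc++ code: the names square_table, make_square_table and square_of are hypothetical, and the table contents are arbitrary. Mentioning the template parameter in the initializer makes the expression value-dependent, so a compiler is expected to defer the constexpr evaluation until the template is instantiated.

```cpp
#include <cassert>

// Hypothetical table type and builder; the names are illustrative only.
struct square_table { unsigned char data[16]; };

constexpr square_table make_square_table()
{
  square_table t{};
  for (int i = 0; i < 16; ++i)
    t.data[i] = static_cast<unsigned char>(i * i);
  return t;
}

template<bool Unused = false>
constexpr unsigned char square_of(unsigned char i)
{
  // The comma expression mentions the template parameter, which makes the
  // whole initializer value-dependent even though its value never depends
  // on Unused.  The cast to void only avoids an unused-value warning.
  constexpr auto table = (static_cast<void>(Unused), make_square_table());
  return table.data[i & 15];
}
```

Calling square_of(3) instantiates the template with the default argument, and only then does the table need to be built.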
PR c++/105297
PR c++/105322
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_alnum_to_val): Make
initializer for __table dependent in an artificial way.
I'm not sure what I was thinking when I added this assertion, maybe it
was supposed to be alignment == 1 (which is what the pmr::string actually
uses). The simplest fix is to just remove the assertion.
The assertion is no longer enabled by default on trunk, but it's still
there for the --enable-libstdcxx-debug build, and is still wrong. The
fix is needed on the gcc-11 branch.
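The alignment that a pmr::string passes to its memory resource can be observed with a small recording resource. This is only a sketch and assumes C++17; recording_resource is a hypothetical name and is not the buffer_resource from floating_from_chars.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <string>

// Hypothetical resource that remembers the alignment of the last request.
struct recording_resource : std::pmr::memory_resource
{
  std::size_t last_alignment = 0;

  void* do_allocate(std::size_t bytes, std::size_t alignment) override
  {
    last_alignment = alignment;
    return ::operator new(bytes);
  }

  void do_deallocate(void* p, std::size_t, std::size_t) override
  { ::operator delete(p); }

  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override
  { return this == &other; }
};

// Builds a string long enough to need a heap allocation (assumed to exceed
// any small-string buffer) and reports the alignment that was requested.
inline std::size_t alignment_used_by_pmr_string()
{
  recording_resource res;
  {
    std::pmr::string s(64, 'x', &res);
  }
  return res.last_alignment;
}
```

The polymorphic allocator for char requests alignof(char), so an assertion that the alignment is never 1 cannot hold for string allocations.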
libstdc++-v3/ChangeLog:
PR libstdc++/105324
* src/c++17/floating_from_chars.cc (buffer_resource::do_allocate):
Remove assertion.
* testsuite/20_util/from_chars/pr105324.cc: New test.
When we compute LABEL_NUSES from scratch, mark_all_labels doesn't call
mark_jump_label on DEBUG_INSNs:
  if (NONDEBUG_INSN_P (insn))
    mark_jump_label (PATTERN (insn), insn, 0);
and so doesn't increment LABEL_NUSES from references in DEBUG_INSNs.
But, when we call emit_copy_of_insn_after e.g. when duplicating some
DEBUG_INSNs, we call it even on those, which then results in LABEL_NUSES
differences and -fcompare-debug failures.
The following patch makes sure we don't call it on DEBUG_INSNs.
2022-04-21 Jakub Jelinek <jakub@redhat.com>
PR debug/105203
* emit-rtl.cc (emit_copy_of_insn_after): Don't call mark_jump_label
on DEBUG_INSNs.
* gfortran.dg/g77/pr105203.f: New test.
If two arrays do not have the exact same element type including
qualification, this could be e.g. f(int (&&)[]) vs. f(int const (&)[]),
which can still be distinguished by the lvalue-rvalue tiebreaker.
By tightening this branch (in accordance with the letter of the Standard) we
fall through to the next branch, which tests whether they have different
element type ignoring qualification and returns 0 in that case; thus we only
actually fall through in the T[...] vs. T cv[...] case, eventually
considering the lvalue-rvalue tiebreaker at the end of compare_ics.
Signed-off-by: Ed Catmur <ed@catmur.uk>
PR c++/104996
gcc/cp/ChangeLog:
* call.cc (compare_ics): When comparing list-initialization
sequences, do not return early.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist129.C: New test.
The macro being tested here is wrong, but just happens to have the same
value as the one that is supposed to be tested.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string_view/operations/copy/char/constexpr.cc:
Check correct feature test macro.
This fixes missing libiconv symbols when libstdc++ is built on a system
that has libiconv installed. If the libiconv headers are found then
libstdc++ depends on libiconv_open etc instead of libc's iconv_open. But
without this fix libstdc++ is not linked to the libiconv library that
provides the definitions of those symbols.
As discussed in PR 93602 this change means that libstdc++.so.6 might
have an rpath pointing to the location of the libiconv.so library. If
that is not desired, then GCC must be configured to link to a static
libiconv.a instead, using either --with-libiconv-type=static or an
in-tree build of libiconv.
libstdc++-v3/ChangeLog:
PR libstdc++/93602
* doc/xml/manual/prerequisites.xml: Document libiconv
workarounds.
* doc/html/manual/setup.html: Regenerate.
* src/Makefile.am (CXXLINK): Add $(LTLIBICONV).
* src/Makefile.in: Regenerate.
The following makes sure that when we build the versioning condition
for vectorization including the cost model check, we check for the
cost model and branch over other versioning checks. That is what
the cost modeling assumes, since the cost model check is the only
one accounted for in the scalar outside cost. Currently we emit
all checks as straight-line code combined with bitwise ops which
can result in surprising ordering of checks in the final assembly.
Since loop_version accepts only a single versioning condition
the splitting is done after the fact.
The result is a 1.5% speedup of 416.gamess on x86_64 when compiling
with -Ofast and tuning for generic or skylake. That's not enough
to recover from the slowdown when vectorizing but it now cuts off
the expensive alias versioning test.
2022-03-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/104912
* tree-vect-loop-manip.cc (vect_loop_versioning): Split
the cost model check to a separate BB to make sure it is
checked first and not combined with other version checks.
The following aligns ISEL VEC_COND_EXPR expansion using VCOND
with the optab query done by vector lowering. Instead of only
allowing the signed optab to provide EQ/NE compares we allow both
here though since there seems to be no documented canonicalization.
2022-04-20 Richard Biener <rguenther@suse.de>
PR tree-optimization/105312
* gimple-isel.cc (gimple_expand_vec_cond_expr): Query both
VCOND and VCONDU for EQ and NE.
* gcc.target/arm/pr105312.c: New testcase.
cgraph_node has a semantic_interposition flag which should mirror
opt_for_fn (decl, flag_semantic_interposition). But it actually is
initialized not from that, but from flag_semantic_interposition in the
  explicit symtab_node (symtab_type t)
    : type (t), resolution (LDPR_UNKNOWN), definition (false), alias (false),
  ...
      semantic_interposition (flag_semantic_interposition),
  ...
      x_comdat_group (NULL_TREE), x_section (NULL)
  {}
ctor. I think that might be fine for varpool nodes, but since
flag_semantic_interposition is now implied from -Ofast it isn't correct
for cgraph nodes, unless we guarantee that cgraph node for a particular
function decl is always created while that function is
current_function_decl. That is often the case, but not always as the
following function shows.
Because symtab_node's ctor doesn't know for which decl the cgraph node
is being created, the following patch keeps that as is, but updates it from
opt_for_fn (decl, flag_semantic_interposition) when we know that, or for
clones copies that flag (often it is then overridden in
set_new_clone_decl_and_node_flags, but not always).
2022-04-20 Jakub Jelinek <jakub@redhat.com>
PR ipa/105306
* cgraph.cc (cgraph_node::create): Set node->semantic_interposition
to opt_for_fn (decl, flag_semantic_interposition).
* cgraphclones.cc (cgraph_node::create_clone): Copy over
semantic_interposition flag.
* g++.dg/opt/pr105306.C: New test.
TOPN metrics are histograms that contain an overall count and per-bucket
counts. The overall count can be negative when two profiles are merged and
some of the per-bucket metrics are discarded.
Noticed as an ICE on python PGO build where gcc crashes as:
during IPA pass: modref
a.c:36:1: ICE: in stream_out_histogram_value, at value-prof.cc:340
36 | }
| ^
stream_out_histogram_value(output_block*, histogram_value_t*)
gcc/value-prof.cc:340
gcc/ChangeLog:
PR gcov-profile/105282
* value-prof.cc (stream_out_histogram_value): Allow negative counts
on HIST_TYPE_INDIR_CALL.
The following testcase ICEs, because the pic register is
(reg:DI 24 %i0 [109]) and is used in the delay slot of a return.
We invoke epilogue_renumber and that changes it to
(reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p
predicate, so we don't recognize the insn anymore.
The following patch fixes that by preserving ORIGINAL_REGNO if
specified, so we get (reg:DI 8 %o0 [109]) instead.
2022-04-19 Jakub Jelinek <jakub@redhat.com>
PR target/105257
* config/sparc/sparc.cc (epilogue_renumber): If ORIGINAL_REGNO,
use gen_raw_REG instead of gen_rtx_REG and copy over also
ORIGINAL_REGNO. Use return 0; instead of /* fallthrough */.
* gcc.dg/pr105257.c: New test.
The CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit is supposed to separate
PLACEHOLDER_EXPRs that should be replaced by one object or subobjects of it
(variable, TARGET_EXPR slot, ...) from other PLACEHOLDER_EXPRs that should
be replaced by different objects or subobjects.
The bit is set when finding PLACEHOLDER_EXPRs inside of a CONSTRUCTOR, not
looking into nested CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctors, and we prevent
elision of TARGET_EXPRs (through TARGET_EXPR_NO_ELIDE) whose initializer
is a CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctor. The following testcase ICEs
though: we don't replace the placeholders in there at all, because
CONSTRUCTOR_PLACEHOLDER_BOUNDARY isn't set on the TARGET_EXPR_INITIAL
ctor, but on a ctor nested in such a ctor. replace_placeholders should be
run on the whole TARGET_EXPR slot.
So, the following patch fixes it by moving the CONSTRUCTOR_PLACEHOLDER_BOUNDARY
bit from nested CONSTRUCTORs to the CONSTRUCTOR containing those (but only
if it is closely nested; if there is some other tree sandwiched in between,
it doesn't do it).
2022-04-19 Jakub Jelinek <jakub@redhat.com>
PR c++/105256
* typeck2.cc (process_init_constructor_array,
process_init_constructor_record, process_init_constructor_union): Move
CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the
containing CONSTRUCTOR.
* g++.dg/cpp0x/pr105256.C: New test.
When doing BB vectorization the scalar cost compute is derailed
by patterns, causing lanes to be considered live and thus not
costed on the scalar side. For the testcase in PR104010 this
prevents vectorization which was done by GCC 11. PR103941
shows similar cases of missed optimizations that are fixed by
this patch.
2022-04-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/104010
PR tree-optimization/103941
* tree-vect-slp.cc (vect_bb_slp_scalar_cost): When
we run into stmts in patterns continue walking those
for uses outside of the vectorized region instead of
marking the lane live.
* gcc.target/i386/pr103941-1.c: New testcase.
* gcc.target/i386/pr103941-2.c: Likewise.
Assertions were originally enabled in the compiled-in floating-point
std::to_chars implementation to help shake out any bugs, but they
apparently impose a significant performance penalty, most notably for
the hex formatting which is around 25% slower with assertions enabled.
This seems too high a cost for unconditionally enabling them.
The newly added calls to __builtin_unreachable work around the compiler
no longer knowing that the set of valid values of 'fmt' is limited (which
was previously upheld by an assert).
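A sketch of the pattern, under the assumption of a GCC-compatible compiler (__builtin_unreachable is a GCC/Clang builtin). The enum number_format and the function printf_conversion are hypothetical stand-ins and are not taken from floating_to_chars.cc.

```cpp
#include <cassert>

// Hypothetical stand-in for the set of valid 'fmt' values.
enum class number_format { scientific, fixed, general };

inline char printf_conversion(number_format fmt)
{
  char spec;  // only assigned inside the switch
  switch (fmt)
    {
    case number_format::scientific: spec = 'e'; break;
    case number_format::fixed:      spec = 'f'; break;
    case number_format::general:    spec = 'g'; break;
    default:
      // Without an assertion the compiler no longer knows that fmt is
      // one of the three enumerators; this states it explicitly, so the
      // "spec may be used uninitialized" path disappears.
      __builtin_unreachable();
    }
  return spec;
}
```

Reaching the builtin at run time is undefined behavior, so this is only appropriate where callers already guarantee a valid value.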
libstdc++-v3/ChangeLog:
* src/c++17/floating_to_chars.cc (_GLIBCXX_ASSERTIONS): Don't
define.
(__floating_to_chars_shortest): Add __builtin_unreachable calls to
squelch false-positive -Wmaybe-uninitialized and -Wreturn-type
warnings.
(__floating_to_chars_precision): Likewise.
This improves the debug output for C++20 spans.
Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffffffb9a8, _M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}
Signed-off-by: Philipp Fent <fent@in.tum.de>
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (StdSpanPrinter): Define.
* testsuite/libstdc++-prettyprinters/cxx20.cc: Test it.
This renames the testcase to something picked up by the suite's regexp.
2022-04-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/104880
* g++.dg/opt/pr104880.cc: Rename to ...
* g++.dg/opt/pr104880.C: ... this.
Using == instead of = causes a configuration error with dash as the
shell:
checking whether to build libbacktrace support... /home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77471: test: auto: unexpected operator
/home/devel/building/work/src/gcc-12-20220417/libstdc++-v3/configure: 77474: test: auto: unexpected operator
auto
This means we fail to change the value from "auto" to "no" and so this
test passes:
GLIBCXX_CONDITIONAL(ENABLE_BACKTRACE, [test "$enable_libstdcxx_backtrace" != no])
This leads to the libbacktrace directory being included in the build
without being configured properly, and bootstrap fails.
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Fix shell operators.
* configure: Regenerate.
That is, support for cris-linux-gnu was removed in gcc-11, but
install.texi wasn't adjusted accordingly. Also, unfortunately the
developer-related sites are gone with no replacements. CRIS is also
used in other chip series, but the text alludes to them rather than
listing them.
The generated manpages, info, pdf and html were sanity-checked.
gcc:
* doc/install.texi <CRIS>: Remove references to removed websites and
adjust for cris-*-elf being the only remaining toolchain.
...and related options. These stale bits were overlooked when support
for "Linux/GNU" and CRIS v32 was removed, before the gcc-11 release.
Resulting pdf, html and info inspected for sanity.
gcc:
* doc/invoke.texi <CRIS>: Remove references to options for removed
subtarget cris-axis-linux-gnu and tweak wording accordingly.
This fixes a build issue on musl libc where the same signal number
is used for SIGIO and SIGPOLL. This causes a compilation error since
the signal numbers must be unique for the signal table.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400595
The musl libc uses signal 34 internally for setgid (similar to how glibc
uses signal 32 and signal 33). For this reason, special handling is
needed for this signal in the runtime. The gc implementation already
handles the signal accordingly. As such, this commit intends to
simply copy the behavior of the Google Go implementation to libgo.
See https://go.dev/issues/39343
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400594
In the first iteration of __from_chars_pow2_base's main loop, we need
to remember the value of the leading significant digit for sake of the
overflow check at the end (for base > 2).
This patch manually unrolls this first iteration so as to not encumber
the entire loop with logic that only the first iteration needs. This
seems to significantly improve performance:
  Base  Before  After  (seconds, lower is better)
     2    9.36   9.37
     8    3.66   2.93
    16    2.93   1.91
    32    2.39   2.24
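The shape of the unrolled loop can be sketched as follows. This is a simplified illustration for a 32-bit unsigned result and bases 2, 4, 8 and 16 only; parse_pow2, digit_value and bit_width_of are hypothetical names, and the real __from_chars_pow2_base differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: value of a hex digit, or 255 if c is not one.
inline unsigned digit_value(char c)
{
  if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
  if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
  if (c >= 'A' && c <= 'F') return static_cast<unsigned>(c - 'A' + 10);
  return 255;
}

// Number of bits needed to represent v (0 for v == 0).
inline int bit_width_of(unsigned v)
{
  int n = 0;
  for (; v != 0; v >>= 1) ++n;
  return n;
}

// Parses digits of base (1 << log2_base), with log2_base in [1, 4].
// Advances first past the consumed digits; returns false on overflow.
inline bool parse_pow2(const char*& first, const char* last,
                       std::uint32_t& val, int log2_base)
{
  const unsigned base = 1u << log2_base;
  const std::ptrdiff_t len = last - first;
  std::ptrdiff_t i = 0;
  while (i < len && first[i] == '0') ++i;
  const std::ptrdiff_t leading_zeroes = i;
  val = 0;
  if (i >= len) { first += i; return true; }

  // Unrolled first iteration: only here do we record the leading digit.
  const unsigned leading = digit_value(first[i]);
  if (leading >= base) { first += i; return true; }
  val = leading;
  ++i;

  // Main loop, free of any "is this the first digit?" bookkeeping.
  for (; i < len; ++i)
    {
      const unsigned c = digit_value(first[i]);
      if (c >= base) break;
      val = (val << log2_base) | c;
    }
  first += i;

  // The leading digit may not use all log2_base bits; compensate for that.
  std::ptrdiff_t significant_bits = (i - leading_zeroes) * log2_base;
  significant_bits -= log2_base - bit_width_of(leading);
  return significant_bits <= 32;
}
```

For example, the octal string 37777777777 has eleven digits but only 32 significant bits, because the leading 3 needs two bits rather than three, so it is accepted while 47777777777 is reported as overflow.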
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_pow2_base): Manually
unroll the first iteration of the main loop and simplify
accordingly.
The test case pr105250.c, like the related pr105140.c, fails with
an error message like "{AltiVec,vector} argument passed to
unprototyped" on powerpc and s390. So, like commits r12-8025 and
r12-8039, this fix adds dg-skip-if for powerpc*-*-* and
s390*-*-*.
gcc/testsuite/ChangeLog:
PR testsuite/105266
* gcc.dg/pr105250.c: Skip for powerpc*-*-* and s390*-*-*.
Revert CL 245098. It caused incorrect initialization ordering.
Adjust the runtime package to work even with the CL reverted.
Original description of CL 245098:
This avoids requiring an init function to initialize the variable.
This can only be done if x is a static initializer.
The go1.15rc1 runtime package relies on this optimization.
The package has a variable "var maxSearchAddr = maxOffAddr".
The maxSearchAddr variable is used by code that runs before package
initialization is complete.
For golang/go#51913
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/395994
Here we issue a wrong error for the
template<typename T = S<C_many<int>>> void g();
line in the testcase. I surmise that's because we mistakenly parse
C_many<int> as a placeholder-type-specifier, and things go wrong from
there. We are in a default argument so we should reject parsing C_many<int>
as a placeholder-type-specifier, which would mean creating a new parameter.
We want C_many<int> to be a concept-id instead.
It's interesting to see why the same problem didn't occur for C_one<int>.
In that case, cp_parser_placeholder_type_specifier -> finish_type_constraints
-> build_type_constraint -> build_concept_check -> build_standard_check ->
coerce_template_parms fails the parse here:
  nargs = inner_args ? NUM_TMPL_ARGS (inner_args) : 0;
  if ((nargs - variadic_args_p > nparms && !variadic_p)
      || (nargs < nparms - variadic_p
          && require_all_args
          && !variadic_args_p
          && (!use_default_args
              || (TREE_VEC_ELT (parms, nargs) != error_mark_node
                  && !TREE_PURPOSE (TREE_VEC_ELT (parms, nargs))))))
    {
    bad_nargs:
  ...
      return error_mark_node;
because nargs is 2 (the targs are <WILDCARD_DECL, int>) while nparms is
1 (for the one 'typename' in the tparam list of C_one). But for
C_many<int> variadic_p is true so we don't return error_mark_node but
<type_argument_pack>.
This patch does not issue any error for the !tentative case because
I didn't figure out how to trigger that. So it adds an assert instead.
PR c++/105268
gcc/cp/ChangeLog:
* parser.cc (cp_parser_placeholder_type_specifier): Return
error_mark_node when trying to build up a constrained parameter in
a default argument.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/variadic6.C: New test.
This applies the following optimizations to the integer std::from_chars
implementation:
1. Use a lookup table for converting an alphanumeric digit to its
base-36 value instead of using a range test (for 0-9) and switch
(for a-z and A-Z). The table is constructed using a C++14
constexpr function which doesn't assume a particular character
encoding or __CHAR_BIT__ value. This new conversion function
__from_chars_alnum_to_val is templated on whether we care
only about the decimal digits, in which case we can perform the
conversion with a single subtraction since the digit characters
are guaranteed to be contiguous (unlike the letters).
2. Generalize __from_chars_binary to handle all power-of-two bases.
This function (now named __from_chars_pow2_base) is also templated
on whether we care only about the decimal digits for the benefit of
faster digit conversion for base 2, 4 and 8.
3. In __from_chars_digit, use
static_cast<unsigned char>(__c - '0') < __base
instead of
'0' <= __c && __c <= ('0' + (__base - 1)).
as the digit recognition test (exhaustively verified that the two
tests are equivalent).
4. In __from_chars_alnum, use a nested loop to consume the rest of the
digits in the overflow case (mirroring __from_chars_digit) so that
the main loop doesn't have to maintain the overflow flag __valid.
At this point, __from_chars_digit is nearly identical to
__from_chars_alnum, so this patch merges the two functions by removing
the former and templatizing the latter according to whether we care only
about the decimal digits. Finally,
5. In __from_chars_alnum, maintain a lower bound on the number of
unused bits in the result and use it to omit the overflow check
when it's safe to do so.
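The equivalence claimed in item 3 can be checked by brute force. The sketch below uses hypothetical function names and assumes bases 2 through 10, the range for which a digits-only test is meaningful.

```cpp
#include <cassert>
#include <climits>

// The original two-comparison form.
constexpr bool is_digit_two_tests(char c, int base)
{ return '0' <= c && c <= ('0' + (base - 1)); }

// The single-comparison form: characters below '0' wrap around to large
// unsigned values and therefore fail the test.
constexpr bool is_digit_one_test(char c, int base)
{ return static_cast<unsigned char>(c - '0') < base; }

// Compares both forms for every char value and every base in [2, 10].
inline bool tests_agree()
{
  for (int base = 2; base <= 10; ++base)
    for (int c = CHAR_MIN; c <= CHAR_MAX; ++c)
      {
        const char ch = static_cast<char>(c);
        if (is_digit_two_tests(ch, base) != is_digit_one_test(ch, base))
          return false;
      }
  return true;
}
```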
In passing, this patch replaces the non-portable function ascii_to_hexit
used by __floating_from_chars_hex with the new conversion function.
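A rough sketch of how such a table can be built in a C++14 constexpr function without assuming a character encoding: the letters are taken from string literals rather than computed from 'a' and 'A'. The names alnum_table, make_alnum_table and alnum_to_val are hypothetical, and 127 is used here as an arbitrary "not a digit" marker that is larger than any valid base.

```cpp
#include <cassert>
#include <climits>

// One entry per possible unsigned char value.
struct alnum_table { unsigned char data[UCHAR_MAX + 1]; };

constexpr alnum_table make_alnum_table()
{
  constexpr char lower[] = "abcdefghijklmnopqrstuvwxyz";
  constexpr char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  alnum_table t{};
  for (int i = 0; i <= UCHAR_MAX; ++i)
    t.data[i] = 127;  // marker for "not an alphanumeric digit"
  for (int i = 0; i < 10; ++i)
    t.data['0' + i] = static_cast<unsigned char>(i);  // digits are contiguous
  for (int i = 0; i < 26; ++i)
    {
      // Letters need not be contiguous, so index by the literal's characters.
      t.data[static_cast<unsigned char>(lower[i])] = static_cast<unsigned char>(10 + i);
      t.data[static_cast<unsigned char>(upper[i])] = static_cast<unsigned char>(10 + i);
    }
  return t;
}

constexpr unsigned char alnum_to_val(unsigned char c)
{
  constexpr alnum_table table = make_alnum_table();
  return table.data[c];
}
```

A caller can then recognize a digit in any base up to 36 with a single comparison of the form alnum_to_val(c) < base.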
Some runtime measurements for a simple 15-line benchmark that roundtrips
printing/parsing 200 million integers via std::to/from_chars (average of
5 runs):
  Base  Before  After  (seconds, lower is better)
     2    9.37   9.37
     3   15.79  12.13
     8    4.15   3.67
    10    4.90   3.86
    11    6.84   5.03
    16    4.14   2.93
    32    3.85   2.39
    36    5.22   3.26
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_alnum_to_val_table): Define.
(__from_chars_alnum_to_val): Define.
(__from_chars_binary): Rename to ...
(__from_chars_pow2_base): ... this. Generalize to handle any
power-of-two base using __from_chars_alnum_to_val.
(__from_chars_digit): Optimize digit recognition to a single
test instead of two tests. Use [[__unlikely__]] attribute.
(__from_chars_alpha_to_num): Remove.
(__from_chars_alnum): Use __from_chars_alnum_to_val. Use a
nested loop for the overflow case. Maintain a lower bound
on the number of available bits in the result and use it to
omit the overflow check.
(from_chars): Adjust appropriately.
* src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
(__floating_from_chars_hex): Use __from_chars_alnum_to_val
to recognize a hex digit instead.
Compiling the _mm_crc32_u8/16/32/64 intrinsics with -mcrc32
would hit a target-specific option mismatch. Correct the target pragma
to fix this.
gcc/ChangeLog:
* config/i386/smmintrin.h: Correct target pragma from sse4.1
and sse4.2 to crc32 for crc32 intrinsics.
gcc/testsuite/ChangeLog:
* gcc.target/i386/crc32-6.c: Adjust dg-error message.
* gcc.target/i386/crc32-7.c: New test.
There's been an extension for a long time to allow applying 'unsigned' to an
int typedef, but that was confusing the integer promotion code. Fixed by
forgetting about the typedef in that case.
I'm going to make this an unconditional pedwarn in stage 1.
PR c++/102804
gcc/cp/ChangeLog:
* decl.cc (grokdeclarator): Drop typedef used with 'unsigned'.
gcc/testsuite/ChangeLog:
* g++.dg/ext/unsigned-typedef1.C: New test.
The expression pretty-printing code crashed on a location wrapper with no
type, and didn't know what to do with a USING_DECL.
PR c++/102987
gcc/cp/ChangeLog:
* error.cc (dump_expr): Handle USING_DECL.
[VIEW_CONVERT_EXPR]: Just look through location wrapper.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/using1.C: New test.
PR analyzer/105264 reports that the analyzer can fail to treat
(PTR + IDX) and PTR[IDX] as referring to the same memory under
some situations.
There are various ways in which this can happen when IDX is a
symbolic value, due to having several ways in which such memory
regions can be referred to symbolically. I attempted to fix this by
being smarter when folding svalues and regions, but this fix
seems too fiddly to attempt in stage 4.
Instead, this less ambitious patch fixes a false positive from
-Wanalyzer-use-of-uninitialized-value by making the analyzer's escape
analysis smarter, so that it treats *PTR as escaping when
(PTR + OFFSET) is passed to an external function, and thus
it treats *PTR as possibly-initialized (the "passing &PTR[IDX]" case
was already working).
gcc/analyzer/ChangeLog:
PR analyzer/105264
* region-model-reachability.cc (reachable_regions::handle_parm):
Use maybe_get_deref_base_region rather than just region_svalue, to
handle pointer arithmetic also.
* svalue.cc (svalue::maybe_get_deref_base_region): New.
* svalue.h (svalue::maybe_get_deref_base_region): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/105264
* gcc.dg/analyzer/torture/symbolic-10.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Using names depended on <asm/ptrace.h>, which glibc includes somewhere
but musl did not. Change to just always use indexes.
Based on patch by Sören Tempel.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/400214
The constexpr constructor checking code got confused by the expansion of a
trivial copy constructor; we don't need to do that checking for defaulted
ctors, anyway.
PR c++/104646
gcc/cp/ChangeLog:
* constexpr.cc (maybe_save_constexpr_fundef): Don't do extra
checks for defaulted ctors.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-fno-elide-ctors1.C: New test.
Some targets use 'long long unsigned int' for unsigned HW int, and this
leads to a Werror=format= fail for two print cases in jit-playback.cc
introduced in r12-8117-g30f7c83e9cfe (Add support for bitcasts [PR104071])
As discussed on IRC, casting to (long) seems entirely reasonable for the
values (since they are type sizes).
Tested that this fixes bootstrap on x86_64-darwin19 and running check-jit.
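The kind of mismatch involved can be illustrated as follows. This is only a sketch: hwi_like is a hypothetical stand-in for a host where the unsigned HW int is 'long long unsigned int', and describe_sizes is not a function from jit-playback.cc.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the unsigned HW int type on such a host.
using hwi_like = unsigned long long;

inline std::string describe_sizes(hwi_like from_bytes, hwi_like to_bytes)
{
  char buf[96];
  // Passing hwi_like directly to %ld would trigger -Wformat on this host;
  // the values are small type sizes, so casting to long is safe.
  std::snprintf(buf, sizeof buf, "bitcast from %ld bytes to %ld bytes",
                (long) from_bytes, (long) to_bytes);
  return buf;
}
```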
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/jit/ChangeLog:
* jit-playback.cc (new_bitcast): Cast values returned by tree_to_uhwi
to 'long' to match the print format.
When a captured variable is type-dependent, we've expressed the type of the
capture field and proxy with a decltype variant. But if the type is "the
current instantiation", we need to be able to see that so that we can do
lookup inside it just like we could with the captured variable itself.
I also tried looking through lambda capture in
cp_parser_postfix_dot_deref_expression, but this way seems cleaner. I plan
to treat more types as deducible in stage 1.
I considered also using this in do_auto_deduction, but think that would be
wrong: [temp.dep.expr] says an id-expression is type-dependent if it is
"associated by name lookup with a variable declared with a type that
contains a placeholder type where the initializer is type-dependent". That
doesn't clearly exclude deducing a dependent type from the initializer, but
it seems like a barrier, and other implementations agree.
PR c++/82980
gcc/cp/ChangeLog:
* lambda.cc (type_deducible_expression_p): New.
(lambda_capture_field_type): Check it.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-current-inst1.C: New test.