Up until now, due to a latent bug in the code for the ifcvt pass,
irrespective of the branch taken in a conditional statement, the
original condition for the if statement was used in masking the
function call.
Thus, for code such as:
if (a[i] > limit)
b[i] = fixed_const;
else
b[i] = fn (a[i]);
we would generate the following (wrong) if-converted tree code:
_1 = a[i_1];
_2 = _1 > limit;
_3 = .MASK_CALL (fn, _1, _2);
cstore_4 = _2 ? fixed_const : _3;
as opposed to the correct expected sequence:
_1 = a[i_1];
_2 = _1 > limit;
_3 = ~_2;
_4 = .MASK_CALL (fn, _1, _3);
cstore_5 = _2 ? fixed_const : _4;
This patch ensures that the correct predicate mask generation is
carried out such that, upon autovectorization, the correct vector
lanes are selected in the vectorized function call.
gcc/ChangeLog:
* tree-if-conv.cc (predicate_statements): Fix handling of
predicated function calls.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-fncall-mask.c: New.
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
was making an assumption about the form of what a candidate for a dec_insn
should be, which caused an ICE.
This dec_insn is the instruction that decreases the loop counter inside a
decrementing loop and we expect it to have the following form:
(set (reg CONDCOUNT)
(plus (reg CONDCOUNT)
(const_int)))
Where CONDCOUNT is the loop counter, and const int is the negative constant
used to decrement it.
This patch also improves our search for a valid dec_insn. Before this patch
we'd only look for a dec_insn inside the loop header if the loop latch was
empty. We now also search the loop header if the loop latch is not empty but
the last instruction is not a valid dec_insn. This could potentially be improved
to search all instructions inside the loop latch.
gcc/ChangeLog:
* config/arm/arm.cc (check_dec_insn): New helper function containing
code hoisted from...
(arm_mve_dlstp_check_dec_counter): ... here. Use check_dec_insn to
check the validity of the candidate dec_insn.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/dlstp-loop-form.c: New test.
Jason pointed out that the fix I made for PR116722 via r15-3796
introduces a regression when running constexpr-dynamic10.C with
-fimplicit-constexpr.
The problem is that my change makes us leave cxx_eval_call_expression
early, and bypass the call to cxx_eval_thunk_call (through a recursive
call to cxx_eval_call_expression) that used to emit an error for that
testcase with -fimplicit-constexpr.
This patch emits the error if !ctx->quiet before bailing out because the
{con,de}structor belongs to a class with virtual bases.
PR c++/116722
gcc/cp/ChangeLog:
* constexpr.cc (cxx_bind_parameters_in_call): When !ctx->quiet,
emit error before bailing out due to a call to {con,de}structor
for a class with virtual bases.
I missed one that's actually hit quite a lot, hashing of the canonical
type TYPE_HASH.
gcc/cp/
* pt.cc (iterative_hash_template_arg): Use iterative_hash_hashval_t
to hash TYPE_HASH.
Switch exponential transformation in the switch conversion pass
currently generates
tmp1 = __builtin_popcount (var);
tmp2 = tmp1 == 1;
when inserting code to determine if var is power of two. If the target
doesn't support expanding the builtin as special instructions switch
conversion relies on this whole pattern being expanded as bitmagic.
However, it is possible that other GIMPLE optimizations move the two
statements of the pattern apart. In that case the builtin becomes a
libgcc call in the final binary. The call is slow and in case of
freestanding programs can result in linking error (this bug was
originally found while compiling Linux kernel).
This patch modifies switch conversion to insert the bitmagic
(var ^ (var - 1)) > (var - 1) instead of the builtin.
gcc/ChangeLog:
PR tree-optimization/116616
* tree-switch-conversion.cc (can_pow2p): Remove this function.
(gen_pow2p): Generate bitmagic instead of a builtin. Remove the
TYPE parameter.
(switch_conversion::is_exp_index_transform_viable): Don't call
can_pow2p.
(switch_conversion::exp_index_transform): Call gen_pow2p without
the TYPE parameter.
* tree-switch-conversion.h: Remove
m_exp_index_transform_pow2p_type.
gcc/testsuite/ChangeLog:
PR tree-optimization/116616
* gcc.target/i386/switch-exp-transform-1.c: Don't test for
presence of the POPCOUNT internal fn call.
Signed-off-by: Filip Kastl <fkastl@suse.cz>
Using iterative_hash_object is expensive compared to using
iterative_hash_hashval_t which is fit for integer sized values.
The following reduces the number of perf cycles spent in
iterative_hash_template_arg and iterative_hash combined by 20%.
gcc/cp/
* pt.cc (iterative_hash_template_arg): Avoid using
iterative_hash_object.
We can now SLP the loop. There's PR116583 tracking that this still
fails for VLA vectors when load-lanes doesn't support a group of
size 8. We can't express this right now so the testcase keeps
FAILing for aarch64 with SVE (but passes now for riscv).
* gcc.dg/vect/slp-12a.c: Adjust.
We can now vectorize the first loop with SLP when using V2SImode
vectors since then we can handle the non-power-of-two interleaving.
We can also SLP the second loop reliably now after adding induction
support for VLA vectors.
* gcc.dg/vect/slp-19c.c: Adjust expectation.
The condition on "vectorizing stmts using SLP" needs to match that
of "vectorized 1 loops", obviously.
PR testsuite/116596
* gcc.dg/vect/slp-11a.c: Fix.
PTA asserts that EAF_NO_DIRECT_READ is not set when flags are
set consistently which doesn't make sense. The following removes
the assert.
PR tree-optimization/113197
* tree-ssa-structalias.cc (handle_call_arg): Remove bougs
assert.
* gcc.dg/lto/pr113197_0.c: New testcase.
* gcc.dg/lto/pr113197_1.c: Likewise.
The testcase in PR114855 shows profile prediction to evaluate
the same SSA def via expr_expected_value for each condition or
switch in a function. The following patch caches the expected
value (and probability/predictor) for each visited SSA def,
also protecting against recursion and thus obsoleting the visited
bitmap. This reduces the time spent in branch prediction from
1.2s to 0.3s, though callgrind which was how I noticed this
seems to be comparatively very much happier about the change than
this number suggests.
PR tree-optimization/114855
* predict.cc (ssa_expected_value): New global.
(expr_expected_value): Do not take bitmap.
(expr_expected_value_1): Likewise. Use ssa_expected_value
to cache results for a SSA def.
(tree_predict_by_opcode): Adjust.
(tree_estimate_probability): Manage ssa_expected_value.
(tree_guess_outgoing_edge_probabilities): Likewise.
We were using the empty string "" for D_T_FMT and ERA_D_T_FMT in the C
locale, instead of "%a %b %e %T %Y" as the C standard requires. Set it
correctly for each locale implementation that defines time_members.cc.
We can also explicitly set the _M_era_xxx pointers to the same values as
the corresponding _M_xxx ones, rather than setting them to point to
identical string literals. This doesn't rely on the compiler merging
string literals, and makes it more explicit that they're the same in the
C locale.
libstdc++-v3/ChangeLog:
* config/locale/dragonfly/time_members.cc
(__timepunct<char>::_M_initialize_timepunc)
(__timepunct<wchar_t>::_M_initialize_timepunc): Set
_M_date_time_format for C locale. Set %Ex formats to the same
values as the %x formats.
* config/locale/generic/time_members.cc: Likewise.
* config/locale/gnu/time_members.cc: Likewise.
* testsuite/22_locale/time_get/get/char/5.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/5.cc: New test.
I noticed that chrono::parse was using duration_cast and time_point_cast
to convert the parsed value to the result. Those functions truncate
towards zero, which is not generally what you want. Especially for
negative times before the epoch, where truncating towards zero rounds
"up" towards the next duration/time_point. Using chrono::round is
typically better, as that rounds to nearest.
However, while testing the fix I realised that rounding to the nearest
can give surprising results in some cases. For example if we parse a
chrono::sys_days using chrono::parse("F %T", "2024-09-22 18:34:56", tp)
then we will round up to the next day, i.e. sys_days(2024y/09/23). That
seems surprising, and I think 2024-09-22 is what most users would
expect.
This change attempts to provide a hybrid rounding heuristic where we use
chrono::round for the general case, but when the result has a period
that is one of minutes, hours, days, weeks, or years then we truncate
towards negative infinity using chrono::floor. This means that we
truncate "2024-09-22 18:34:56" to the start of the current
minute/hour/day/week/year, instead of rounding up to 2024-09-23, or to
18:35, or 17:00. For a period of months chrono::round is used, because
the months duration is defined as a twelfth of a year, which is not
actually the length of any calendar month. We don't want to truncate to
a whole number of "months" if that can actually go from e.g. 2023-03-01
to 2023-01-31, because February is shorter than chrono::months(1).
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__detail::__use_floor): New
function.
(__detail::__round): New function.
(from_stream): Use __detail::__round.
* testsuite/std/time/clock/file/io.cc: Check for expected
rounding in parse.
* testsuite/std/time/clock/gps/io.cc: Likewise.
For 32-bit targets __INT64_TYPE__ expands to long long, which gives a
pedwarn for C++98 mode, causing:
FAIL: 17_intro/headers/c++1998/all_pedantic_errors.cc -std=gnu++98 (test for excess errors)
Excess errors:
.../bits/postypes.h:64: error: ISO C++ 1998 does not support 'long long' [-Wlong-long]
libstdc++-v3/ChangeLog:
* include/bits/postypes.h: Fix -Wlong-long warning.
The following adds SLP support for vectorizing single-lane inductions
with variable length vectors.
PR tree-optimization/116566
* tree-vect-loop.cc (vectorizable_induction): Handle single-lane
SLP for VLA vectors.
* gcc.dg/tree-ssa/reassoc-46.c: When using partial vectors
the dump-scan doesn't look for the required .COND_ADD so
skip for partial vectors.
The following patch implements the clang -Wheader-guard warning, which warns
if a valid multiple inclusion header guard's #ifndef/#if !defined directive
is immediately (no other non-line directives nor other (non-comment)
tokens in between) followed by #define directive for some different macro,
which in get_suggestion rules is close enough to the actual header guard
macro (i.e. likely misspelling), the #define is object-like with empty
definition (I've followed what clang implements) and the macro isn't defined
later on (at least not on the final #endif at the end of a header).
In this case it emits a warning, so that
#ifndef STDIO_H
#define STDOI_H
...
#endif
or similar misspellings can be caught.
clang enables this warning by default, but I've put it into -Wall instead
as it still seems to be a style warning, nothing more severe; if a header
doesn't survive multiple inclusion because of the misspelling, users will
get different diagnostics.
2024-10-02 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/96842
libcpp/
* include/cpplib.h (struct cpp_options): Add warn_header_guard member.
(enum cpp_warning_reason): Add CPP_W_HEADER_GUARD enumerator.
* internal.h (struct cpp_reader): Add mi_def_cmacro, mi_loc and
mi_def_loc members.
(_cpp_defined_macro_p): Constify type pointed by argument type.
Formatting fix.
* init.cc (cpp_create_reader): Clear
CPP_OPTION (pfile, warn_header_guard).
* directives.cc (struct if_stack): Add def_loc and mi_def_cmacro
members.
(DIRECTIVE_TABLE): Add IF_COND flag to define.
(do_define): Set ifs->mi_def_cmacro on a define immediately following
#ifndef directive for the guard. Clear pfile->mi_valid. Formatting
fix.
(do_endif): Copy over pfile->mi_def_cmacro and pfile->mi_def_loc
if ifs->mi_def_cmacro is set and pfile->mi_cmacro isn't a defined
macro.
(push_conditional): Clear mi_def_cmacro and mi_def_loc members.
* files.cc (_cpp_pop_file_buffer): Emit -Wheader-guard diagnostics.
gcc/
* doc/invoke.texi (Wheader-guard): Document.
gcc/c-family/
* c.opt (Wheader-guard): New option.
* c.opt.urls: Regenerated.
* c-ppoutput.cc (init_pp_output): Initialize also cb->get_suggestion.
gcc/testsuite/
* c-c++-common/cpp/Wheader-guard-1.c: New test.
* c-c++-common/cpp/Wheader-guard-1-1.h: New test.
* c-c++-common/cpp/Wheader-guard-1-2.h: New test.
* c-c++-common/cpp/Wheader-guard-1-3.h: New test.
* c-c++-common/cpp/Wheader-guard-1-4.h: New test.
* c-c++-common/cpp/Wheader-guard-1-5.h: New test.
* c-c++-common/cpp/Wheader-guard-1-6.h: New test.
* c-c++-common/cpp/Wheader-guard-1-7.h: New test.
* c-c++-common/cpp/Wheader-guard-1-8.h: New test.
* c-c++-common/cpp/Wheader-guard-1-9.h: New test.
* c-c++-common/cpp/Wheader-guard-1-10.h: New test.
* c-c++-common/cpp/Wheader-guard-1-11.h: New test.
* c-c++-common/cpp/Wheader-guard-1-12.h: New test.
* c-c++-common/cpp/Wheader-guard-2.c: New test.
* c-c++-common/cpp/Wheader-guard-2.h: New test.
* c-c++-common/cpp/Wheader-guard-3.c: New test.
* c-c++-common/cpp/Wheader-guard-3.h: New test.
It seems that we currently require
1) enabling at least c,c++,fortran,d in --enable-languages
2) first doing make html
before one can successfully regenerate-opt-urls, otherwise without 2)
one gets
make regenerate-opt-urls
make: *** No rule to make target '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gcc/Option-Index.html', needed by 'regenerate-opt-urls'. Stop.
or say if not configuring d after make html one still gets
make regenerate-opt-urls
make: *** No rule to make target '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gdc/Option-Index.html', needed by 'regenerate-opt-urls'. Stop.
Now, I believe neither 1) nor 2) is really necessary.
The regenerate-opt-urls goal has dependency on 3 Option-Index.html files,
but those files don't have dependencies how to generate them.
make html has dependency on $(HTMLS_BUILD) which adds
$(build_htmldir)/gcc/index.html and lang.html among other things, where
the former actually builds not just index.html but also Option-Index.html
and tons of other files, and lang.html is filled in by configure depending
on configured languages, so sometimes will include gfortran.html and
sometimes d.html.
The following patch adds dependencies of the Option-Index.html on their
corresponding index.html files and that is all that seems to be needed,
make regenerate-opt-urls then works even without prior make html and
even if just a subset of c/c++, fortran and d is enabled.
2024-10-02 Jakub Jelinek <jakub@redhat.com>
* Makefile.in ($(OPT_URLS_HTML_DEPS)): Add dependencies of the
Option-Index.html files on the corresponding index.html files.
Don't mention the requirement that all languages that have their own
HTML manuals to be enabled.
The problem here is remove_unused_var is called on a name that is
defined by a phi node but it deletes it like removing a normal statement.
remove_phi_node should be called rather than gsi_remove for phinodes.
Note there is a possibility of using simple_dce_from_worklist instead
but that is for another day.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116922
gcc/ChangeLog:
* gimple-ssa-backprop.cc (remove_unused_var): Handle phi
nodes correctly.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr116922.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
gcc.target/powerpc/p9-vec-length-full-8.c was expecting all loops to
use -with-len fully masked vectorization to avoid epilogues because
the loops needed peeling for gaps. With SLP we have improved things
here and the loops using V2D[IF]mode no longer need peeling for gaps
since the target can compose those vectors from two scalars and
in turn we generate better code and not need an epilogue either
(the iteration count divides by the VF).
PR testsuite/116654
* gcc.target/powerpc/p9-vec-length-full-8.c: Adjust.
With single-lane SLP we miss to use the power realing loads causing
some testsuite FAILs. r14-2430-g4736ddd11874fe exempted SLP of
non-grouped accesses because that could have been only splats
where the scheme isn't used anyway, but now with single-lane SLP
it can be contiguous accesses.
PR tree-optimization/116654
* tree-vect-data-refs.cc (vect_supportable_dr_alignment):
Treat non-grouped accesses like non-SLP.
Form 2:
#define DEF_SAT_S_SUB_FMT_2(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_sub_##T##_fmt_1 (T x, T y) \
{ \
T minus = (UT)x - (UT)y; \
if ((x ^ y) >= 0 || (minus ^ x) >= 0) \
return minus; \
return x < 0 ? MIN : MAX; \
}
DEF_SAT_S_SUB_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)
The below test are passed for this patch.
* The rv64gcv fully regression test.
It is test only patch and obvious up to a point, will commit it
directly if no comments in next 48H.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_sub-2-i16.c: New test.
* gcc.target/riscv/sat_s_sub-2-i32.c: New test.
* gcc.target/riscv/sat_s_sub-2-i64.c: New test.
* gcc.target/riscv/sat_s_sub-2-i8.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i16.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i32.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i64.c: New test.
* gcc.target/riscv/sat_s_sub-run-2-i8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
There have been various problems with -std=c++14 -fconcepts; let's stop
defining the feature test macro in that case.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Don't define __cpp_concepts
before C++17.
Introduce two new unspecs, UNSPEC_COND_SMAX and UNSPEC_COND_SMIN,
corresponding to rtl operators smax and smin. UNSPEC_COND_SMAX is used
to generate fmaxnm instruction and UNSPEC_COND_SMIN is used to generate
fminnm instruction.
With these new unspecs, we can generate SVE2 max/min instructions using
existing generic unpredicated and predicated instruction patterns that
use optab attribute. Thus, we have removed specialised instruction
patterns for max/min instructions that were using
SVE_COND_FP_MAXMIN_PUBLIC iterator.
No new test cases as the existing test cases should be enough to test
this refactoring.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md
(<fmaxmin><mode>3): Remove this instruction pattern.
(cond_<fmaxmin><mode>): Remove this instruction pattern.
* config/aarch64/iterators.md: New unspecs and changes to
iterators and attrs to use the new unspecs
gcc/fortran/ChangeLog:
* check.cc (int_or_real_or_char_or_unsigned_check_f2003): New function.
(gfc_check_minval_maxval): Use it.
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxval): Handle
initial values for UNSIGNED.
* gfortran.texi: Document MINVAL and MAXVAL for unsigned.
libgfortran/ChangeLog:
* Makefile.am: Add minval and maxval files.
* Makefile.in: Regenerated.
* gfortran.map: Add new functions.
* generated/maxval_m1.c: New file.
* generated/maxval_m16.c: New file.
* generated/maxval_m2.c: New file.
* generated/maxval_m4.c: New file.
* generated/maxval_m8.c: New file.
* generated/minval_m1.c: New file.
* generated/minval_m16.c: New file.
* generated/minval_m2.c: New file.
* generated/minval_m4.c: New file.
* generated/minval_m8.c: New file.
gcc/testsuite/ChangeLog:
* gfortran.dg/unsigned_34.f90: New test.
The testcase is miscompiled with -O -flto beccause the three optimizations
NRV + RSO + inlining are applied to the same call: when the LHS of the call
is marked write-only before inlining, it will keep the mark after inlining
although it may be read in GIMPLE from that point on.
The fix is to apply the removal of the store, that would have been applied
later if the call was not inlined, right before inlining, which will prevent
the problematic references to the LHS from being generated during inlining.
gcc/
* tree-inline.cc (expand_call_inline): Remove the store to the
return slot if it is a global variable that is only written to.
gcc/testsuite/
* gnat.dg/lto28.adb: New test.
* gnat.dg/lto28_pkg1.ads: New helper.
* gnat.dg/lto28_pkg2.ads: Likewise.
* gnat.dg/lto28_pkg2.adb: Likewise.
* gnat.dg/lto28_pkg3.ads: Likewise.
0001-RISC-V-libgcc-Fix-incorrect-and-missing-.cfi_offset-.patch
From 06a370a0a2329dd4da0ffcab7c35ea7df2353baf Mon Sep 17 00:00:00 2001
From: Jim Lin <jim@andestech.com>
Date: Tue, 1 Oct 2024 14:42:56 +0800
Subject: [PATCH] RISC-V/libgcc: Fix incorrect and missing .cfi_offset for
__riscv_save_[0-3] on RV32.
libgcc/ChangeLog:
* config/riscv/save-restore.S: Fix .cfi_offset for
__riscv_save_[0-3] on RV32.
P2985R0 (C++26) introduces std::is_virtual_base_of; this is the compiler
builtin that will back up the library trait (which strictly requires
compiler support).
The name has been chosen to match LLVM/MSVC's, as per the discussion
here:
https://github.com/llvm/llvm-project/issues/98310
The actual user-facing type trait in libstdc++ will be added later.
gcc/cp/ChangeLog:
* constraint.cc (diagnose_trait_expr): New diagnostic.
* cp-trait.def (IS_VIRTUAL_BASE_OF): New builtin.
* cp-tree.h (enum base_access_flags): Add a new flag to be
able to request a search for a virtual base class.
* cxx-pretty-print.cc (pp_cxx_userdef_literal): Update the
list of GNU extensions to the grammar.
* search.cc (struct lookup_base_data_s): Add a field to
request searching for a virtual base class.
(dfs_lookup_base): Add the ability to look for a virtual
base class.
(lookup_base): Forward the flag to dfs_lookup_base.
* semantics.cc (trait_expr_value): Implement the builtin
by calling lookup_base with the new flag.
(finish_trait_expr): Handle the new builtin.
gcc/ChangeLog:
* doc/extend.texi: Document the new
__builtin_is_virtual_base_of builtin; amend the docs for
__is_base_of.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_virtual_base_of.C: New test.
* g++.dg/ext/is_virtual_base_of_diagnostic.C: New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Take:
```
if (t_3(D) != 0)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>
_8 = c_4(D) != 0;
_9 = (int) _8;
<bb 4>
# e_2 = PHI <_9(3), 0(2)>
```
We should factor out the conversion here as that will allow a simplfication to
`(t_3 != 0) & (c_4 != 0)`. Unlike most other types; `a ? b : CST` will simplify
for boolean result type to either `a | b` or `a & b` so allowing this conversion
for all operations will be always profitable.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Note on the phi-opt-7.c testcase change, we are now able to optimize this
and remove the if due to the factoring out now so this is an improvement.
PR tree-optimization/116890
gcc/ChangeLog:
* tree-ssa-phiopt.cc (factor_out_conditional_operation): Conversions
from bool is also should be considered as wanting to happen.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-7.c: Update testcase for no ifs left.
* gcc.dg/tree-ssa/phi-opt-42.c: New test.
* gcc.dg/tree-ssa/phi-opt-43.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch introduces the procedure function FindIndice to library
module Indexing.
gcc/m2/ChangeLog:
* gm2-libs/Indexing.def (FindIndice): New procedure
function.
* gm2-libs/Indexing.mod (FindIndice): Implement new
procedure function.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
This patch fixes the syntax for the generated swig interface file.
The % characters in fprintf require escaping.
gcc/m2/ChangeLog:
PR modula2/116918
* gm2-compiler/M2Swig.mod (AnnotateProcedure): Capitalize
the generated comment, split comment into multiple lines and
terminate the comment with ". */".
(DoCheckUnbounded): Escape the % character with %%.
(DoWriteFile): Ditto.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
The ACLE defines a new scalar type, __mfp8. This is an opaque 8bit types that
can only be used by fp8 intrinsics. Additionally, the mfloat8_t type is made
available in arm_neon.h and arm_sve.h as an alias of the same.
This implementation uses an unsigned INTEGER_TYPE, with precision 8 to
represent __mfp8. Conversions to int and other types are disabled via the
TARGET_INVALID_CONVERSION hook.
Additionally, operations that are typically available to integer types are
disabled via TARGET_INVALID_UNARY_OP and TARGET_INVALID_BINARY_OP hooks.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc (aarch64_mfp8_type_node): Add node
for __mfp8 type.
(aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type.
(aarch64_init_fp8_types): New function to initialise fp8 types and
register with language backends.
* config/aarch64/aarch64.cc (aarch64_mangle_type): Add ABI mangling for
new type.
(aarch64_invalid_conversion): Add function implementing
TARGET_INVALID_CONVERSION hook that blocks conversion to and from the
__mfp8 type.
(aarch64_invalid_unary_op): Add function implementing TARGET_UNARY_OP
hook that blocks operations on __mfp8 other than &.
(aarch64_invalid_binary_op): Extend TARGET_BINARY_OP hook to disallow
operations on __mfp8 type.
(TARGET_INVALID_CONVERSION): Add define.
(TARGET_INVALID_UNARY_OP): Likewise.
* config/aarch64/aarch64.h (aarch64_mfp8_type_node): Add node for __mfp8
type.
(aarch64_mfp8_ptr_type_node): Add node for __mfp8 pointer type.
* config/aarch64/arm_private_fp8.h (mfloat8_t): Add typedef.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/fp8_mangling.C: New tests exercising mangling.
* g++.target/aarch64/fp8_scalar_typecheck_2.C: New tests in C++.
* gcc.target/aarch64/fp8_scalar_1.c: New tests in C.
* gcc.target/aarch64/fp8_scalar_typecheck_1.c: Likewise.
This is another case of load hoisting breaking UID order in the
preheader, this time between two hoistings. The easiest way out is
to do what we do for the main stmt - copy instead of move.
PR tree-optimization/116902
PR tree-optimization/116842
* tree-vect-stmts.cc (sort_after_uid): Remove again.
(hoist_defs_of_uses): Copy defs instead of hoisting them so
we can zero their UID.
(vectorizable_load): Separate analysis and transform call,
do transform on the stmt copy.
* g++.dg/torture/pr116902.C: New testcase.
The following avoids querying ranges of vector entities.
PR tree-optimization/116905
* tree-vect-stmts.cc (supportable_indirect_convert_operation):
Fix guard for vect_get_range_info.
* gcc.dg/pr116905.c: New testcase.
When we're computing ANTIC for PRE we treat edges to not yet visited
blocks as having a maximum ANTIC solution to get at an optimistic
solution in the iteration. That assumes the edges visted eventually
execute. This is a wrong assumption that can lead to wrong code
(and not only non-optimality) when possibly trapping expressions
are involved as the testcases in the PR show. The following mitigates
this by pruning trapping expressions from ANTIC computed when
maximum sets are involved.
PR tree-optimization/116906
* tree-ssa-pre.cc (prune_clobbered_mems): Add clean_traps
argument.
(compute_antic_aux): Direct prune_clobbered_mems to prune
all traps when any MAX solution was involved in the ANTIC
computation.
(compute_partial_antic_aux): Adjust.
* gcc.dg/pr116906-1.c: New testcase.
* gcc.dg/pr116906-2.c: Likewise.
Ranger cache during initialization reserves number of basic block slots
in the m_workback vector (in an inefficient way by doing create (0)
and safe_grow_cleared (n) and truncate (0) rather than say create (n)
or reserve (n) after create). The problem is that some passes split bbs and/or
create new basic blocks and so when one is unlucky, the quick_push into that
vector can ICE.
The following patch replaces those 4 quick_push calls with safe_push.
I've also gathered some statistics from compiling 63 gcc sources (picked those
that dependent on gimple-range-cache.h just because I had to rebuild them once
for the instrumentation), and that showed that in 81% of cases nothing has
been pushed into the vector at all (and note, not everything was small, there
were even cases with 10000+ basic blocks), another 18.5% of cases had just 1-4
elements in the vector at most, 0.08% had 5-8 and 19 out of 305386 cases
had at most 9-11 elements, nothing more. So, IMHO reserving number of basic
block in the vector is a waste of compile time memory and with the safe_push
calls, the patch just initializes the vector to vNULL.
2024-10-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/116899
* gimple-range-cache.cc (ranger_cache::ranger_cache): Set m_workback
to vNULL instead of creating it, growing and then truncating.
(ranger_cache::fill_block_cache): Use safe_push rather than quick_push
on m_workback.
(ranger_cache::range_from_dom): Likewise.
* gcc.dg/bitint-111.c: New test.
Some passes like the bitint lowering queue some statements on edges and only
commit them at the end of the pass. If they use ranger at the same time,
the ranger might see such SSA_NAMEs and ICE on those. The following patch
instead just punts on them.
2024-10-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/116898
* gimple-range-cache.cc (ranger_cache::block_range): If a SSA_NAME
with NULL def_bb isn't SSA_NAME_IS_DEFAULT_DEF, return false instead
of failing assertion. Formatting fix.
* gcc.dg/bitint-110.c: New test.
There are 100+ regressions when running the g++ testsuite for newlib
targets (probably excepting ARM-based ones) e.g cris-elf after commit
r15-3859-g63a598deb0c9fc "libstdc++: #ifdef out #pragma GCC
system_header", which effectively no longer silences warnings for
gcc-installed system headers. Some of these regressions are fixed by
r15-3928. For the remaining ones, there's in g++.log:
FAIL: g++.old-deja/g++.robertl/eb79.C -std=c++26 (test for excess errors)
Excess errors:
/gccobj/cris-elf/libstdc++-v3/include/cris-elf/bits/ctype_base.h:50:53: \
warning: overflow in conversion from 'int' to 'std::ctype_base::mask' \
{aka 'char'} changes value from '151' to '-105' [-Woverflow]
This is because the _B macro in newlib's ctype.h (from where the
"_<letter>" macros come) is bit 7, the sign-bit of 8-bit types:
#define _B 0200
Using it in an int-expression that is then truncated to 8 bits will
"change" the value to negative for a default-signed char. If this
code was created from scratch, it should have been an unsigned type,
however it's not advisable to change the type of mask as this affects
the API. The least ugly option seems to be to silence the warning by
explict casts in the initializer, and for consistency, doing it for
all members.
PR libstdc++/116895
* config/os/newlib/ctype_base.h: Avoid signed-overflow warnings by
explicitly casting initializer expressions to mask.
1) We're hitting the assert in cp_parser_placeholder_type_specifier.
It says that if it turns out to be false, we should do error() instead.
Do so, then.
2) lambda-targ8.C should compile fine, though. The problem was that
local_variables_forbidden_p wasn't cleared when we're about to parse
the optional template-parameter-list for a lambda in a default argument.
PR c++/109859
gcc/cp/ChangeLog:
* parser.cc (cp_parser_lambda_declarator_opt): Temporarily clear
local_variables_forbidden_p.
(cp_parser_placeholder_type_specifier): Turn an assert into an
error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-defarg3.C: New test.
* g++.dg/cpp2a/lambda-targ8.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>