When SLP analysis discards an instance because it fails to analyze, we
can end up calling vectorizable_* in analysis mode again on a node that
was already analyzed during the analysis of that instance.
vectorizable_simd_clone_call wasn't expecting that and instead
guarded analysis/transform code on populated data structures.
The following changes it so it survives re-analysis.
PR tree-optimization/116674
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Support
re-analysis.
* g++.dg/vect/pr116674.cc: New testcase.
Together with the preparatory compiler patches, this patch restores
unrolling in std::__find_if, but this time relying on the compiler to do
it by using:
#pragma GCC unroll 4
which should recover most of the regression relative to the
hand-unrolled version while still being vectorizable with WIP alignment
peeling enhancements.
On Neoverse V1 with LTO, this reduces the regression in xalancbmk (from
SPEC CPU 2017) from 5.8% to 1.7% (restoring ~71% of the lost
performance).
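As a rough illustration of the approach (not the actual libstdc++
implementation), a find_if-style loop annotated with the pragma might
look like the following. The name find_if_unrolled is hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for a find_if-style loop; not the libstdc++ code.
template <typename Iterator, typename Predicate>
Iterator find_if_unrolled(Iterator first, Iterator last, Predicate pred)
{
  // Ask GCC to unroll the loop four times instead of unrolling by hand.
  // The loop is correct whether or not the request is honoured.
#pragma GCC unroll 4
  while (first != last && !pred(*first))
    ++first;
  return first;
}
```

The loop body stays a single simple statement, which is what leaves
room for the vectorizer, whereas a hand-unrolled version fixes the
shape of the code in the source.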
libstdc++-v3/ChangeLog:
PR libstdc++/116140
* include/bits/stl_algobase.h (std::__find_if): Add #pragma to
request GCC to unroll the loop.
When #pragma GCC unroll is processed in
tree-cfg.cc:replace_loop_annotate_in_block, we set both the loop->unroll
field (which is currently streamed out and back in during LTO) and
the cfun->has_unroll flag.
cfun->has_unroll, however, is not currently streamed during LTO. This
patch fixes that.
Prior to this patch, loops marked with #pragma GCC unroll that would be
unrolled by RTL loop2_unroll in a non-LTO compilation didn't get
unrolled under LTO.
gcc/ChangeLog:
PR libstdc++/116140
* lto-streamer-in.cc (input_struct_function_base): Stream in
fn->has_unroll.
* lto-streamer-out.cc (output_struct_function_base): Stream out
fn->has_unroll.
gcc/testsuite/ChangeLog:
PR libstdc++/116140
* g++.dg/ext/pragma-unroll-lambda-lto.C: New test.
I noticed while working on a test that uses LTO and requests a dump
file, that we are failing to cleanup ltrans dump files in the testsuite.
E.g. the test I was working on compiles with -flto
-fdump-rtl-loop2_unroll, and we end up with the following file:
./gcc/testsuite/g++/pr116140.ltrans0.ltrans.287r.loop2_unroll
being left behind by the testsuite. This is problematic not just from a
"missing cleanup" POV, but also because it can cause the test to pass
spuriously when the test is re-run with an unpatched compiler (without
the bug fix). In the broken case, loop2_unroll isn't run at all, so we
end up scanning the old dumpfile (from the previous test run) and making
the dumpfile scan pass.
Running with `-v -v` in RUNTESTFLAGS we can see the following cleanup
attempt is made:
remove-build-file `pr116140.{C,exe}.{ltrans[0-9]*.,}[0-9][0-9][0-9]{l,i,r,t}.*'
looking again at the ltrans dump file above we can see this will fail for two
reasons:
- The actual dump file has no {C,exe} extension between the basename and
ltrans0.
- The actual dump file has an additional `.ltrans` component after `.ltrans0`.
This patch therefore relaxes the pattern constructed for cleaning up such
dumpfiles to also match dumpfiles with the above form.
Running the testsuite before/after this patch shows the number of files in
gcc/testsuite (in the build dir) with "ltrans" in the name goes from 1416 to 62
on aarch64.
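To see why the old pattern misses the file, the glob can be
transliterated into a regular expression and tried against the file
name. The transliteration below is an assumption of this sketch (the
testsuite itself uses Tcl globbing), and old_pattern_matches is a
hypothetical helper.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Regex transliteration (an assumption of this sketch) of the old glob
//   pr116140.{C,exe}.{ltrans[0-9]*.,}[0-9][0-9][0-9]{l,i,r,t}.*
// The glob's `*` becomes `.*` and each brace alternation becomes a group.
inline bool old_pattern_matches(const std::string& name)
{
  static const std::regex old_pattern(
      R"(pr116140\.(C|exe)\.(ltrans[0-9].*\.)?[0-9]{3}[lirt]\..*)");
  return std::regex_match(name, old_pattern);
}
```

With this helper, a name such as pr116140.C.ltrans0.287r.loop2_unroll
matches, while the leftover pr116140.ltrans0.ltrans.287r.loop2_unroll
does not, because it lacks the {C,exe} component.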
gcc/testsuite/ChangeLog:
PR libstdc++/116140
* lib/gcc-dg.exp (schedule-cleanups): Relax ltrans dumpfile
cleanup pattern to handle missing cases.
For the testcase added with this patch, we would end up losing the:
#pragma GCC unroll 4
and emitting "warning: ignoring loop annotation". That warning comes
from tree-cfg.cc:replace_loop_annotate, and means that we failed to
process the ANNOTATE_EXPR in tree-cfg.cc:replace_loop_annotate_in_block.
That function walks backwards over the GIMPLE in an exiting BB for a
loop, skipping over the final gcond, and looks for any ANNOTATE_EXPRS
immediately preceding the gcond.
The function documents the following pre-condition:
/* [...] We assume that the annotations come immediately before the
condition in BB, if any. */
now looking at the exiting BB of the loop, we have:
<bb 8> :
D.4524 = .ANNOTATE (iftmp.1, 1, 4);
retval.0 = D.4524;
if (retval.0 != 0)
goto <bb 3>; [INV]
else
goto <bb 9>; [INV]
and crucially there is an intervening assignment between the gcond and
the preceding .ANNOTATE ifn call. To see where this comes from, we can
look to the IR given by -fdump-tree-original:
if (<<cleanup_point ANNOTATE_EXPR <first != last && !use_find(short
int*)::<lambda(short int)>::operator() (&pred, *first), unroll 4>>>)
goto <D.4518>;
else
goto <D.4516>;
here the problem is that we've wrapped a CLEANUP_POINT_EXPR around the
ANNOTATE_EXPR, meaning the ANNOTATE_EXPR is no longer the outermost
expression in the condition.
The CLEANUP_POINT_EXPR gets added by the following call chain:
finish_while_stmt_cond
-> maybe_convert_cond
-> condition_conversion
-> fold_build_cleanup_point_expr
this patch chooses to fix the issue by first introducing a new helper
class (annotate_saver) to save and restore outer chains of
ANNOTATE_EXPRs and then using it in maybe_convert_cond.
With this patch, we don't get any such warning and the loop gets unrolled as
expected at -O2.
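The save/restore idea can be modelled outside the compiler. The sketch
below uses a toy expression type rather than GCC trees; Expr, wrap and
convert_keeping_annotations are hypothetical names, and the real
annotate_saver may well be structured differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node; a hypothetical stand-in for GCC trees.
struct Expr {
  std::string kind;               // "annotate", "cleanup_point" or "cond"
  std::unique_ptr<Expr> operand;  // wrapped expression, null for "cond"
};
typedef std::unique_ptr<Expr> ExprPtr;

inline ExprPtr wrap(std::string kind, ExprPtr inner)
{
  ExprPtr e(new Expr);
  e->kind = std::move(kind);
  e->operand = std::move(inner);
  return e;
}

// Peel the outer chain of "annotate" wrappers, run `convert` on the bare
// condition, then re-attach the saved wrappers so they stay outermost.
template <typename Convert>
ExprPtr convert_keeping_annotations(ExprPtr cond, Convert convert)
{
  std::vector<ExprPtr> saved;  // outermost wrapper first
  while (cond && cond->kind == "annotate") {
    ExprPtr inner = std::move(cond->operand);
    saved.push_back(std::move(cond));
    cond = std::move(inner);
  }
  cond = convert(std::move(cond));
  while (!saved.empty()) {     // restore innermost wrapper first
    saved.back()->operand = std::move(cond);
    cond = std::move(saved.back());
    saved.pop_back();
  }
  return cond;
}
```

If convert wraps its argument in a "cleanup_point" node, the result is
annotate(cleanup_point(cond)) rather than cleanup_point(annotate(cond)),
which is the ordering the later annotation processing expects.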
gcc/cp/ChangeLog:
PR libstdc++/116140
* semantics.cc (annotate_saver): New. Use it ...
(maybe_convert_cond): ... here, to ensure any ANNOTATE_EXPRs
remain the outermost expression(s) of the condition.
gcc/testsuite/ChangeLog:
PR libstdc++/116140
* g++.dg/ext/pragma-unroll-lambda.C: New test.
The undefined std::ios_base_library_init() symbol that is referenced by
<iostream> is only supposed to be used for targets where symbol
versioning is supported.
The mingw-w64 target defaults to --enable-symvers=gnu due to using GNU
ld but doesn't actually support symbol versioning. This means it tries
to emit references to the std::ios_base_library_init() symbol, which
isn't really defined in the library. This causes problems when using lld
to link user binaries.
Disable the undefined symbol reference for non-ELF targets.
libstdc++-v3/ChangeLog:
PR libstdc++/116159
* include/std/iostream (ios_base_library_init): Only define for
ELF targets.
* src/c++98/ios_init.cc (ios_base_library_init): Likewise.
The changes to implement LWG 2579 (r10-327-gdb33efde17932f) made
std::string::assign use the propagate_on_container_copy_assignment
(POCCA) trait, for consistency with operator=(const basic_string&).
However, this also unintentionally affected operator=(basic_string&&)
which calls assign(str) to make a deep copy when performing a move is
not possible. The fix is for the move assignment operator to call
_M_assign(str) instead of assign(str), as this just does the deep copy
and doesn't check the POCCA trait first.
The bug only affects the unlikely/useless combination of POCCA==true and
POCMA==false, but we should fix it for correctness anyway. It should
also make move assignment slightly cheaper to compile and execute,
because we skip the extra code in assign(const basic_string&).
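For reference, the trait combination that reaches the affected path
can be set up as follows. TaggedAlloc, TaggedString and
move_assign_unequal are hypothetical names used only for this sketch;
it is not the testcase added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stateful allocator: propagates on copy assignment (POCCA)
// but not on move assignment (POCMA).
template <typename T>
struct TaggedAlloc {
  typedef T value_type;
  typedef std::true_type propagate_on_container_copy_assignment;
  typedef std::false_type propagate_on_container_move_assignment;
  typedef std::false_type is_always_equal;

  int id;
  explicit TaggedAlloc(int i = 0) : id(i) {}
  template <typename U>
  TaggedAlloc(const TaggedAlloc<U>& other) : id(other.id) {}

  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id == b.id; }
template <typename T, typename U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return !(a == b); }

typedef std::basic_string<char, std::char_traits<char>, TaggedAlloc<char> > TaggedString;

// Move-assign between strings whose allocators compare unequal. Since
// POCMA is false the storage cannot be taken over, so the characters
// have to be deep-copied into the target.
inline TaggedString move_assign_unequal()
{
  TaggedString target("short", TaggedAlloc<char>(1));
  TaggedString source("a string long enough to live on the heap, not in-place",
                      TaggedAlloc<char>(2));
  target = std::move(source);
  return target;
}
```

With the fix, the returned string is expected to still use the
allocator with id 1. A library without the fix may report id 2, since
the POCCA check in assign propagates the source allocator.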
libstdc++-v3/ChangeLog:
PR libstdc++/116641
* include/bits/basic_string.h (operator=(basic_string&&)): Call
_M_assign instead of assign.
* testsuite/21_strings/basic_string/allocator/116641.cc: New
test.
The following testcase is miscompiled, because
get_member_function_from_ptrfunc
emits something like
(((FUNCTION.__pfn & 1) != 0)
? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
: FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
or so, so FUNCTION tree is used there 5 times. There is
if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
but in this case function doesn't have side-effects, just nested ARRAY_REFs.
Now, if all the FUNCTION trees were shared, it would work fine, since
FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
that isn't the case, both the BIT_AND_EXPR shortening and conversion to
bool done for build_conditional_expr actually unshare_expr that first
expression, but none of the other 4 are unshared. With -fsanitize=bounds,
.UBSAN_BOUNDS calls are added to the ARRAY_REFs and use save_expr to avoid
evaluating the argument multiple times, but because that FUNCTION tree is
first used in the second argument of COND_EXPR (i.e. conditionally), the
SAVE_EXPR initialization is done just there and then the third argument
of COND_EXPR just uses the uninitialized temporary and so does the first
argument computation as well.
The following patch fixes that by doing save_expr even if !TREE_SIDE_EFFECTS,
but to avoid doing that too often only if !nonvirtual and if the expression
isn't a simple decl.
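The shape of source code involved is roughly the following: a call
through a pointer to member function that is itself selected by nested
array references. The types and names are hypothetical and this is not
the committed testcase; the miscompilation also needs -fsanitize=bounds.

```cpp
#include <cassert>

// Hypothetical class hierarchy with one virtual and one non-virtual member.
struct Shape {
  virtual ~Shape() {}
  virtual int sides() const { return 0; }
  int id() const { return 7; }
};
struct Triangle : Shape {
  int sides() const override { return 3; }
};

typedef int (Shape::*Getter)() const;

struct Table {
  Getter entries[2][2];
};

// The callee expression `table.entries[row][col]` has no side effects,
// yet the front end needs its value several times (virtual-bit test,
// vtable lookup, this adjustment), so it should be evaluated only once.
inline int call_through_table(const Shape& s, const Table& table, int row, int col)
{
  assert(row >= 0 && row < 2 && col >= 0 && col < 2);
  return (s.*table.entries[row][col])();
}
```

Because the pointer may or may not refer to a virtual function, the
compiler has to emit the conditional shown above, and each use of the
array reference in it is a candidate for a bounds check.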
2024-09-10 Jakub Jelinek <jakub@redhat.com>
PR c++/116449
* typeck.cc (get_member_function_from_ptrfunc): Use save_expr
on instance_ptr and function even if it doesn't have side-effects,
as long as it isn't a decl.
* g++.dg/ubsan/pr116449.C: New test.
Since r15-3532-g7cebc6384a0ad6 18_support/new_nothrow.cc fails in C++98 mode because G++
diagnoses missing exception specifications for the user-defined
(de)allocation functions. Add throw(std::bad_alloc) and throw() for
C++98 mode.
Similarly, 26_numerics/headers/numeric/synopsis.cc fails in C++20 mode
because the declarations of gcd and lcm are not noexcept.
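A minimal sketch of dialect-dependent exception specifications is
shown below. The macro names follow the ChangeLog, but their
definitions and the helper functions checked_alloc and checked_free
are assumptions of this example, not the contents of the test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Pick the exception specification that matches the language dialect.
#if __cplusplus >= 201103L
# define THROW_BAD_ALLOC
# define NOEXCEPT noexcept
#else
# define THROW_BAD_ALLOC throw(std::bad_alloc)
# define NOEXCEPT throw()
#endif

// Hypothetical helpers standing in for user-defined (de)allocation
// functions.
inline void* checked_alloc(std::size_t size) THROW_BAD_ALLOC
{
  void* p = std::malloc(size ? size : 1);
  if (!p)
    throw std::bad_alloc();
  return p;
}

inline void checked_free(void* p) NOEXCEPT
{
  std::free(p);
}
```

In C++11 and later THROW_BAD_ALLOC expands to nothing, because dynamic
exception specifications are deprecated there and removed in C++17.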
libstdc++-v3/ChangeLog:
* testsuite/18_support/new_nothrow.cc (THROW_BAD_ALLOC): Define
macro to add exception specifications for C++98 mode.
(NOEXCEPT): Expand to throw() for C++98 mode.
* testsuite/26_numerics/headers/numeric/synopsis.cc (gcd, lcm):
Add noexcept.
Here we wrongly mark the reference temporary for g TREE_READONLY,
so it's put in .rodata and so we can't modify its subobject even
when the subobject is marked mutable. This is so since r9-869.
r14-1785 fixed a similar problem, but not in set_up_extended_ref_temp.
PR c++/116369
gcc/cp/ChangeLog:
* call.cc (set_up_extended_ref_temp): Don't mark a temporary
TREE_READONLY if its type is TYPE_HAS_MUTABLE_P.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt7.C: New test.
Permute nodes do not have a representative so we have to guard
vect_is_slp_load_node against those.
PR tree-optimization/116658
* tree-vect-slp.cc (vect_is_slp_load_node): Make sure
node isn't a permute.
* g++.dg/vect/pr116658.cc: New testcase.
Enable reporting an error when this new aspect/pragma is set to
True, and the sources are compiled without language extensions
allowed.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Call
Error_Msg_GNAT_Extension() to report an error when the aspect
First_Controlling_Parameter is set to True and the sources are
compiled without Core_Extensions_Allowed.
* sem_prag.adb (Pragma_First_Controlling_Parameter): Call
subprogram Error_Msg_GNAT_Extension() to report an error when the
aspect First_Controlling_Parameter is set to True and the sources
are compiled without Core_Extensions_Allowed. Report an error when
the aspect pragma does not confirm an inherited True value.
The total number of characters on a source code line
is different on Windows and Linux based systems
(CRLF vs LF endings). Use the last character that is not a
line terminator to adjust the printing of spans that extend
past the end of the line.
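The idea can be sketched as follows. This is C++ for illustration
only (the actual change is in Ada), and last_line_char_end is a
hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the index one past the last character of `line` that is not a
// line terminator, so "text\r\n" and "text\n" both give the length of
// "text". Hypothetical helper, not the Ada routine from the patch.
inline std::size_t last_line_char_end(const std::string& line)
{
  std::size_t end = line.size();
  while (end > 0 && (line[end - 1] == '\n' || line[end - 1] == '\r'))
    --end;
  return end;
}
```

Computing span lengths against this adjusted end gives the same result
on both kinds of system.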
gcc/ada/
* diagnostics-pretty_emitter.adb (Get_Last_Line_Char): New. Get
the last non line change character. Write_Span_Labels use the
adjusted line end pointer to calculate the length of the span.
When semantic checking mode is active, i.e. when switch -gnatc is
present or when the frontend is operating in the GNATprove mode,
we now rewrite calls to GNAT.Source_Info routines in evaluation
and not expansion (which is disabled in these modes).
This is needed to recognize constants initialized with calls to
GNAT.Source_Info as static constants, regardless of expansion being
enabled.
gcc/ada/
* exp_intr.ads, exp_intr.adb (Expand_Source_Info): Move
declaration to package spec.
* sem_eval.adb (Eval_Intrinsic_Call): Evaluate calls to
GNAT.Source_Info where possible.
When r14-303-gb9fedabe381cce was done, it was missed that some of the common parts could
be done in a template and a lambda could be used. This patch implements that. This new
function can be used later on to implement a simple ifcvt pass.
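The general pattern is a template driver that owns the shared walk and
takes the pass-specific work as a callable. The sketch below uses a toy
Block type and a hypothetical for_each_cond_block function; it is not
the code from tree-ssa-phiopt.cc.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a basic block; hypothetical, not GCC's basic_block.
struct Block {
  bool ends_in_cond;
  int id;
};

// Shared driver: walk the blocks, apply the common filtering once and
// hand each candidate to a pass-specific callable. Returns true if any
// call reported a change.
template <typename Func>
bool for_each_cond_block(const std::vector<Block>& blocks, Func func)
{
  bool changed = false;
  for (const Block& bb : blocks) {
    if (!bb.ends_in_cond)
      continue;            // common early-out shared by all callers
    changed |= func(bb);   // pass-specific work supplied as a lambda
  }
  return changed;
}
```

Each pass then supplies only its own lambda, and a new pass can reuse
the walk without copying it.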
gcc/ChangeLog:
* tree-ssa-phiopt.cc (execute_over_cond_phis): New template function,
moved the common parts from pass_phiopt::execute/pass_cselim::execute.
(pass_phiopt::execute): Move the function specific parts of the loop
into a lambda.
(pass_cselim::execute): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This converts the uses of PHI_RESULT in phiopt to be gimple_phi_result
instead. Since there was already a mismatch of uses here, it
is better to use the preferred one (gimple_phi_result) instead.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116643
gcc/ChangeLog:
* tree-ssa-phiopt.cc (replace_phi_edge_with_variable): s/PHI_RESULT/gimple_phi_result/.
(factor_out_conditional_operation): Likewise.
(minmax_replacement): Likewise.
(spaceship_replacement): Likewise.
(cond_store_replacement): Likewise.
(cond_if_else_store_replacement_1): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Fix the regression identified by bisection:
a51f2fc0d8 is the first bad commit
commit a51f2fc0d8
Author: liuhongt <hongtao.liu@intel.com>
Date: Wed Sep 4 15:39:17 2024 +0800
Handle const0_operand for *avx2_pcmp<mode>3_1.
caused
FAIL: gcc.target/i386/pr59539-1.c scan-assembler-times vmovdqu|vmovups 1
To reproduce:
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr59539-1.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr59539-1.c --target_board='unix{-m64\ -march=cascadelake}'"
gcc/ChangeLog:
* config/i386/sse.md (*avx2_pcmp<mode>3_1): Don't force_reg
operands[3] when it's not const0_rtx.
Use a new struct diagnostic_option_id rather than just "int" when
referring to command-line options controlling warnings in the
diagnostic subsystem.
No functional change intended, but better documents the meaning of
the code.
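As an illustration of the idea only, a thin wrapper around the integer
could look like this. The name option_id, its members and the helper
is_controlled_by_option are hypothetical and do not reproduce the
definition added to diagnostic-core.h.

```cpp
#include <cassert>

// Sketch of a thin wrapper type for option identifiers (hypothetical).
struct option_id {
  option_id() : index(0) {}           // 0: no controlling option, by this sketch's convention
  option_id(int idx) : index(idx) {}  // implicit, so existing int call sites still compile
  bool operator==(option_id other) const { return index == other.index; }
  int index;
};

// A parameter of type option_id documents what the integer means,
// which a bare int cannot do.
inline bool is_controlled_by_option(option_id id)
{
  return !(id == option_id());
}
```

Keeping the converting constructor implicit is one way to introduce
such a type without touching every caller at once.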
gcc/c-family/ChangeLog:
* c-common.cc (c_option_controlling_cpp_diagnostic): Return
diagnostic_option_id rather than int.
(c_cpp_diagnostic): Update for renaming of
diagnostic_override_option_index to diagnostic_set_option_id.
gcc/c/ChangeLog:
* c-errors.cc (pedwarn_c23): Use "diagnostic_option_id option_id"
rather than "int opt". Update for renaming of diagnostic_info
field.
(pedwarn_c11): Likewise.
(pedwarn_c99): Likewise.
(pedwarn_c90): Likewise.
* c-tree.h (pedwarn_c90): Likewise for decl.
(pedwarn_c99): Likewise.
(pedwarn_c11): Likewise.
(pedwarn_c23): Likewise.
gcc/cp/ChangeLog:
* constexpr.cc (constexpr_error): Update for renaming of
diagnostic_info field.
* cp-tree.h (pedwarn_cxx98): Use "diagnostic_option_id" rather
than "int".
* error.cc (cp_adjust_diagnostic_info): Update for renaming of
diagnostic_info field.
(pedwarn_cxx98): Use "diagnostic_option_id option_id" rather than
"int opt". Update for renaming of diagnostic_info field.
(diagnostic_set_info): Likewise.
gcc/d/ChangeLog:
* d-diagnostic.cc (d_diagnostic_report_diagnostic): Update for
renaming of diagnostic_info field.
gcc/ChangeLog:
* diagnostic-core.h (struct diagnostic_option_id): New.
(warning): Use it rather than "int" for param.
(warning_n): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(pedwarn): Likewise.
(permerror_opt): Likewise.
(emit_diagnostic): Likewise.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
* diagnostic-format-json.cc
(json_output_format::on_report_diagnostic): Update for renaming of
diagnostic_info field.
* diagnostic-format-sarif.cc (sarif_builder::make_result_object):
Likewise.
(make_reporting_descriptor_object_for_warning): Likewise.
* diagnostic-format-text.cc (print_option_information): Likewise.
* diagnostic-global-context.cc (emit_diagnostic): Use
"diagnostic_option_id option_id" rather than "int opt".
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist_meta): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_meta): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror_opt): Likewise.
* diagnostic.cc (diagnostic_set_info_translated): Update for
renaming of diagnostic_info field.
(diagnostic_option_classifier::classify_diagnostic): Use
"diagnostic_option_id option_id" rather than "int opt".
(update_effective_level_from_pragmas): Update for renaming of
diagnostic_info field.
(diagnostic_context::diagnostic_enabled): Likewise.
(diagnostic_context::warning_enabled_at): Use
"diagnostic_option_id option_id" rather than "int opt".
(diagnostic_context::diagnostic_impl): Likewise.
(diagnostic_context::diagnostic_n_impl): Likewise.
* diagnostic.h (diagnostic_info::diagnostic_info): Update for...
(diagnostic_info::option_index): Rename...
(diagnostic_info::option_id): ...to this.
(class diagnostic_option_manager): Use
"diagnostic_option_id option_id" rather than "int opt" for vfuncs.
(diagnostic_option_classifier): Likewise for member funcs.
(diagnostic_classification_change_t::option): Add comment.
(diagnostic_context::warning_enabled_at): Use
"diagnostic_option_id option_id" rather than "int option_index".
(diagnostic_context::option_unspecified_p): Likewise.
(diagnostic_context::classify_diagnostic): Likewise.
(diagnostic_context::option_enabled_p): Likewise.
(diagnostic_context::make_option_name): Likewise.
(diagnostic_context::make_option_url): Likewise.
(diagnostic_context::diagnostic_impl): Likewise.
(diagnostic_context::diagnostic_n_impl): Likewise.
(diagnostic_override_option_index): Rename...
(diagnostic_set_option_id): ...to this, and update for
diagnostic_info field renaming.
(diagnostic_classify_diagnostic): Use "diagnostic_option_id"
rather than "int".
(warning_enabled_at): Likewise.
(option_unspecified_p): Likewise.
gcc/fortran/ChangeLog:
* cpp.cc (cb_cpp_diagnostic_cpp_option): Convert return type from
"int" to "diagnostic_option_id".
(cb_cpp_diagnostic): Update for renaming of
diagnostic_override_option_index to diagnostic_set_option_id.
* error.cc (gfc_warning): Update for renaming of diagnostic_info
field.
(gfc_warning_now_at): Likewise.
(gfc_warning_now): Likewise.
(gfc_warning_internal): Likewise.
gcc/ChangeLog:
* ipa-pure-const.cc: Replace include of "opts.h" with
"opts-diagnostic.h".
(suggest_attribute): Convert param from int to
diagnostic_option_id.
* lto-wrapper.cc (class lto_diagnostic_option_manager): Use
diagnostic_option_id rather than "int".
* opts-common.cc
(compiler_diagnostic_option_manager::option_enabled_p): Likewise.
* opts-diagnostic.h (class gcc_diagnostic_option_manager):
Likewise.
(class compiler_diagnostic_option_manager): Likewise.
* opts.cc (compiler_diagnostic_option_manager::make_option_name):
Likewise.
(gcc_diagnostic_option_manager::make_option_url): Likewise.
* substring-locations.cc
(format_string_diagnostic_t::emit_warning_n_va): Likewise.
(format_string_diagnostic_t::emit_warning_va): Likewise.
(format_string_diagnostic_t::emit_warning): Likewise.
(format_string_diagnostic_t::emit_warning_n): Likewise.
* substring-locations.h
(format_string_diagnostic_t::emit_warning_va): Likewise.
(format_string_diagnostic_t::emit_warning_n_va): Likewise.
(format_string_diagnostic_t::emit_warning): Likewise.
(format_string_diagnostic_t::emit_warning_n): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
We were using
https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json
as the URL for the SARIF 2.1 schema, but this is now a 404.
Update it to the URL listed in the spec (§3.13.3 "$schema property"),
which is:
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json
and update the copy in
gcc/testsuite/lib/sarif-schema-2.1.0.json
used by the "verify-sarif-file" DejaGnu directive to the version found at
that latter URL; the sha256 sum changes
from: 2b19d2358baef0251d7d24e208d05ffabf1b2a3ab5e1b3a816066fc57fd4a7e8
to: c3b4bb2d6093897483348925aaa73af03b3e3f4bd4ca38cef26dcb4212a2682e
Doing so added a validation error on
c-c++-common/diagnostic-format-sarif-file-pr111700.c
for which we emit this textual output:
this-file-does-not-exist.c: warning: #warning message [-Wcpp]
with no line number, and these invalid SARIF regions within the
physical location of the warning:
"region": {"startColumn": 2,
"endColumn": 9},
"contextRegion": {}
This is due to this directive:
# 0 "this-file-does-not-exist.c"
with line number 0.
The patch fixes this by not creating regions that have startLine <= 0.
gcc/ChangeLog:
PR other/116603
* diagnostic-format-sarif.cc (SARIF_SCHEMA): Update URL.
(sarif_builder::maybe_make_region_object): Don't create regions
with startLine <= 0.
(sarif_builder::maybe_make_region_object_for_context): Likewise.
gcc/testsuite/ChangeLog:
PR other/116603
* gcc.dg/plugin/diagnostic-test-metadata-sarif.py (test_basics):
Update expected schema URL.
* gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py:
Likewise.
* gcc.dg/sarif-output/test-include-chain-1.py: Likewise.
* gcc.dg/sarif-output/test-include-chain-2.py: Likewise.
* gcc.dg/sarif-output/test-missing-semicolon.py: Likewise.
* gcc.dg/sarif-output/test-no-diagnostics.py: Likewise.
* gcc.dg/sarif-output/test-werror.py: Likewise.
* lib/sarif-schema-2.1.0.json: Update with copy downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Double-word memory operands are accessed as their high and low part, so the
memory location has to be offsettable. Use "o" constraint instead of "m"
for double-word memory operands.
gcc/ChangeLog:
* config/i386/i386.md (*insvdi_lowpart_1): Use "o" constraint
instead of "m" for double-word mode memory operands.
(*add<dwi>3_doubleword_zext): Ditto.
(*addv<dwi>4_doubleword_1): Use "jO" constraint instead of "jM"
for double-word mode memory operands.
I missed this in r15-1108-g70f26314b62e2d.
gcc/analyzer/ChangeLog:
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Drop unused
local "summary_cast_reg".
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This expands on optimizing `popcount(a) == 1` to also handle
`popcount(a) <= 1`. `<= 1` can be expanded as `(a & (a - 1)) == 0`,
the same expansion used for `== 1` when a is known to be nonzero.
We have to do the optimization in 2 places, depending on whether we
have an optab entry for popcount or not.
Built and tested for aarch64-linux-gnu.
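The bit trick can be checked in isolation. The sketch below assumes
the usual clear-lowest-set-bit form, `(a & (a - 1)) == 0`, and compares
it against a reference bit count; the helper names are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Reference: count set bits with std::bitset.
inline unsigned popcount32(std::uint32_t a)
{
  return static_cast<unsigned>(std::bitset<32>(a).count());
}

// popcount(a) <= 1: clearing the lowest set bit must leave nothing.
inline bool at_most_one_bit(std::uint32_t a)
{
  return (a & (a - 1)) == 0;
}

// popcount(a) == 1: the same test, but a must also be nonzero.
inline bool exactly_one_bit(std::uint32_t a)
{
  return a != 0 && (a & (a - 1)) == 0;
}
```

The only difference between the two predicates is the zero case, which
is why the `== 1` form needs a known-nonzero operand to use the
shorter sequence.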
PR middle-end/90693
gcc/ChangeLog:
* internal-fn.cc (expand_POPCOUNT): Handle the second argument
being `-1` for `<= 1`.
* tree-ssa-math-opts.cc (match_single_bit_test): Handle LE/GT
cases.
(math_opts_dom_walker::after_dom_children): Call match_single_bit_test
for LE_EXPR/GT_EXPR also.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt-le-1.c: New test.
* gcc.target/aarch64/popcnt-le-2.c: New test.
* gcc.target/aarch64/popcnt-le-3.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
pa_print_operand handles both operand orders for scaled index
addresses, so it isn't necessary to canonicalize the order of
operands.
2024-09-09 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.cc (pa_legitimate_address_p): Don't
canonicalize operand order of scaled index addresses.
When evaluating the difference of two aligned pointers in CCP we
fail to handle the EXACT_DIV_EXPR by the element size that occurs.
The testcase then also exercises modulo to test alignment but
modulo by a power-of-two isn't handled either.
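A much simplified model of the reasoning tracks only how many low bits
of a value are known to be zero. CCP's bit lattice is more general;
KnownZeros and the functions below are hypothetical and only
illustrate why exact division and modulo by a power of two are worth
handling.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model: number of low bits known to be zero.
struct KnownZeros {
  unsigned low_zero_bits;
};

// a - b keeps every low zero bit that both operands share.
inline KnownZeros subtract(KnownZeros a, KnownZeros b)
{
  return KnownZeros{std::min(a.low_zero_bits, b.low_zero_bits)};
}

// Exact division by 2^log2 behaves like a right shift by log2. The
// division is only exact if those low bits are zero to begin with.
inline KnownZeros exact_div_pow2(KnownZeros x, unsigned log2)
{
  assert(x.low_zero_bits >= log2);
  return KnownZeros{x.low_zero_bits - log2};
}

// x % 2^log2 is known to be zero when at least log2 low bits are zero.
inline bool mod_pow2_is_zero(KnownZeros x, unsigned log2)
{
  return x.low_zero_bits >= log2;
}
```

For two 16-byte aligned pointers to 4-byte elements, the byte
difference has four known zero bits, the element difference has two,
and so the element difference modulo 4 is known to be zero.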
PR tree-optimization/116514
* tree-ssa-ccp.cc (bit_value_binop): Handle EXACT_DIV_EXPR
like TRUNC_DIV_EXPR. Handle exact division of a signed value
by a power-of-two like a shift. Handle unsigned division by
a power-of-two like a shift.
Handle unsigned TRUNC_MOD_EXPR by power-of-two, handle signed
TRUNC_MOD_EXPR by power-of-two if the result is zero.
* gcc.dg/tree-ssa/ssa-ccp-44.c: New testcase.
The following avoids classifying a double reduction that's not
actually a reduction in the outer loop (because its value isn't
used outside of the outer loop). This avoids us ICEing on the
unexpected stmt/SLP node arrangement.
PR tree-optimization/116647
* tree-vect-loop.cc (vect_is_simple_reduction): Add missing
check to double reduction detection.
* gcc.dg/torture/pr116647.c: New testcase.
* gcc.dg/vect/no-scevccp-pr86725-2.c: Adjust expected pattern.
* gcc.dg/vect/no-scevccp-pr86725-4.c: Likewise.
I noticed this folding inside fab could be done elsewhere and could
even improve inlining decisions and a few other things so let's
move it to fold_stmt.
It also fixes PR 116601 because places which call fold_stmt already
have to deal with the stmt becoming a non-throw statement.
The fix for PR 116601 on the branches should be the original patch
rather than a backport of this one.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/116601
gcc/ChangeLog:
* gimple-fold.cc (optimize_memcpy_to_memset): Move
from tree-ssa-ccp.cc and rename. Also return true
if the optimization happened.
(gimple_fold_builtin_memory_op): Call
optimize_memcpy_to_memset.
(fold_stmt_1): Call optimize_memcpy_to_memset for
load/store copies.
* tree-ssa-ccp.cc (optimize_memcpy): Delete.
(pass_fold_builtins::execute): Remove code that
calls optimize_memcpy.
gcc/testsuite/ChangeLog:
* gcc.dg/pr78408-1.c: Adjust dump scan to match where
the optimization now happens.
* g++.dg/torture/except-2.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
There was a reported regression on x86-64 with -march=cascadelake
and -m32 where epilogue vectorization causes a different number of
SLPed loops. Fixed by disabling epilogue vectorization for the
testcase.
* gcc.dg/vect/fast-math-vect-call-2.c: Disable epilogue
vectorization.
The test as committed without the tree-vrp.cc change only FAILs with
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
The DEBUG code in there was just to make it easier to debug, but doesn't
actually fail when the test is miscompiled.
We don't need such debugging code in simple tests like that, but it is
useful if they abort when miscompiled.
With this patch without the tree-vrp.cc change I see
FAIL: gcc.dg/pr116588.c execution test
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
and with it it passes.
2024-09-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/116588
* gcc.dg/pr116588.c: Remove -DDEBUG from dg-options.
(main): Remove debugging code and simplify.
Fix up to make 'gcc.dg/opt-ordered-and-nonequal-1.c' of
commit 91421e21e8
"Match: Fix ordered and nonequal" work for default
'LOGICAL_OP_NON_SHORT_CIRCUIT == false' configurations.
PR testsuite/116635
gcc/testsuite/
* gcc.dg/opt-ordered-and-nonequal-1.c: Fix re
'LOGICAL_OP_NON_SHORT_CIRCUIT'.
This small cleanup removes a redundant check for gimple_assign_cast_p and
reformats the code accordingly. It also inverts the if statement that checks
for an integral type and whether the constant fits into the new type, so that
it returns null early, and reformats based on that.
It also moves the has_single_use check earlier, since it is less complex and
cheaper than some of the others (like the check on the integer side).
This was noticed when adding a few new things to factor_out_conditional_operation
but those are not ready to submit yet.
Note there is no functional difference with this change.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (factor_out_conditional_operation): Move the has_single_use
checks much earlier. Remove redundant check for gimple_assign_cast_p.
Change around the check if the integral consts fits into the new type.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch adds the recently aliased CPU names to the documentation
for clarity.
gcc/ChangeLog:
PR target/116617
* doc/invoke.texi: Add meteorlake, raptorlake and lunarlake.