This patch is to add test cases to check if vectorizer
can exploit vector multiply instrutions on Power, some
of them are supported since Power8, the others are newly
introduced by Power10.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/mul-vectorize-1.c: New test.
* gcc.target/powerpc/mul-vectorize-2.c: New test.
It sounds plausible that this assert
int f();
static_assert(noexcept(sizeof(f())));
should pass: sizeof produces a std::size_t and its operand is not
evaluated, so it can't throw. noexcept should only evaluate to
false for potentially evaluated operands. Therefore I think that
check_noexcept_r shouldn't walk into operands of sizeof/decltype/
alignof/typeof.
PR c++/101087
gcc/cp/ChangeLog:
* cp-tree.h (unevaluated_p): New.
* except.c (check_noexcept_r): Use it. Don't walk into
unevaluated operands.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept70.C: New test.
The "new" IPA-SRA has a more difficult job than the previous
not-truly-IPA version when identifying situations in which a parameter
passed by reference can be passed into a third function and only thee
converted to one passed by value (and possibly "split" at the same
time).
In order to allow this, two conditions must be fulfilled. First the
call to the third function must happen before any modifications of
memory, because it could change the value passed by reference.
Second, in order to make sure we do not introduce new (invalid)
dereferences, the call must postdominate the entry BB.
The second condition is actually not necessary if the caller function
is also certain to dereference the pointer but the first one must
still hold. Unfortunately, the code making this overriding decision
also happen to trigger when the first condition is not fulfilled.
This is fixed in the following patch.
gcc/ChangeLog:
2021-06-16 Martin Jambor <mjambor@suse.cz>
PR ipa/101066
* ipa-sra.c (class isra_call_summary): New member
m_before_any_store, initialize it in the constructor.
(isra_call_summary::dump): Dump the new field.
(ipa_sra_call_summaries::duplicate): Copy it.
(process_scan_results): Set it.
(isra_write_edge_summary): Stream it.
(isra_read_edge_summary): Likewise.
(param_splitting_across_edge): Only override
safe_to_import_accesses if m_before_any_store is set.
gcc/testsuite/ChangeLog:
2021-06-16 Martin Jambor <mjambor@suse.cz>
PR ipa/101066
* gcc.dg/ipa/pr101066.c: New test.
I've noticed that overriding cpu/arch flags when running the testsuite
can cause this test to fail rather than being skipped because of
incompatible flags combination.
Since the test forces -march=armv7-a, make sure it is accepted in
combination with the current runtestflags.
2021-07-08 Christophe Lyon <christophe.lyon@foss.st.om>
gcc/testsuite/
* gcc.dg/debug/pr57351.c: Require arm_arch_v7a_ok
effective-target.
gcc/ada/
* sem_ch5.adb (Analyze_Target_Name): Properly reject a
Target_Name when it appears outside of an assignment statement,
or within the left-hand side of one.
gcc/ada/
* exp_dbug.adb (Get_Encoded_Name): Do not encode names of discrete
types with custom bounds, except with -fgnat-encodings=all.
* exp_pakd.adb (Create_Packed_Array_Impl_Type): Adjust comment.
gcc/ada/
* comperr.adb (Compiler_Abort): Call Sinput.Unlock, because if
this is called late, then Source_Dump would crash otherwise.
* debug.adb: Correct documentation of the -gnatd.9 switch.
* exp_ch4.adb (Expand_Allocator_Expression): Add a comment.
* exp_ch6.adb: Minor comment fixes. Add assertion.
* exp_ch6.ads (Is_Build_In_Place_Result_Type): Correct comment.
* exp_ch7.adb, checks.ads: Minor comment fixes.
gcc/ada/
* par-ch5.adb (P_Iterator_Specification): Add support for access
definition in loop parameter.
* sem_ch5.adb (Check_Subtype_Indication): Renamed...
(Check_Subtype_Definition): ... into this and check for conformance
on access definitions, and improve error messages.
(Analyze_Iterator_Specification): Add support for access definition
in loop parameter.
gcc/ada/
* exp_imgv.adb: Add with and use clause for Restrict and Rident.
(Build_Enumeration_Image_Tables): Do not generate the hash function
if the No_Implicit_Loops restriction is active.
gcc/ada/
* exp_prag.adb (Expand_Pragma_Inspection_Point): After expansion
of the Inspection_Point pragma, check if referenced entities
that have a freeze node are already frozen. If they aren't, emit
a warning and turn the pragma into a no-op.
gcc/ada/
* rtsfind.ads, rtsfind.adb: Add support for finding the packages
System.Atomic_Operations and
System.Atomic_Operations.Test_And_Set and the declarations
within that latter package of the type Test_And_Set_Flag and the
function Atomic_Test_And_Set.
* exp_ch11.adb (Expand_N_Exception_Declaration): If an exception
is declared other than at library level, then we need to call
Register_Exception the first time (and only the first time) the
declaration is elaborated. In order to decide whether to
perform this call for a given elaboration of the declaration, we
used to unconditionally use a (library-level) Boolean variable.
Now we instead use a variable of type
System.Atomic_Operations.Test_And_Set.Test_And_Set_Flag unless
either that type is unavailable or a No_Tasking restriction is
in effect (in which case we use a Boolean variable as before).
gcc/ada/
* libgnat/a-cohama.ads: Introduce an equality operator over
cursors.
* libgnat/a-cohase.ads: Ditto.
* libgnat/a-cohama.adb: Add body for "=" over cursors.
(Insert): Do not set the Position component of the cursor that
denotes the inserted element.
* libgnat/a-cohase.adb: Ditto.
vectorizable_reduction had code guarded by:
if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
|| STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
But that's always true after:
if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_reduction_def
&& STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def
&& STMT_VINFO_DEF_TYPE (stmt_info) != vect_nested_cycle)
return false;
if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
{
…
return true;
}
(I wasn't sure at first how the empty “else” for the first “if” above
was supposed to work.)
gcc/
* tree-vect-loop.c (vectorizable_reduction): Remove always-true
if condition.
match.pd has a rule to simplify an extension, operation and truncation
back to the original type:
(simplify
(convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2)))
Currently it handles cases in which @2 is an INTEGER_CST, but it
also works for POLY_INT_CSTs.[*]
For INTEGER_CST it doesn't matter whether we test @2 or @4,
but for POLY_INT_CST it is possible to have unfolded (convert …)s.
Originally I saw this leading to some bad ivopts decisions, because
we weren't folding away redundancies from candidate iv expressions.
It's also possible to test the fold directly using the SVE ACLE.
[*] Not all INTEGER_CST rules work for POLY_INT_CSTs, since extensions
don't necessarily distribute over the internals of the POLY_INT_CST.
But in this case that isn't an issue.
gcc/
* match.pd: Simplify an extend-operate-truncate sequence involving
a POLY_INT_CST.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/cntb_1.c: New test.
All of the optimizations/transformations mentioned in bugzilla for
PR tree-optimization/40210 are already implemented in mainline GCC,
with one exception. In comment #5, there's a suggestion that
(bswap64(x)>>56)&0xff can be implemented without the bswap as
(unsigned char)x, or equivalently x&0xff.
This patch implements the above optimization, and closely related
variants. For any single bit, (bswap(X)>>C1)&1 can be simplified
to (X>>C2)&1, where bit position C2 is the appropriate permutation
of C1. Similarly, the bswap can eliminated if the desired set of
bits all lie within the same byte, hence (bswap(x)>>8)&255 can
always be optimized, as can (bswap(x)>>8)&123.
Previously,
int foo(long long x) {
return (__builtin_bswap64(x) >> 56) & 0xff;
}
compiled with -O2 to
foo: movq %rdi, %rax
bswap %rax
shrq $56, %rax
ret
with this patch, it now compiles to
foo: movzbl %dil, %eax
ret
2021-07-08 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR tree-optimization/40210
* match.pd (bswap optimizations): Simplify (bswap(x)>>C1)&C2 as
(x>>C3)&C2 when possible. Simplify bswap(x)>>C1 as ((T)x)>>C2
when possible. Simplify bswap(x)&C1 as (x>>C2)&C1 when 0<=C1<=255.
gcc/testsuite/ChangeLog
PR tree-optimization/40210
* gcc.dg/builtin-bswap-13.c: New test.
* gcc.dg/builtin-bswap-14.c: New test.
This patch adds support for the VDIVSQ, VDIVUQ, VMODSQ, and VMODUQ
instructions to do 128-bit arithmetic.
2021-07-07 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/100809
* config/rs6000/rs6000.md (udivti3): New insn.
(divti3): New insn.
(umodti3): New insn.
(modti3): New insn.
gcc/testsuite/
PR target/100809
* gcc.target/powerpc/p10-vdivq-vmodq.c: New test.
I'm working on reimplementing -Wanalyzer-use-of-uninitialized-value, but
I ran into issues with
region_model::add_any_constraints_from_ssa_def_stmt.
This function is from the initial commit of the analyzer and walks the
SSA names finding conditions that were missed due to the GCC 10 era
region_model not retaining useful information on how values were
created; as of GCC 11 the symbolic values contain this information,
and so the conditions can be reconstructed from them instead.
region_model::add_any_constraints_from_ssa_def_stmt is a liability
when tracking uninitialized values as it requires looking up SSA
values when those values may have been purged, thus greatly complicating
detection of uses of uninitialized values.
It's simplest to eliminate it and reimplement the condition-finding
via the makeup of the svalues, which this patch does. Doing so requires
supporting add_condition on svalues rather than just on trees, which
requires some changes to ana::state_machine and its subclasses.
gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (null_assignment_sm_context::get_state):
New overload.
(null_assignment_sm_context::set_next_state): New overload.
(null_assignment_sm_context::get_diagnostic_tree): New.
* engine.cc (impl_sm_context::get_state): New overload.
(impl_sm_context::set_next_state): New overload.
(impl_sm_context::get_diagnostic_tree): New overload.
(impl_region_model_context::on_condition): Convert params from
tree to const svalue *.
* exploded-graph.h (impl_region_model_context::on_condition):
Likewise.
* region-model.cc (region_model::on_call_pre): Move handling of
internal calls to before checking for get_fndecl_for_call.
(region_model::add_constraints_from_binop): New.
(region_model::add_constraint): Split out into a new overload
working on const svalue * rather than tree. Call
add_constraints_from_binop. Drop call to
add_any_constraints_from_ssa_def_stmt.
(region_model::add_any_constraints_from_ssa_def_stmt): Delete.
(region_model::add_any_constraints_from_gassign): Delete.
(region_model::add_any_constraints_from_gcall): Delete.
* region-model.h
(region_model::add_any_constraints_from_ssa_def_stmt): Delete.
(region_model::add_any_constraints_from_gassign): Delete.
(region_model::add_any_constraints_from_gcall): Delete.
(region_model::add_constraint): Add overload decl.
(region_model::add_constraints_from_binop): New decl.
(region_model_context::on_condition): Convert params from tree to
const svalue *.
(noop_region_model_context::on_condition): Likewise.
* sm-file.cc (fileptr_state_machine::condition): Likewise.
* sm-malloc.cc (malloc_state_machine::on_condition): Likewise.
* sm-pattern-test.cc: Include tristate.h, selftest.h,
analyzer/call-string.h, analyzer/program-point.h,
analyzer/store.h, and analyzer/region-model.h.
(pattern_test_state_machine::on_condition): Convert params from tree to
const svalue *.
* sm-sensitive.cc (sensitive_state_machine::on_condition): Delete.
* sm-signal.cc (signal_state_machine::on_condition): Delete.
* sm-taint.cc (taint_state_machine::on_condition): Convert params
from tree to const svalue *.
* sm.cc: Include tristate.h, selftest.h, analyzer/call-string.h,
analyzer/program-point.h, analyzer/store.h, and
analyzer/region-model.h.
(any_pointer_p): Add overload taking const svalue *sval.
* sm.h (any_pointer_p): Add overload taking const svalue *sval.
(state_machine::on_condition): Convert params from tree to
const svalue *. Provide no-op default implementation.
(sm_context::get_state): Add overload taking const svalue *sval.
(sm_context::set_next_state): Likewise.
(sm_context::on_transition): Likewise.
(sm_context::get_diagnostic_tree): Likewise.
* svalue.cc (svalue::all_zeroes_p): New.
(constant_svalue::all_zeroes_p): New.
(repeated_svalue::all_zeroes_p): Convert to vfunc.
* svalue.h (svalue::all_zeroes_p): New decl.
(constant_svalue::all_zeroes_p): New decl.
(repeated_svalue::all_zeroes_p): Convert decl to vfunc.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/pattern-test-2.c: Update expected results.
* gcc.dg/plugin/analyzer_gil_plugin.c
(gil_state_machine::on_condition): Remove.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The previous MMA patch added some fragile code to initialize its new
built-ins. This patch hardens the initialization.
2021-07-07 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.c (mma_init_builtins): Use VSX_BUILTIN_LXVP
and VSX_BUILTIN_STXVP.