Commit Graph

191794 Commits

Author SHA1 Message Date
Jakub Jelinek
758671b88b match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]
We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.
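
A minimal sketch of the integer identity behind the two rewrites, assuming
two's complement int as GCC provides.  The helper names are hypothetical and
not taken from the patch.  For _Complex operands the rewritten form has no
valid meaning, since ~ is parsed as conjugation there.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the "-A - 1" form.  Negating INT_MIN overflows,
   so that single input is excluded here.  */
int neg_minus_one(int a)
{
  assert(a != INT_MIN);
  return -a - 1;
}

/* Hypothetical helper: the "-1 - A" form.  This never overflows for int.  */
int minus_one_minus(int a)
{
  return -1 - a;
}

/* The bitwise form that both expressions simplify to for integer types.  */
int bit_not(int a)
{
  return ~a;
}
```

All three helpers agree on every int except the excluded INT_MIN case of the
first one.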

2022-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/104675
	* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
	COMPLEX_TYPE.

	* gcc.dg/pr104675-1.c: New test.
	* gcc.dg/pr104675-2.c: New test.
2022-02-25 10:55:17 +01:00
Alexandre Oliva
a9e2ebe839 Revert commit r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb
The patch for PR103302 caused PR104121, and extended the live ranges
of LRA reloads.


for gcc/ChangeLog

	PR target/104121
	PR target/103302
	* expr.cc (emit_move_multi_word): Restore clobbers during LRA.
2022-02-24 22:16:56 -03:00
Alexandre Oliva
33c7df5854 Add testcase from PR103845
This problem was already fixed as part of PR104263: the abnormal edge
that remained from before inlining didn't make sense after inlining.
So this patch adds only the testcase.


for  gcc/testsuite/ChangeLog

	PR tree-optimization/103845
	PR tree-optimization/104263
	* gcc.dg/pr103845.c: New.
2022-02-24 22:16:56 -03:00
Alexandre Oliva
a026b67f8f Cope with NULL dw_cfi_cfa_loc
In def_cfa_0, we may set the 2nd operand's dw_cfi_cfa_loc to NULL, but
then cfi_oprnd_equal_p calls cfa_equal_p with a NULL dw_cfa_location*.
This patch arranges for us to tolerate NULL dw_cfi_cfa_loc.


for  gcc/ChangeLog

	PR middle-end/104540
	* dwarf2cfi.cc (cfi_oprnd_equal_p): Cope with NULL
	dw_cfi_cfa_loc.

for  gcc/testsuite/ChangeLog

	PR middle-end/104540
	* g++.dg/pr104540.C: New.
2022-02-24 22:03:34 -03:00
Alexandre Oliva
e53bb1965d Copy EH phi args for throwing hardened compares
When we duplicate a throwing compare for hardening, the EH edge from
the original compare gets duplicated for the inverted compare, but we
failed to adjust any PHI nodes in the EH block.  This patch adds the
needed adjustment, copying the PHI args from those of the preexisting
edge.


for  gcc/ChangeLog

	PR tree-optimization/103856
	* gimple-harden-conditionals.cc (non_eh_succ_edge): Enable the
	eh edge to be requested through an extra parameter.
	(pass_harden_compares::execute): Copy PHI args in the EH dest
	block for the new EH edge added for the inverted compare.

for  gcc/testsuite/ChangeLog

	PR tree-optimization/103856
	* g++.dg/pr103856.C: New.
2022-02-24 22:03:32 -03:00
GCC Administrator
756a61851c Daily bump. 2022-02-25 00:16:20 +00:00
Jonathan Wakely
41cbcf53dc libstdc++: Fix cast in source_location::current() [PR104602]
This fixes a problem for Clang, which is going to return a non-void
pointer from __builtin_source_location(). The current definition of
std::source_location::current() converts that to void* and then has to
cast it back again in the body (which makes it invalid in a constant
expression). By using the actual type of the returned pointer, we avoid
the problematic cast for Clang.

libstdc++-v3/ChangeLog:

	PR libstdc++/104602
	* include/std/source_location (source_location::current): Use
	deduced type of __builtin_source_location().
2022-02-24 23:42:41 +00:00
Pat Haugen
ae3c4e521d Fix attr-retain-* testcases for 32-bit PowerPC.
PR testsuite/100407

gcc/testsuite/
	* gcc.c-torture/compile/attr-retain-1.c: Add -G0 for 32-bit PowerPC.
	* gcc.c-torture/compile/attr-retain-2.c: Likewise.
2022-02-24 15:33:42 -06:00
Harald Anlauf
916b809fbf Fortran: frontend code for F2018 QUIET specifier to STOP and ERROR STOP
Fortran 2018 allows for a QUIET specifier to the STOP and ERROR STOP
statements.  Although the gfortran library code has supported this
specifier for quite some time, the frontend implementation was missing.

gcc/fortran/ChangeLog:

	PR fortran/84519
	* dump-parse-tree.cc (show_code_node): Dump QUIET specifier when
	present.
	* match.cc (gfc_match_stopcode): Implement parsing of F2018 QUIET
	specifier.  F2018 stopcodes may have non-default integer kind.
	* resolve.cc (gfc_resolve_code): Add checks for QUIET argument.
	* trans-stmt.cc (gfc_trans_stop): Pass QUIET specifier to call of
	library function.

gcc/testsuite/ChangeLog:

	PR fortran/84519
	* gfortran.dg/stop_1.f90: New test.
	* gfortran.dg/stop_2.f: New test.
	* gfortran.dg/stop_3.f90: New test.
	* gfortran.dg/stop_4.f90: New test.
2022-02-24 20:38:13 +01:00
Palmer Dabbelt
8645370af1 RISC-V: Document the degree of position independence that medany affords
The code generated by -mcmodel=medany is defined to be
position-independent, but is not guaranteed to function correctly when
linked into position-independent executables or libraries.  See the
recent discussion at the psABI specification [1] for more details.

It would be better to reject these invalid sequences when linking, but
as pointed out in a recent LD bug [2] there may be some compatibility
issues related to the PCREL_HI20 relocations used to initialize GP.
Given the complexity here it's unlikely we'll be able to reject these
sequences any time soon, so instead just document that these may not
work.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=28789

gcc/ChangeLog:

	* doc/invoke.texi (RISC-V -mcmodel=medany): Document the degree
	of position independence that -mcmodel=medany affords.

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-24 11:29:43 -08:00
Xi Ruoyao
157cc4e011 libgcc: fix a warning calling find_fde_tail
The third parameter of find_fde_tail is an _Unwind_Ptr (which is an
integer type instead of a pointer), but we are passing NULL to it.  This
causes a -Wint-conversion warning.
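
A minimal sketch of the situation, using a hypothetical stand-in typedef and
functions rather than libgcc's real declarations.  It assumes, as is usual
for C, that NULL may be defined as ((void *) 0).

```c
#include <stdint.h>

/* Hypothetical stand-in for _Unwind_Ptr: an unsigned integer wide enough
   to hold an address, not a pointer type.  */
typedef uintptr_t unwind_ptr_t;

/* Hypothetical stand-in for a function with such a parameter.  */
int is_unset(unwind_ptr_t value)
{
  return value == 0;
}

int call_with_zero(void)
{
  /* Writing is_unset (NULL) here would pass a pointer constant where an
     integer is expected, which is what -Wint-conversion reports.  A plain
     0 is already an integer, so no conversion is involved.  */
  return is_unset(0);
}
```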

libgcc/

	* unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call find_fde_tail
	with 0 instead of NULL.
2022-02-25 03:10:37 +08:00
Martin Liska
029851fe70 Fix clang warning in pt.cc
Fixes:

gcc/cp/pt.cc:13755:23: warning: suggest braces around initialization of subobject [-Wmissing-braces]
  tree_vec_map in = { fn, nullptr };

gcc/cp/ChangeLog:

	* pt.cc (defarg_insts_for): Use braces for subobject.
2022-02-24 16:59:01 +01:00
Jose E. Marchesi
39be73d07b bpf: do not --enable-gcov for bpf-*-* targets
This patch changes the build machinery to disable the build
of GCOV (both compiler and libgcc) for bpf-*-* targets.  The reason for
this change is that BPF is (currently) too restricted to support
coverage instrumentation.

Tested in bpf-unknown-none and x86_64-linux-gnu targets.

2022-02-23  Jose E. Marchesi  <jose.marchesi@oracle.com>

gcc/ChangeLog

	PR target/104656
	* configure.ac: --disable-gcov if targeting bpf-*.
	* configure: Regenerate.

libgcc/ChangeLog

	PR target/104656
	* configure.ac: --disable-gcov if targeting bpf-*.
	* configure: Regenerate.
2022-02-24 16:43:57 +01:00
Richard Biener
a4066d3a50 tree-optimization/104676 - free nb_iterations after loop distribution
Loop distribution can release SSA names used in nb_iterations, so make
sure to release those.

2022-02-24  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104676
	* tree-loop-distribution.cc (loop_distribution::execute):
	Do a full scev_reset.

	* gcc.dg/torture/pr104676.c: New testcase.
2022-02-24 15:57:55 +01:00
Jakub Jelinek
9251b457eb sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]
The following testcase is miscompiled, because -fipa-pure-const discovers
that bar is const, but when sccvn during fre3 sees
  # .MEM_140 = VDEF <.MEM_96>
  *__pred$__d_43 = _50 (_49);
where _50 value numbers to &bar, it value numbers .MEM_140 to
vuse_ssa_val (gimple_vuse (stmt)).  For const/pure calls that return
a SSA_NAME (or don't have lhs) that is fine, those calls don't store
anything, but if the lhs is present and not an SSA_NAME, value numbering
the vdef to anything but itself means that e.g. walk_non_aliased_vuses
won't consider the call, but the call acts as a store to its lhs.
When it is ignored, sccvn will return whatever has been stored to the
lhs earlier.

I've bootstrapped/regtested an earlier version of this patch, which did the
if (!lhs && gimple_call_lhs (stmt))
  changed |= set_ssa_val_to (vdef, vdef);
part before else if (vnresult->result_vdef), and that regressed
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo \\\\(" 1
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 \\\\(" 1
so this updated patch uses result_vdef there as before and only otherwise
(which I think must be the const/pure case) decides based on whether the
lhs is non-SSA_NAME.

2022-02-24  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/104601
	* tree-ssa-sccvn.cc (visit_reference_op_call): For calls with
	non-SSA_NAME lhs value number vdef to itself instead of e.g. the
	vuse value number.

	* g++.dg/torture/pr104601.C: New test.
2022-02-24 15:29:02 +01:00
Tom de Vries
59b8ade887 [libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c
Add openmp test-cases that test the omp declare variant construct:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
using the available nvptx isas.

Only the one for sm_30 is a dg-do run test-case, the other ones are dg-do
link.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-02-24  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.c/declare-variant-3-sm30.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm35.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm53.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm70.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm75.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm80.c: New test.
	* testsuite/libgomp.c/declare-variant-3.h: New header file.
2022-02-24 11:41:03 +01:00
Tom de Vries
a046033ea0 [nvptx] Add missing t-omp-device isas
In t-omp-device we list isas that can be used in omp declare variant like so:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
and in nvptx_omp_device_kind_arch_isa we handle them.

Update both to reflect the current list of isas.

Tested on x86_64-linux with nvptx accelerator.

gcc/ChangeLog:

2022-02-23  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Handle
	sm_70, sm_75 and sm_80.
	* config/nvptx/t-omp-device: Add sm_53, sm_70, sm_75 and sm_80.

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
2022-02-24 09:19:01 +01:00
Tom de Vries
c982d02ffe [nvptx] Add shf.{l,r}.wrap insn
Ptx contains funnel shift operations shf.l.wrap and shf.r.wrap that can be
used to implement 32-bit left or right rotate.
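
A minimal C model of how a wrapping funnel shift yields a rotate when both
inputs are the same register.  The function names are hypothetical, and the
model follows the PTX ISA description of shf as an assumption; it is not
taken from nvptx.md.

```c
#include <stdint.h>

/* Model of shf.l.wrap: shift the 64-bit value hi:lo left by n mod 32
   and keep the upper 32 bits.  */
uint32_t funnel_shl_wrap(uint32_t lo, uint32_t hi, uint32_t n)
{
  uint64_t wide = ((uint64_t)hi << 32) | lo;
  return (uint32_t)((wide << (n & 31)) >> 32);
}

/* Model of shf.r.wrap: shift the 64-bit value hi:lo right by n mod 32
   and keep the lower 32 bits.  */
uint32_t funnel_shr_wrap(uint32_t lo, uint32_t hi, uint32_t n)
{
  uint64_t wide = ((uint64_t)hi << 32) | lo;
  return (uint32_t)(wide >> (n & 31));
}

/* Feeding the same value as both halves turns the funnel shift into a
   32-bit rotate.  */
uint32_t rotl32(uint32_t x, uint32_t n) { return funnel_shl_wrap(x, x, n); }
uint32_t rotr32(uint32_t x, uint32_t n) { return funnel_shr_wrap(x, x, n); }
```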

Add define_insns rotlsi3 and rotrsi3.

Tested on nvptx.

gcc/ChangeLog:

2022-02-23  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.md (define_insn "rotlsi3", define_insn
	"rotrsi3"): New define_insn.

gcc/testsuite/ChangeLog:

2022-02-23  Tom de Vries  <tdevries@suse.de>

	* gcc.target/nvptx/rotate-run.c: New test.
	* gcc.target/nvptx/rotate.c: New test.
2022-02-24 09:18:47 +01:00
Tom de Vries
7862f6ccd8 [nvptx] Fix dummy location in gen_comment
I committed "[nvptx] Add -mptx-comment", but tested it in combination with the
proposed "[final] Handle compiler-generated asm insn" (
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590721.html ), so
by itself the commit introduced some regressions:
...
FAIL: gcc.dg/20020426-2.c (internal compiler error: Segmentation fault)
FAIL: gcc.dg/analyzer/zlib-3.c (internal compiler error: Segmentation fault)
FAIL: gcc.dg/pr101223.c (internal compiler error: Segmentation fault)
FAIL: gcc.dg/torture/pr80764.c   -O2  (internal compiler error: Segmentation fault)
...

These are due to cfun->function_start_locus == 0.

Fix these by using DECL_SOURCE_LOCATION (cfun->decl) instead.

Tested on nvptx.

gcc/ChangeLog:

2022-02-23  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.cc (gen_comment): Use
	DECL_SOURCE_LOCATION (cfun->decl) instead of cfun->function_start_locus.
2022-02-24 09:17:27 +01:00
liuhongt
ffb2c67170 Fix typo in <code>v1ti3.
For EVEX-encoded vp{xor,or,and}, a suffix is needed.

Otherwise there would be an error for
vpxor %xmm0, %xmm31, %xmm1

Error: unsupported instruction `vpxor'

gcc/ChangeLog:

	* config/i386/sse.md (<code>v1ti3): Add suffix and replace
	isa attr of alternative 2 from avx to avx512vl.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512vl-logicsuffix-1.c: New test.
2022-02-24 09:05:10 +08:00
GCC Administrator
4bf3bac151 Daily bump. 2022-02-24 00:16:22 +00:00
David Malcolm
aee1adf2cd analyzer: handle __attribute__((const)) [PR104434]
When testing -fanalyzer on openblas-0.3, I noticed slightly over 2000
false positives from -Wanalyzer-malloc-leak on code like this:

        if( LAPACKE_lsame( vect, 'b' ) || LAPACKE_lsame( vect, 'p' ) ) {
            pt_t = (lapack_complex_float*)
                LAPACKE_malloc( sizeof(lapack_complex_float) *
                                ldpt_t * MAX(1,n) );
            [...snip...]
        }

        [...snip lots of code...]

        if( LAPACKE_lsame( vect, 'b' ) || LAPACKE_lsame( vect, 'q' ) ) {
            LAPACKE_free( pt_t );
        }

where LAPACKE_lsame is a char-comparison function implemented in a
different TU.
The analyzer naively considers the execution path where:
  LAPACKE_lsame( vect, 'b' ) || LAPACKE_lsame( vect, 'p' )
is true at the malloc guard, but then false at the free guard, which
is thus a memory leak.

This patch makes -fanalyzer respect __attribute__((const)), so that the
analyzer treats such functions as returning the same value when given
the same inputs.
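
A minimal sketch of the pattern, assuming a hypothetical same_letter helper
as a stand-in for LAPACKE_lsame (the real function lives in OpenBLAS and is
not reproduced here).  With the attribute, both guards below are known to
evaluate to the same value for the same vect.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a char-comparison helper.  It depends only on
   its arguments and reads no global state, so const is appropriate.  */
__attribute__((const))
int same_letter(char a, char b)
{
  if (a >= 'a' && a <= 'z')
    a = (char)(a - 'a' + 'A');
  if (b >= 'a' && b <= 'z')
    b = (char)(b - 'a' + 'A');
  return a == b;
}

/* Allocates and frees under the same guard.  Returns 1 if the guarded
   path was taken, 0 otherwise.  */
int guarded_alloc_and_free(char vect)
{
  void *buf = NULL;
  int taken = 0;

  if (same_letter(vect, 'b') || same_letter(vect, 'p'))
    {
      buf = malloc(16);
      taken = 1;
    }

  /* Same condition as above, so the free pairs with the malloc.  */
  if (same_letter(vect, 'b') || same_letter(vect, 'p'))
    free(buf);

  return taken;
}
```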

I've filed https://github.com/xianyi/OpenBLAS/issues/3543 suggesting that
LAPACKE_lsame be annotated with __attribute__((const)); with that, and
with this patch, the false positives seem to be fixed.

gcc/analyzer/ChangeLog:
	PR analyzer/104434
	* analyzer.h (class const_fn_result_svalue): New decl.
	* region-model-impl-calls.cc (call_details::get_manager): New.
	* region-model-manager.cc
	(region_model_manager::get_or_create_const_fn_result_svalue): New.
	(region_model_manager::log_stats): Log
	m_const_fn_result_values_map.
	* region-model.cc (const_fn_p): New.
	(maybe_get_const_fn_result): New.
	(region_model::on_call_pre): Handle fndecls with
	__attribute__((const)) by calling the above rather than making
	a conjured_svalue.
	* region-model.h (visitor::visit_const_fn_result_svalue): New.
	(region_model_manager::get_or_create_const_fn_result_svalue): New
	decl.
	(region_model_manager::const_fn_result_values_map_t): New typedef.
	(region_model_manager::m_const_fn_result_values_map): New field.
	(call_details::get_manager): New decl.
	* svalue.cc (svalue::cmp_ptr): Handle SK_CONST_FN_RESULT.
	(const_fn_result_svalue::dump_to_pp): New.
	(const_fn_result_svalue::dump_input): New.
	(const_fn_result_svalue::accept): New.
	* svalue.h (enum svalue_kind): Add SK_CONST_FN_RESULT.
	(svalue::dyn_cast_const_fn_result_svalue): New.
	(class const_fn_result_svalue): New.
	(is_a_helper <const const_fn_result_svalue *>::test): New.
	(template <> struct default_hash_traits<const_fn_result_svalue::key_t>):
	New.

gcc/testsuite/ChangeLog:
	PR analyzer/104434
	* gcc.dg/analyzer/attr-const-1.c: New test.
	* gcc.dg/analyzer/attr-const-2.c: New test.
	* gcc.dg/analyzer/attr-const-3.c: New test.
	* gcc.dg/analyzer/pr104434-const.c: New test.
	* gcc.dg/analyzer/pr104434-nonconst.c: New test.
	* gcc.dg/analyzer/pr104434.h: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-23 18:51:26 -05:00
Marek Polacek
cdcea7c1ef c++: Add new test [PR79493]
A nice side effect of r12-1822 was improving the diagnostic
we emit for the following test.

	PR c++/79493

gcc/testsuite/ChangeLog:

	* g++.dg/diagnostic/undeclared1.C: New test.
2022-02-23 12:47:24 -05:00
Marek Polacek
9675ecf7f9 c++: Add fixed test [PR70077]
Fixed with r10-1280.

	PR c++/70077

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/noexcept76.C: New test.
2022-02-23 12:38:02 -05:00
Richard Biener
fdc46830f1 middle-end/104644 - recursion with bswap match.pd pattern
The following patch avoids infinite recursion during generic folding.
The (cmp (bswap @0) INTEGER_CST@1) simplification relies on
(bswap @1) actually being simplified, if it is not simplified, we just
move the bswap from one operand to the other and if @0 is also INTEGER_CST,
we apply the same rule next.
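
A minimal sketch of the equality case of this simplification, assuming the
GCC/Clang __builtin_bswap32 builtin.  The function names and the constant
are hypothetical.  Moving the byte swap onto the constant only pays off if
the swapped constant is then folded, which is the step the patch enforces.

```c
#include <stdint.h>

/* Form before simplification: swap the variable, compare with a constant.  */
int cmp_swapped_input(uint32_t x)
{
  return __builtin_bswap32(x) == 0x11223344u;
}

/* Form after simplification: the constant has been byte-swapped and
   folded, and the variable is compared directly.  */
int cmp_swapped_constant(uint32_t x)
{
  return x == 0x44332211u;
}
```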

The reason why bswap @1 isn't folded to INTEGER_CST is that the INTEGER_CST
has TREE_OVERFLOW set on it and fold-const-call.cc predicate punts in
such cases:
static inline bool
integer_cst_p (tree t)
{
  return TREE_CODE (t) == INTEGER_CST && !TREE_OVERFLOW (t);
}
The patch uses ! modifier to ensure the bswap is simplified and
extends support to GENERIC by means of requiring !EXPR_P which
is not perfect but a conservative approximation.

2022-02-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104644
	* doc/match-and-simplify.texi: Amend ! documentation.
	* genmatch.cc (expr::gen_transform): Code-generate ! support
	for GENERIC.
	(parser::parse_expr): Allow ! for GENERIC.
	* match.pd (cmp (bswap @0) INTEGER_CST@1): Use ! modifier on
	bswap.

	* gcc.dg/pr104644.c: New test.

Co-Authored-by: Jakub Jelinek <jakub@redhat.com>
2022-02-23 13:51:43 +01:00
Richard Biener
f4ed267fa5 Support SSA name declarations with pointer type
Currently we fail to parse

  int * _3;

as SSA name and instead get a VAR_DECL because of the way the C
frontend's declarator specs work.  That causes havoc if those
supposed SSA names are used in PHIs or in other places where
VAR_DECLs are not allowed.  The following fixes the pointer case
in an ad-hoc way - for more complex type declarators we probably
have to find a way to re-use the C frontend grokdeclarator without
actually creating a VAR_DECL there (or maybe make it create an
SSA name).

Pointers appear too often to be neglected though, thus the following
ad-hoc fix for this.  This also adds verification that we do not
end up with SSA names without definitions as can happen when
reducing a GIMPLE testcase.  Instead of working through segfaults
one-by-one we emit errors for all of those at once now.

2022-02-23  Richard Biener  <rguenther@suse.de>

gcc/c
	* gimple-parser.cc (c_parser_parse_gimple_body): Diagnose
	SSA names without definition.
	(c_parser_gimple_declaration): Handle pointer typed SSA names.

gcc/testsuite/
	* gcc.dg/gimplefe-49.c: New testcase.
	* gcc.dg/gimplefe-error-13.c: Likewise.
2022-02-23 12:15:30 +01:00
Richard Biener
6e80c4f1ad tree-optimization/101636 - CTOR vectorization ICE
The following fixes an ICE when vectorizing the defs of a CTOR
results in a different vector type than expected.  That can happen
with AARCH64 SVE and a fixed vector length as noted in r10-5979
and on x86 with AVX512 mask CTORs and trying to re-vectorize
using SSE as shown in this bug.

The fix is simply to reject the vectorization when it didn't
produce the desired type.

2022-02-23  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101636
	* tree-vect-slp.cc (vect_print_slp_tree): Dump the
	vector type of the node.
	(vect_slp_analyze_operations): Make sure the CTOR
	is vectorized with an expected type.
	(vectorize_slp_instance_root_stmt): Revert r10-5979 fix.

	* gcc.target/i386/pr101636.c: New testcase.
	* c-c++-common/torture/pr101636.c: Likewise.
2022-02-23 12:14:14 +01:00
Jakub Jelinek
c8cb5098c7 warn-recursion: Don't warn for __builtin_calls in gnu_inline extern inline functions [PR104633]
The first two testcases show different ways how e.g. the glibc
_FORTIFY_SOURCE wrappers are implemented, and on Winfinite-recursion-3.c
the new -Winfinite-recursion warning emits a false positive warning.

It is a false positive because when a builtin with 2 names is called
through the __builtin_ name (but not all builtins have a name prefixed
exactly like that) from extern inline function with gnu_inline semantics,
it doesn't mean the compiler will ever attempt to use the user inline
wrapper for the call, the __builtin_ just does what the builtin function
is expected to do and either expands into some compiler generated code,
or if the compiler decides to emit a call it will use an actual definition
of the function, but that is not the extern inline gnu_inline function
which is never emitted out of line.
Compared to that, in Winfinite-recursion-5.c the extern inline gnu_inline
wrapper calls the builtin by the same name as the function's name and in
that case it is infinite recursion, we actually try to inline the recursive
call and also error because the recursion is infinite during inlining;
without always_inline we wouldn't error but it is still infinite recursion,
the user has no control on how many recursive calls we actually inline.

2022-02-22  Jakub Jelinek  <jakub@redhat.com>

	PR c/104633
	* gimple-warn-recursion.cc (pass_warn_recursion::find_function_exit):
	Don't warn about calls to corresponding builtin from extern inline
	gnu_inline wrappers.

	* gcc.dg/Winfinite-recursion-3.c: New test.
	* gcc.dg/Winfinite-recursion-4.c: New test.
	* gcc.dg/Winfinite-recursion-5.c: New test.
2022-02-23 12:03:55 +01:00
Roger Sayle
0677014871 nvptx: Back-end portion of a fix for PR target/104489.
This one line fix/tweak is the back-end specific change for a fix for
PR target/104489, that allows the ISA for GCC's nvptx backend to be bumped
to sm_53.  The machine-independent middle-end pieces were posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html

2022-02-23  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/104489
	* config/nvptx/nvptx.md (*movhf_insn): Add subregs_ok attribute.
2022-02-23 07:24:50 +00:00
Christophe Lyon
fd0ab7c734 arm: Fix typo in auto-vectorized MVE comparisons
I made a last minute renaming of mve_const_bool_vec_to_hi () into
mve_bool_vec_to_const () and forgot to update the call sites in vfp.md
accordingly.

Committed as obvious.

2022-02-23  Christophe Lyon <christophe.lyon@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Fix
	typo.
2022-02-23 06:44:12 +00:00
Cui,Lili
2f0c93326f x86: Update Intel architectures ISA support in documentation.
The ISA support listed in the documentation for Intel architectures
is inconsistent with what is actually supported, so update all entries.

gcc/ChangeLog:

	* doc/invoke.texi: Update documents for Intel architectures.
2022-02-23 10:24:21 +08:00
GCC Administrator
2cfb33fc1e Daily bump. 2022-02-23 00:16:24 +00:00
Ian Lance Taylor
3d54f1ffaf libgo: update README.gcc
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/387514
2022-02-22 15:34:46 -08:00
Paul A. Clarke
96ee5ce5f8 rs6000: Move g++.dg/ext powerpc tests to g++.target
Also adjust DejaGnu directives, as specifically requiring "powerpc*-*-*" is no
longer required.

2021-02-22  Paul A. Clarke  <pc@us.ibm.com>

gcc/testsuite
	* g++.dg/ext/altivec-1.C: Move to g++.target/powerpc, adjust dg
	directives.
	* g++.dg/ext/altivec-2.C: Likewise.
	* g++.dg/ext/altivec-3.C: Likewise.
	* g++.dg/ext/altivec-4.C: Likewise.
	* g++.dg/ext/altivec-5.C: Likewise.
	* g++.dg/ext/altivec-6.C: Likewise.
	* g++.dg/ext/altivec-7.C: Likewise.
	* g++.dg/ext/altivec-8.C: Likewise.
	* g++.dg/ext/altivec-9.C: Likewise.
	* g++.dg/ext/altivec-10.C: Likewise.
	* g++.dg/ext/altivec-11.C: Likewise.
	* g++.dg/ext/altivec-12.C: Likewise.
	* g++.dg/ext/altivec-13.C: Likewise.
	* g++.dg/ext/altivec-14.C: Likewise.
	* g++.dg/ext/altivec-15.C: Likewise.
	* g++.dg/ext/altivec-16.C: Likewise.
	* g++.dg/ext/altivec-17.C: Likewise.
	* g++.dg/ext/altivec-18.C: Likewise.
	* g++.dg/ext/altivec-cell-1.C: Likewise.
	* g++.dg/ext/altivec-cell-2.C: Likewise.
	* g++.dg/ext/altivec-cell-3.C: Likewise.
	* g++.dg/ext/altivec-cell-4.C: Likewise.
	* g++.dg/ext/altivec-cell-5.C: Likewise.
	* g++.dg/ext/altivec-types-1.C: Likewise.
	* g++.dg/ext/altivec-types-2.C: Likewise.
	* g++.dg/ext/altivec-types-3.C: Likewise.
	* g++.dg/ext/altivec-types-4.C: Likewise.
	* g++.dg/ext/undef-bool-1.C: Likewise.
2022-02-22 17:26:15 -06:00
Harald Anlauf
bc66b471d1 Fortran: skip compile-time shape check if constructor shape is not known
gcc/fortran/ChangeLog:

	PR fortran/104619
	* resolve.cc (resolve_structure_cons): Skip shape check if shape
	of constructor cannot be determined at compile time.

gcc/testsuite/ChangeLog:

	PR fortran/104619
	* gfortran.dg/derived_constructor_comps_7.f90: New test.
2022-02-22 21:34:58 +01:00
Roger Sayle
9d1796d82d Restore bootstrap on x86_64-pc-linux-gnu
This patch resolves the bootstrap failure on x86_64-pc-linux-gnu.

2022-02-22  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386-expand.cc (ix86_expand_cmpxchg_loop): Restore
	bootstrap.
2022-02-22 18:17:24 +00:00
Thomas Schwinge
54f7450232 Get rid of 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref'
Clean-up for commit e2a58ed6dc
"openacc: Middle-end worker-partitioning support":
as of commit 2a3f9f6532
"openacc: Shared memory layout optimisation", we're no longer
running into the vectorizer ICEs for '!ADDR_SPACE_GENERIC_P'.

	gcc/
	* omp-low.cc (omp_build_component_ref): Move function...
	* omp-general.cc (omp_build_component_ref): ... here.  Remove
	'static'.
	* omp-general.h (omp_build_component_ref): Declare function.
	* omp-oacc-neuter-broadcast.cc (oacc_build_component_ref): Remove
	function.
	(build_receiver_ref, build_sender_ref): Call
	'omp_build_component_ref' instead.
2022-02-22 17:53:10 +01:00
Thomas Schwinge
0fe9176f41 Further simplify 'gcc/omp-oacc-neuter-broadcast.cc:record_field_map_t'
Now that I've resolved GCC 'hash_map' issues (a while ago already), we may
further simplify this after commit 049eda8274
"Avoid 'GTY' use for 'gcc/omp-oacc-neuter-broadcast.cc:field_map'": as
'hash_map' Value, directly store 'field_map_t' objects, not pointers to
manually allocated 'field_map_t' objects.

	gcc/
	* omp-oacc-neuter-broadcast.cc (record_field_map_t): Further
	simplify.  Adjust all users.
2022-02-22 17:43:39 +01:00
Thomas Schwinge
f8187b5c0d Fix OpenACC gang-redundant execution in 'libgomp.oacc-fortran/privatized-ref-2.f90'
This was a latent problem, and this commit here now resolves a regression that
after recent commit a78b1ab1df
"amdgcn: Tune default OpenMP/OpenACC GPU utilization" we had (only) seen on a
GCN offloading '-march=gfx908' system:

    {+WARNING: program timed out.+}
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  execution test

Same for other optimization levels.

Make sure that we're not executing non-parallelized code in gang-redundant
mode, by putting these parts into their own 'parallel' constructs, which then
default to 'num_gangs(1)'.

	libgomp/
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Fix OpenACC
	gang-redundant execution.
2022-02-22 17:32:03 +01:00
Segher Boessenkool
537c965880 rs6000: Fix GC on rs6000.c decls for atomic handling (PR88134)
In PR88134 it is pointed out that we do not have GTY markup for some
variables we use for atomic.  So, let's add that.

2022-02-22  Segher Boessenkool  <segher@kernel.crashing.org>

	PR target/88134
	* config/rs6000/rs6000.cc (atomic_hold_decl, atomic_clear_decl,
	atomic_update_decl): Add GTY markup.
2022-02-22 16:20:23 +00:00
Christophe Lyon
e9f8443a91 arm: Add VPR_REG to ALL_REGS
VPR_REG should be part of ALL_REGS, this patch fixes this omission.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	* config/arm/arm.h (REG_CLASS_CONTENTS): Add VPR_REG to ALL_REGS.
2022-02-22 15:55:09 +00:00
Christophe Lyon
c6b4ea7ab1 arm: Convert more MVE/CDE builtins to predicate qualifiers
This patch covers a few non-load/store builtins where we do not use
the <mode> iterator and thus we cannot use <MVE_vpred>.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/arm-builtins.cc (CX_UNARY_UNONE_QUALIFIERS): Use
	predicate.
	(CX_BINARY_UNONE_QUALIFIERS): Likewise.
	(CX_TERNARY_UNONE_QUALIFIERS): Likewise.
	(TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
	(QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
	(QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Delete.
	* config/arm/arm_mve_builtins.def: Use predicated qualifiers.
	* config/arm/mve.md: Use VxBI instead of HI.
2022-02-22 15:55:09 +00:00
Christophe Lyon
6a7c13a0cf arm: Convert more load/store MVE builtins to predicate qualifiers
This patch covers a few builtins where we do not use the <mode>
iterator and thus we cannot use <MVE_vpred>.

For v2di instructions, we keep the HI mode for predicates.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/arm-builtins.cc (STRSBS_P_QUALIFIERS): Use predicate
	qualifier.
	(STRSBU_P_QUALIFIERS): Likewise.
	(LDRGBS_Z_QUALIFIERS): Likewise.
	(LDRGBU_Z_QUALIFIERS): Likewise.
	(LDRGBWBXU_Z_QUALIFIERS): Likewise.
	(LDRGBWBS_Z_QUALIFIERS): Likewise.
	(LDRGBWBU_Z_QUALIFIERS): Likewise.
	(STRSBWBS_P_QUALIFIERS): Likewise.
	(STRSBWBU_P_QUALIFIERS): Likewise.
	* config/arm/mve.md: Use VxBI instead of HI.
2022-02-22 15:55:09 +00:00
Christophe Lyon
724d6566cd arm: Convert more MVE builtins to predicate qualifiers
This patch covers all builtins that have an HI operand and use the
<mode> iterator, thus we can replace HI with <MVE_vpred>.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/arm-builtins.cc (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
	(TERNOP_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
	(TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
	(TERNOP_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
	(TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
	(TERNOP_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
	(TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Change to ...
	(TERNOP_NONE_NONE_UNONE_PRED_QUALIFIERS): ... this.
	(QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
	(QUADOP_UNONE_UNONE_NONE_NONE_PRED_QUALIFIERS): ... this.
	(QUADOP_NONE_NONE_NONE_NONE_PRED_QUALIFIERS): New.
	(QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
	(QUADOP_NONE_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
	(QUADOP_UNONE_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
	(QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
	(QUADOP_UNONE_UNONE_NONE_IMM_PRED_QUALIFIERS): ... this.
	(QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
	(QUADOP_NONE_NONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
	(QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
	(QUADOP_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
	(QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
	(QUADOP_UNONE_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
	(STRS_P_QUALIFIERS): Use predicate qualifier.
	(STRU_P_QUALIFIERS): Likewise.
	(STRSU_P_QUALIFIERS): Likewise.
	(STRSS_P_QUALIFIERS): Likewise.
	(LDRGS_Z_QUALIFIERS): Likewise.
	(LDRGU_Z_QUALIFIERS): Likewise.
	(LDRS_Z_QUALIFIERS): Likewise.
	(LDRU_Z_QUALIFIERS): Likewise.
	(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
	(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
	(BINOP_NONE_NONE_PRED_QUALIFIERS): New.
	(BINOP_UNONE_UNONE_PRED_QUALIFIERS): New.
	* config/arm/arm_mve_builtins.def: Use new predicated qualifiers.
	* config/arm/mve.md: Use MVE_VPRED instead of HI.
2022-02-22 15:55:08 +00:00
Christophe Lyon
e6a4aefce8 arm: Convert remaining MVE vcmp builtins to predicate qualifiers
This is mostly a mechanical change, only tested by the intrinsics
expansion tests.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/arm-builtins.cc (BINOP_UNONE_NONE_NONE_QUALIFIERS):
	Delete.
	(TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
	(TERNOP_PRED_NONE_NONE_PRED_QUALIFIERS): ... this.
	(TERNOP_PRED_UNONE_UNONE_PRED_QUALIFIERS): New.
	* config/arm/arm_mve_builtins.def (vcmp*q_n_, vcmp*q_m_f): Use new
	predicated qualifiers.
	* config/arm/mve.md (mve_vcmp<mve_cmp_op>q_n_<mode>)
	(mve_vcmp*q_m_f<mode>): Use MVE_VPRED instead of HI.
2022-02-22 15:55:08 +00:00
Christophe Lyon
df0e57c2c0 arm: Fix vcond_mask expander for MVE (PR target/100757)
The problem in this PR is that we call VPSEL with a mask of vector
type instead of HImode. This happens because operand 3 in vcond_mask
is the pre-computed vector comparison and has vector type.

This patch fixes it by implementing TARGET_VECTORIZE_GET_MASK_MODE,
returning the appropriate VxBI mode when targeting MVE.  In turn, this
implies implementing vec_cmp<mode><MVE_vpred>,
vec_cmpu<mode><MVE_vpred> and vcond_mask_<mode><MVE_vpred>, and we can
move vec_cmp<mode><v_cmp_result>, vec_cmpu<mode><mode> and
vcond_mask_<mode><v_cmp_result> back to neon.md since they are not
used by MVE anymore.  The new *<MVE_vpred> patterns listed above are
implemented in mve.md since they are only valid for MVE. However, this
may make maintenance/comparison more painful than having all of them
in vec-common.md.

In the process, we can get rid of the recently added vcond_mve
parameter of arm_expand_vector_compare.

Compared to neon.md's vcond_mask_<mode><v_cmp_result> before my "arm:
Auto-vectorization for MVE: vcmp" patch (r12-834), it keeps the VDQWH
iterator added in r12-835 (to have V4HF/V8HF support), as well as the
(!<Is_float_mode> || flag_unsafe_math_optimizations) condition which
was not present before r12-834 although SF modes were enabled by VDQW
(I think this was a bug).

Using TARGET_VECTORIZE_GET_MASK_MODE has the advantage that we no
longer need to generate vpsel with vectors of 0 and 1: the masks are
now merged via scalar 'ands' instructions operating on 16-bit masks
after converting the boolean vectors.

In addition, this patch fixes a problem in arm_expand_vcond() where
the result would be a vector of 0 or 1 instead of operand 1 or 2.

Since we want to skip gcc.dg/signbit-2.c for MVE, we also add a new
arm_mve effective target.

Reducing the number of iterations in pr100757-3.c from 32 to 8, we
generate the code below:

float a[32];
float fn1(int d) {
  float c = 4.0f;
  for (int b = 0; b < 8; b++)
    if (a[b] != 2.0f)
      c = 5.0f;
  return c;
}

fn1:
	ldr     r3, .L3+48
	vldr.64 d4, .L3              // q2=(2.0,2.0,2.0,2.0)
	vldr.64 d5, .L3+8
	vldrw.32        q0, [r3]     // q0=a(0..3)
	adds    r3, r3, #16
	vcmp.f32        eq, q0, q2   // cmp a(0..3) == (2.0,2.0,2.0,2.0)
	vldrw.32        q1, [r3]     // q1=a(4..7)
	vmrs     r3, P0
	vcmp.f32        eq, q1, q2   // cmp a(4..7) == (2.0,2.0,2.0,2.0)
	vmrs    r2, P0  @ movhi
	ands    r3, r3, r2           // r3=select(a(0..3)) & select(a(4..7))
	vldr.64 d4, .L3+16           // q2=(5.0,5.0,5.0,5.0)
	vldr.64 d5, .L3+24
	vmsr     P0, r3
	vldr.64 d6, .L3+32           // q3=(4.0,4.0,4.0,4.0)
	vldr.64 d7, .L3+40
	vpsel q3, q3, q2             // q3=vcond_mask(4.0,5.0)
	vmov.32 r2, q3[1]            // keep the scalar max
	vmov.32 r0, q3[3]
	vmov.32 r3, q3[2]
	vmov.f32        s11, s12
	vmov    s15, r2
	vmov    s14, r3
	vmaxnm.f32      s15, s11, s15
	vmaxnm.f32      s15, s15, s14
	vmov    s14, r0
	vmaxnm.f32      s15, s15, s14
	vmov    r0, s15
	bx      lr
	.L4:
	.align  3
	.L3:
	.word   1073741824	// 2.0f
	.word   1073741824
	.word   1073741824
	.word   1073741824
	.word   1084227584	// 5.0f
	.word   1084227584
	.word   1084227584
	.word   1084227584
	.word   1082130432	// 4.0f
	.word   1082130432
	.word   1082130432
	.word   1082130432
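
As a reading aid, the sequence above can be modelled in plain C.  This
is only a sketch: model_vcmp_eq_f32, fn1_model and check_fn1 are
hypothetical names that exist neither in GCC nor in the testsuite, and
the model assumes the predicate layout of four bits per 32-bit lane.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference: the testcase quoted above, with the array passed
   as a parameter instead of being a global.  */
float
fn1_scalar (const float *a)
{
  float c = 4.0f;
  for (int b = 0; b < 8; b++)
    if (a[b] != 2.0f)
      c = 5.0f;
  return c;
}

/* Hypothetical model of "vcmp.f32 eq" followed by "vmrs rN, P0":
   each 32-bit lane contributes four identical predicate bits.  */
uint16_t
model_vcmp_eq_f32 (const float *v, float x)
{
  uint16_t p = 0;
  for (int lane = 0; lane < 4; lane++)
    if (v[lane] == x)
      p |= (uint16_t) (0xfu << (4 * lane));
  return p;
}

/* Model of the generated code: two compares, a scalar AND of the two
   16-bit masks, one vpsel, then a max reduction over the four lanes.  */
float
fn1_model (const float *a)
{
  uint16_t p = model_vcmp_eq_f32 (a, 2.0f) & model_vcmp_eq_f32 (a + 4, 2.0f);
  float best = 0.0f;
  for (int lane = 0; lane < 4; lane++)
    {
      /* All four bits of a lane are equal here, so testing one bit is
	 enough for this model.  */
      float sel = ((p >> (4 * lane)) & 1) ? 4.0f : 5.0f;
      if (lane == 0 || sel > best)
	best = sel;
    }
  return best;
}

/* Convenience check: the model must agree with the scalar loop.  */
void
check_fn1 (const float *a)
{
  assert (fn1_model (a) == fn1_scalar (a));
}
```

The point of the sketch is that the two comparison results are combined
with a single integer AND on the 16-bit masks, and only one select is
needed afterwards.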

This patch adds tests that trigger an ICE without this fix.

The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
various types and return values different from 0 and 1 to avoid
commonalization with boolean masks.  In addition, since we should not
need these masks, the tests make sure they are not present.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	PR target/100757
	gcc/
	* config/arm/arm-protos.h (arm_get_mask_mode): New prototype.
	(arm_expand_vector_compare): Update prototype.
	* config/arm/arm.cc (TARGET_VECTORIZE_GET_MASK_MODE): New.
	(arm_vector_mode_supported_p): Add support for VxBI modes.
	(arm_expand_vector_compare): Remove useless generation of vpsel.
	(arm_expand_vcond): Fix select operands.
	(arm_get_mask_mode): New.
	* config/arm/mve.md (vec_cmp<mode><MVE_vpred>): New.
	(vec_cmpu<mode><MVE_vpred>): New.
	(vcond_mask_<mode><MVE_vpred>): New.
	* config/arm/vec-common.md (vec_cmp<mode><v_cmp_result>)
	(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): Move to ...
	* config/arm/neon.md (vec_cmp<mode><v_cmp_result>)
	(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): ... here
	and disable for MVE.
	* doc/sourcebuild.texi (arm_mve): Document new effective-target.

	gcc/testsuite/
	PR target/100757
	* gcc.target/arm/simd/pr100757-2.c: New.
	* gcc.target/arm/simd/pr100757-3.c: New.
	* gcc.target/arm/simd/pr100757-4.c: New.
	* gcc.target/arm/simd/pr100757.c: New.
	* gcc.dg/signbit-2.c: Skip when targeting ARM/MVE.
	* lib/target-supports.exp (check_effective_target_arm_mve): New.
2022-02-22 15:55:07 +00:00
Christophe Lyon
91224cf625 arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates
We make use of qualifier_predicate to describe MVE builtin
prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins,
as they are exercised by the tests added earlier in the series.

Special handling is needed for mve_vpselq because it has a v2di
variant, which has no natural VPR.P0 representation: we keep HImode
for it.

The vector_compare expansion code is updated to use the right VxBI
mode instead of HI for the result.

We extend the existing thumb2_movhi_vfp and thumb2_movhi_fp16 patterns
to use the new MVE_7_HI iterator which covers HI and the new VxBI
modes, in conjunction with the new DB constraint for a constant vector
of booleans.

This patch also adds tests derived from the one provided in PR
target/101325: there is a compile-only test because I did not have
access to anything that could execute MVE code until recently.  I have
been able to add an executable test since QEMU supports MVE.

Instead of adding arm_v8_1m_mve_hw, I update arm_mve_hw so that it
uses add_options_for_arm_v8_1m_mve_fp, like arm_neon_hw does.  This
ensures arm_mve_hw passes even if the toolchain does not generate MVE
code by default.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>
	    Richard Sandiford  <richard.sandiford@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/arm/arm-builtins.cc (BINOP_PRED_UNONE_UNONE_QUALIFIERS)
	(BINOP_PRED_NONE_NONE_QUALIFIERS)
	(TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS)
	(TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
	* config/arm/arm-protos.h (mve_bool_vec_to_const): New.
	* config/arm/arm.cc (arm_hard_regno_mode_ok): Handle new VxBI
	modes.
	(arm_mode_to_pred_mode): New.
	(arm_expand_vector_compare): Use the right VxBI mode instead of
	HI.
	(arm_expand_vcond): Likewise.
	(simd_valid_immediate): Handle MODE_VECTOR_BOOL.
	(mve_bool_vec_to_const): New.
	(neon_make_constant): Call mve_bool_vec_to_const when needed.
	* config/arm/arm_mve_builtins.def (vcmpneq_, vcmphiq_, vcmpcsq_)
	(vcmpltq_, vcmpleq_, vcmpgtq_, vcmpgeq_, vcmpeqq_, vcmpneq_f)
	(vcmpltq_f, vcmpleq_f, vcmpgtq_f, vcmpgeq_f, vcmpeqq_f, vpselq_u)
	(vpselq_s, vpselq_f): Use new predicated qualifiers.
	* config/arm/constraints.md (DB): New.
	* config/arm/iterators.md (MVE_7, MVE_7_HI): New mode iterators.
	(MVE_VPRED, MVE_vpred): New attribute iterators.
	* config/arm/mve.md (@mve_vcmp<mve_cmp_op>q_<mode>)
	(@mve_vcmp<mve_cmp_op>q_f<mode>, @mve_vpselq_<supf><mode>)
	(@mve_vpselq_f<mode>): Use MVE_VPRED instead of HI.
	(@mve_vpselq_<supf>v2di): Define separately.
	(mov<mode>): New expander for VxBI modes.
	* config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Use
	MVE_7_HI iterator and add support for DB constraint.

	gcc/testsuite/
	PR target/100757
	PR target/101325
	* gcc.dg/rtl/arm/mve-vxbi.c: New test.
	* gcc.target/arm/simd/pr101325.c: New.
	* gcc.target/arm/simd/pr101325-2.c: New.
	* lib/target-supports.exp (check_effective_target_arm_mve_hw): Use
	add_options_for_arm_v8_1m_mve_fp.
2022-02-22 15:55:07 +00:00
Christophe Lyon
884f77b422 arm: Implement MVE predicates as vectors of booleans
This patch implements support for vectors of booleans to support MVE
predicates, instead of HImode.  Since the ABI mandates pred16_t (aka
uint16_t) to represent predicates in intrinsics prototypes, we
introduce a new "predicate" type qualifier so that we can map relevant
builtins HImode arguments and return value to the appropriate vector
of booleans (VxBI).

We have to update test_vector_ops_duplicate, because it iterates using
an offset in bytes, where we would need to iterate in bits: we stop
iterating when we reach the end of the vector of booleans.

In addition, we have to fix the underlying definition of vectors of
booleans because ARM/MVE needs a different representation than
AArch64/SVE. With ARM/MVE the 'true' bit is duplicated over the
element size, so that a true element of V4BI is represented by
'0b1111'.  This patch updates the aarch64 definition of VNx*BI as
needed.
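
To illustrate the representation described above, here is a minimal C
sketch that packs per-lane booleans into a 16-bit predicate.  The
helper encode_pred is hypothetical (it is not part of GCC), and the
2-bit and 1-bit lane widths for V8BI and V16BI are an assumption
extrapolated from the V4BI example, taking one predicate bit per byte
of a 128-bit vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode NLANES booleans into a 16-bit MVE-style
   predicate.  A true lane sets every bit belonging to that lane, so a
   true element of a 4-lane vector becomes 0b1111.  */
uint16_t
encode_pred (const int *lanes, int nlanes)
{
  assert (nlanes == 4 || nlanes == 8 || nlanes == 16);

  int bits_per_lane = 16 / nlanes;	/* 4, 2 or 1 bits per lane.  */
  uint16_t lane_mask = (uint16_t) ((1u << bits_per_lane) - 1);
  uint16_t pred = 0;

  for (int i = 0; i < nlanes; i++)
    if (lanes[i])
      pred |= (uint16_t) (lane_mask << (i * bits_per_lane));

  return pred;
}
```

For example, under these assumptions the 4-lane input {1, 0, 0, 1}
encodes to 0xf00f.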

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>
	    Richard Sandiford  <richard.sandiford@arm.com>

	gcc/
	PR target/100757
	PR target/101325
	* config/aarch64/aarch64-modes.def (VNx16BI, VNx8BI, VNx4BI,
	VNx2BI): Update definition.
	* config/arm/arm-builtins.cc (arm_init_simd_builtin_types): Add new
	simd types.
	(arm_init_builtin): Map predicate vector arguments to HImode.
	(arm_expand_builtin_args): Move HImode predicate arguments to VxBI
	rtx. Move return value to HImode rtx.
	* config/arm/arm-builtins.h (arm_type_qualifiers): Add qualifier_predicate.
	* config/arm/arm-modes.def (B2I, B4I, V16BI, V8BI, V4BI): New modes.
	* config/arm/arm-simd-builtin-types.def (Pred1x16_t,
	Pred2x8_t, Pred4x4_t): New.
	* emit-rtl.cc (init_emit_once): Handle all boolean modes.
	* genmodes.cc (mode_data): Add boolean field.
	(blank_mode): Initialize it.
	(make_complex_modes): Fix handling of boolean modes.
	(make_vector_modes): Likewise.
	(VECTOR_BOOL_MODE): Use new COMPONENT parameter.
	(make_vector_bool_mode): Likewise.
	(BOOL_MODE): New.
	(make_bool_mode): New.
	(emit_insn_modes_h): Fix generation of boolean modes.
	(emit_class_narrowest_mode): Likewise.
	* machmode.def (VECTOR_BOOL_MODE): Document new COMPONENT
	parameter.  Use new BOOL_MODE instead of FRACTIONAL_INT_MODE to
	define BImode.
	* rtx-vector-builder.cc (rtx_vector_builder::find_cached_value):
	Fix handling of constm1_rtx for VECTOR_BOOL.
	* simplify-rtx.cc (native_encode_rtx): Fix support for VECTOR_BOOL.
	(native_decode_vector_rtx): Likewise.
	(test_vector_ops_duplicate): Skip vec_merge test
	with vectors of booleans.
	* varasm.cc (output_constant_pool_2): Likewise.
2022-02-22 15:55:07 +00:00
Christophe Lyon
0d0aaea105 arm: Fix mve_vmvnq_n_<supf><mode> argument mode
The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use
<V_elem> iterator instead of HI in mve_vmvnq_n_<supf><mode>.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	* config/arm/mve.md (mve_vmvnq_n_<supf><mode>): Use V_elem mode
	for operand 1.
2022-02-22 15:55:06 +00:00
Christophe Lyon
6769084fdf arm: Add support for VPR_REG in arm_class_likely_spilled_p
VPR_REG is the only register in its class, so it should be handled by
TARGET_CLASS_LIKELY_SPILLED_P, which is achieved by calling
default_class_likely_spilled_p.  No test fails without this patch, but
it seems it should be implemented.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	gcc/
	* config/arm/arm.cc (arm_class_likely_spilled_p): Handle VPR_REG.
2022-02-22 15:55:06 +00:00