Commit Graph

187182 Commits

Author SHA1 Message Date
Richard Biener
3c91efec15 tree-optimization/101615 - SLP permute opt with CTOR roots
CTOR roots are not explicitely represented so we have to make sure
to materialize permutes on SLP graph entries to them.

2021-07-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101615
	* tree-vect-slp.c (vect_optimize_slp): Materialize permutes
	at CTOR SLP graph entries.

	* gcc.dg/vect/bb-slp-pr101615-2.c: New testcase.
2021-07-28 17:41:25 +02:00
Kyrylo Tkachov
8b06ccb20e aarch64: Add smov alternative to sign_extend pattern
In the testcase here we were generating a umov + sxth to move
a half-word value from SIMD to GP regs with sign-extension.
We can use a single smov instruction for it instead but the
sign-extend pattern was missing the right alternative.
The *zero_extend<SHORT:mode><GPI:mode>2_aarch64 pattern for
zero-extension already has the right alternative for
the analogous umov instruction, so this mirrors that pattern.

Bootstrapped and tested on aarch64-none-linux-gnu.

The test gcc.target/aarch64/sve/clastb_4.c is adjusted to scan for
the clastb  h0, p0, h0, z0.h form
instead of
the clastb  w0, p0, w0, z0.h form.

This is an improvement as the W forms of the clast instructions are more expensive.

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*extend<SHORT:mode><GPI:mode>2_aarch64):
	Add "r,w" alternative.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/smov_1.c: New test.
	* gcc.target/aarch64/sve/clastb_4.c: Adjust clast scan-assembler.
2021-07-28 16:34:03 +01:00
H.J. Lu
9775e465c1 x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
There is no SSE <-> AVX transition penalty if the upper bits of YMM/ZMM
registers are unchanged and YMM/ZMM store doesn't change the upper bits
of YMM/ZMM registers.

1. Since zeroing YMM/ZMM register is implemented with zeroing XMM
register, don't set AVX_U128_DIRTY when zeroing YMM/ZMM register.
2. Since store doesn't change the INIT state on the upper bits of
YMM/ZMM register, don't set AVX_U128_DIRTY on store if the source
of store was never non-zero.

Here are the vzeroupper count differences on SPEC CPU 2017 with

-Ofast -march=skylake-avx512

                Before  After    Diff
500.perlbench_r	226	225	-0.44%
502.gcc_r      	1263	1103	-12.67%
503.bwaves_r   	14	14	0.00%
505.mcf_r      	29	28	-3.45%
507.cactuBSSN_r	4651	4628	-0.49%
508.namd_r     	433	432	-0.23%
510.parest_r   	20380	19347	-5.07%
511.povray_r   	495	452	-8.69%
519.lbm_r      	2	2	0.00%
520.omnetpp_r  	5954	5677	-4.65%
521.wrf_r      	12353	12339	-0.11%
523.xalancbmk_r	13137	13001	-1.04%
525.x264_r     	192	191	-0.52%
526.blender_r  	2515	2366	-5.92%
527.cam4_r     	4601	4583	-0.39%
531.deepsjeng_r	20	19	-5.00%
538.imagick_r  	898	805	-10.36%
541.leela_r    	427	399	-6.56%
544.nab_r      	74	74	0.00%
548.exchange2_r	72	72	0.00%
549.fotonik3d_r	318	318	0.00%
554.roms_r     	558	554	-0.72%
557.xz_r       	79	52	-34.18%

and performance differences are within noise range.

gcc/

	PR target/101456
	* config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set
	AVX_U128_DIRTY when all bits are zero.

gcc/testsuite/

	PR target/101456
	* gcc.target/i386/pr101456-1.c: New test.
	* gcc.target/i386/pr101456-2.c: Likewise.
2021-07-28 07:15:48 -07:00
Richard Biener
6bb6e2044c tree-optimization/101615 - SLP permute opt of existing vectors
This fixes one issue discovered when analyzing PR101615, namely
we happily push permutes to pre-existing vectors but end up
not actually permuting them.  In fact we don't want to, so force
materialization on the external.

It doesn't fix the original testcase though.

2021-07-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101615
	* tree-vect-slp.c (vect_optimize_slp): Pre-existing vector
	external nodes cannot be permuted so make them perm_out 0.

	* gcc.dg/vect/bb-slp-pr101615-1.c: New testcase.
2021-07-28 15:14:19 +02:00
Andrew Stubbs
1af1666694 amdgcn: Fix attributes for LLVM-12 [PR 100208]
This should work for a wider range of LLVM 12 variants now.
More work required for LLVM 13 though.

gcc/ChangeLog:

	PR target/100208
	* config.in: Regenerate.
	* config/gcn/gcn-hsa.h (A_FIJI): New define.
	(A_900): New define.
	(A_906): New define.
	(A_908): New define.
	(ASM_SPEC): Use A_FIJI, A_900, A_906 and A_908.
	* config/gcn/gcn.c (output_file_start): Adjust attributes according
	to the assembler capabilities.
	* config/gcn/mkoffload.c (main): Likewise.
	* configure: Regenerate.
	* configure.ac: Add tests for LLVM assembler attribute features.
2021-07-28 13:53:05 +01:00
Andrew MacLeod
04600a4722 Return undefined on edges which are not executed.
When a branch has been folded, mark any range requests on the unexecutable edge as
UNDEFINED.

	* gimple-range-gori.cc (gori_compute::outgoing_edge_range_p): Check for
	cond_false and cond_true on branches.
2021-07-28 08:32:38 -04:00
Siddhesh Poyarekar
31534ac26e analyzer: Handle strdup builtins
Consolidate allocator builtin handling and add support for
__builtin_strdup and __builtin_strndup.

gcc/analyzer/ChangeLog:

	* analyzer.cc (is_named_call_p, is_std_named_call_p): Make
	first argument a const_tree.
	* analyzer.h (is_named_call_p, -s_std_named_call_p): Likewise.
	* sm-malloc.cc (known_allocator_p): New function.
	(malloc_state_machine::on_stmt): Use it.

gcc/testsuite/ChangeLog:

	* gcc.dg/analyzer/strdup-1.c (test_4, test_5, test_6): New
	tests.
2021-07-28 17:43:26 +05:30
Siddhesh Poyarekar
84606efb0c analyzer: Recognize __builtin_free as a matching deallocator
Recognize __builtin_free as being equivalent to free when passed into
__attribute__((malloc ())), similar to how it is treated when it is
encountered as a call.  This fixes spurious warnings in glibc where
xmalloc family of allocators as well as reallocarray, memalign,
etc. are declared to have __builtin_free as the free function.

gcc/analyzer/ChangeLog:

	* sm-malloc.cc
	(malloc_state_machine::get_or_create_deallocator): Recognize
	__builtin_free.

gcc/testsuite/ChangeLog:

	* gcc.dg/analyzer/attr-malloc-1.c (compatible_alloc,
	compatible_alloc2): New extern allocator declarations.
	(test_9, test_10): New tests.
2021-07-28 17:43:15 +05:30
Iain Buclaw
54ec50bada d: Wrong evaluation order of binary expressions (PR101640)
The use of fold_build2 can in some cases swap the order of its operands
if that is the more optimal thing to do.  However this breaks semantic
guarantee of left-to-right evaluation in D.

	PR d/101640

gcc/d/ChangeLog:

	* expr.cc (binary_op): Use build2 instead of fold_build2.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr96429.d: Update test.
	* gdc.dg/pr101640.d: New test.
2021-07-28 13:13:06 +02:00
Iain Buclaw
c936c39f86 d: fix ICE at convert_expr(tree_node*, Type*, Type*) (PR101490)
Both the front-end and code generator had a modulo by zero bug when testing if
a conversion from a static array to dynamic array was valid.

	PR d/101490

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd 27e388b4c.
	* d-codegen.cc (build_array_index): Handle void arrays same as byte.
	* d-convert.cc (convert_expr): Handle converting to zero-sized arrays.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr101490.d: New test.
2021-07-28 13:13:05 +02:00
Iain Buclaw
1a2306ffe7 d: __FUNCTION__ doesn't work in core.stdc.stdio functions without cast (PR101441)
Backports fix from upstream to allow __FUNCTION__ and
__PRETTY_FUNCTION__ to be used as C string literals.

Reviewed-on: https://github.com/dlang/dmd/pull/12923

	PR d/101441

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd f8c1ca928.
2021-07-28 13:13:05 +02:00
Iain Buclaw
b2f6e1de24 d: Compile-time reflection for supported built-ins (PR101127)
In order to allow user-code to determine whether a back-end builtin is
available without error, LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE has been
defined to delay putting back-end builtin functions until the ISA that
defines them has been declared.

However in D, there is no global namespace.  All builtins get pushed
into the `gcc.builtins' module, which is constructed during the semantic
analysis pass, which has already finished by the time target attributes
are evaluated.  So builtins are not pushed by the new langhook because
they would be ultimately ignored.  Builtins exposed to D code then can
now only be altered by the command-line.

	PR d/101127

gcc/d/ChangeLog:

	* d-builtins.cc (d_builtin_function_ext_scope): New function.
	* d-lang.cc (LANG_HOOKS_BUILTIN_FUNCTION_EXT_SCOPE): Define.
	* d-tree.h (d_builtin_function_ext_scope): Declare.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr101127a.d: New test.
	* gdc.dg/pr101127b.d: New test.
2021-07-28 13:13:05 +02:00
Iain Buclaw
3e21361174 d: Change in DotTemplateExp type semantics leading to regression (PR101619)
By giving dot templates a type, meant that properry resolving silently
started passing for code that should never have passed.  The simple fix
is to provide implementations for checkType and checkValue that give an
error about dot templates having neither a value nor type.

Reviewed-on: https://github.com/dlang/dmd/pull/12920

	PR d/101619

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd 1d8386a63.
2021-07-28 13:13:04 +02:00
Ilya Leoshkevich
ea22954e7c IBM Z: Enable LSan and TSan
libsanitizer/ChangeLog:

	* configure.tgt (s390*-*-linux*): Enable LSan and TSan for
	s390x.
2021-07-28 13:03:32 +02:00
Bin Cheng
b662250c1f AArch64: use stable sorting in generating ldp/stp
In some corner cases, we have code as below:
  [base + 0x310] = A
  [base + 0x320] = B
  [base + 0x330] = C
  [base + 0x320] = D
unstable sorting could result in wrong value in offset 0x320.  The
patch fixes it by using gcc_stablesort.

2021-07-28  Bin Cheng  <bin.cheng@linux.alibaba.com>

	* config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): use
	gcc_stablesort.
2021-07-28 17:50:59 +08:00
Bin Cheng
0f95c6b2f7 Don't skip prologue/epilogue when initializing alias.
Register might be modified in prologue/epilogue, which shouldn't
be skipped in alias info analysis.

2021-07-28  Bin Cheng  <bin.cheng@linux.alibaba.com>

gcc/
	* alias.c (init_alias_analysis): Don't skip prologue/epilogue.
2021-07-28 17:44:35 +08:00
Jakub Jelinek
88d0f70a32 i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]
AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
it only supports logical and not arithmetic shifts, only AVX512F for
V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
Earlier in GCC12 cycle I've committed vector >> scalar arithmetic shift
emulation using various sequences, this patch handles the vector >> vector
case.  No need to adjust costs, the previous cost adjustment actually
covers even the vector by vector shifts.
The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
V{2,4}DImode shifts (once of the original operands, once of sign mask
constant by the vector shift count), xor and subtraction, on each element
(long long) x >> y is done as
(((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
- (0x8000000000000000ULL >> y)
i.e. if x doesn't have in some element the MSB set, it is just the logical
shift, if it does, then the xor and subtraction cause also all higher bits
to be set.

2021-07-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/101611
	* config/i386/sse.md (vashr<mode>3): Split into vashrv8di3 expander
	and vashrv4di3 expander, where the latter requires just TARGET_AVX2
	and has special !TARGET_AVX512VL expansion.
	(vashrv2di3<mask_name>): Rename to ...
	(vashrv2di3): ... this.  Change condition to TARGET_XOP || TARGET_AVX2
	and add special !TARGET_XOP && !TARGET_AVX512VL expansion.

	* gcc.target/i386/avx2-pr101611-1.c: New test.
	* gcc.target/i386/avx2-pr101611-2.c: New test.
2021-07-28 10:52:51 +02:00
Martin Uecker
8af0c50a29 Correct a mistake in a warnung for -Wnonnull.
In the warning for -Wnonnull when warning about array parameters
with bounds > 0 and which are NULL the numbers referring to the
two arguments are switched. This patch corrects the mistake.

2021-07-28  Martin Uecker  <muecker@gwdg.de>

gcc/
	* calls.c (maybe_warn_rdwr_sizes): Correct argument
	numbers in warning that were switched.

gcc/testsuite/
	* gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.
2021-07-28 08:48:30 +02:00
Sandra Loosemore
e78480ad09 Bind(c): Improve error checking in CFI_* functions
This patch adds additional run-time checking for invalid arguments to
CFI_establish and CFI_setpointer.  It also changes existing messages
throughout the CFI_* functions to use PRIiPTR to format CFI_index_t
values instead of casting them to int and using %d (which may not work
on targets where int is a smaller type), simplifies wording of some
messages, and fixes issues with capitalization, typos, and the like.
Additionally some coding standards problems such as >80 character lines
are addressed.

2021-07-24  Sandra Loosemore  <sandra@codesourcery.com>

	PR libfortran/101317

libgfortran/
	* runtime/ISO_Fortran_binding.c: Include <inttypes.h>.
	(CFI_address): Tidy error messages and comments.
	(CFI_allocate): Likewise.
	(CFI_deallocate): Likewise.
	(CFI_establish): Likewise.  Add new checks for validity of
	elem_len when it's used, plus type argument and extents.
	(CFI_is_contiguous): Tidy error messages and comments.
	(CFI_section): Likewise.  Refactor some repetitive code to
	make it more understandable.
	(CFI_select_part): Likewise.
	(CFI_setpointer): Likewise.  Check that source is not an
	unallocated allocatable array or an assumed-size array.

gcc/testsuite/
	* gfortran.dg/ISO_Fortran_binding_17.f90: Fix typo in error
	message patterns.
2021-07-27 21:24:26 -07:00
Sandra Loosemore
b4a9bc7856 Bind(c): Fix bugs in CFI_section
CFI_section was incorrectly adjusting the base pointer for the result
array twice in different ways.  It was also overwriting the array
dimension info in the result descriptor before computing the base
address offset from the source descriptor, which caused problems if
the two descriptors are the same.  This patch fixes both problems and
makes the code simpler, too.

A consequence of this patch is that the result array is now 0-based in
all dimensions instead of starting at the numbering to match the first
element of the source array.  The Fortran standard only specifies the
shape of the result array, not its lower bounds, so this is permitted
and probably less confusing for users as well as implementors.

2021-07-17  Sandra Loosemore  <sandra@codesourcery.com>

	PR libfortran/101310

libgfortran/
	* runtime/ISO_Fortran_binding.c (CFI_section): Fix the base
	address computation and simplify the code.

gcc/testsuite/
	* gfortran.dg/ISO_Fortran_binding_1.c (section_c): Remove
	incorrect assertions.
2021-07-27 21:24:25 -07:00
Sandra Loosemore
a3b350f179 Fix ISO_Fortran_binding.h paths in gfortran testsuite
ISO_Fortran_binding.h is now generated in the libgfortran build
directory where it is on the default include path.  Adjust includes in
the gfortran testsuite not to include an explicit path pointing at the
source directory.

2021-07-27  Sandra Loosemore  <sandra@codesourcery.com>

gcc/testsuite/
	PR libfortran/101305
	* gfortran.dg/ISO_Fortran_binding_1.c: Adjust include path.
	* gfortran.dg/ISO_Fortran_binding_10.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_11.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_12.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_15.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_16.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_17.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_18.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_3.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_5.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_6.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_7.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_8.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_9.c: Likewise.
	* gfortran.dg/PR94327.c: Likewise.
	* gfortran.dg/PR94331.c: Likewise.
	* gfortran.dg/bind_c_array_params_3_aux.c: Likewise.
	* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: Likewise.
	* gfortran.dg/pr93524.c: Likewise.
2021-07-27 21:24:11 -07:00
Sandra Loosemore
c4dc9f5901 Bind(C): Correct sizes of some types in CFI_establish
CFI_establish was failing to set the default elem_len correctly for
CFI_type_cptr, CFI_type_cfunptr, CFI_type_long_double, and
CFI_type_long_double_Complex.

2021-07-13  Sandra Loosemore  <sandra@codesourcery.com>

libgfortran/
	PR libfortran/101305
	* runtime/ISO_Fortran_binding.c (CFI_establish): Special-case
	CFI_type_cptr and CFI_type_cfunptr.  Correct size of long double
	on targets where it has kind 10.
2021-07-27 21:20:21 -07:00
Sandra Loosemore
fef67987cf Bind(C): Fix type encodings in ISO_Fortran_binding.h
ISO_Fortran_binding.h had many incorrect hardwired kind encodings in
the definitions of the CFI_type_* macros.  Additionally, not all
targets support all the defined type encodings, and the Fortran
standard requires those macros to have a negative value.

This patch changes ISO_Fortran_binding.h to use sizeof instead of
hard-coded sizes, and assembles it from fragments that reflect the
set of types supported by the target.

2021-07-22  Sandra Loosemore  <sandra@codesourcery.com>
	    Tobias Burnus  <tobias@codesourcery.com>

libgfortran/
	PR libfortran/101305
	* ISO_Fortran_binding.h: Fix hard-coded sizes and split into...
	* ISO_Fortran_binding-1-tmpl.h: New file.
	* ISO_Fortran_binding-2-tmpl.h: New file.
	* ISO_Fortran_binding-3-tmpl.h: New file.
	* Makefile.am: Add rule for generating ISO_Fortran_binding.h.
	Adjust pathnames to that file.
	* Makefile.in: Regenerated.
	* mk-kinds-h.sh: New file.
	* runtime/ISO_Fortran_binding.c: Fix include path.
2021-07-27 21:20:21 -07:00
Kewen Lin
89b3c97eed vect: Fix wrong check in vect_recog_mulhs_pattern [PR101596]
As PR101596 showed, vect_recog_mulhs_pattern uses target_precision to
check the scale_term is expected or not, it could be wrong when the
precision of the actual used new_type larger than target_precision as
shown by the example.

This patch is to use precision of new_type instead of target_precision
for the scale_term matching check.

Bootstrapped & regtested on powerpc64le-linux-gnu P10,
powerpc64-linux-gnu P8, x86_64-redhat-linux and aarch64-linux-gnu.

gcc/ChangeLog:

	PR tree-optimization/101596
	* tree-vect-patterns.c (vect_recog_mulhs_pattern): Fix wrong check
	by using new_type's precision instead.

gcc/testsuite/ChangeLog:

	PR tree-optimization/101596
	* gcc.target/powerpc/pr101596-1.c: New test.
	* gcc.target/powerpc/pr101596-2.c: Likewise.
	* gcc.target/powerpc/pr101596-3.c: Likewise.
2021-07-27 22:04:22 -05:00
liuhongt
872da9a6f6 Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct.
gcc/ChangeLog:

	PR target/99881
	* config/i386/i386.h (processor_costs): Add new member
	integer_to_sse.
	* config/i386/x86-tune-costs.h (ix86_size_cost, i386_cost,
	i486_cost, pentium_cost, lakemont_cost, pentiumpro_cost,
	geode_cost, k6_cost, athlon_cost, k8_cost, amdfam10_cost,
	bdver_cost, znver1_cost, znver2_cost, znver3_cost,
	btver1_cost, btver2_cost, btver3_cost, pentium4_cost,
	nocona_cost, atom_cost, atom_cost, slm_cost, intel_cost,
	generic_cost, core_cost): Initialize integer_to_sse same value
	as sse_op.
	(skylake_cost): Initialize integer_to_sse twice as much as sse_op.
	* config/i386/i386.c (ix86_builtin_vectorization_cost):
	Use integer_to_sse instead of sse_op to calculate the cost of
	vec_construct.

gcc/testsuite/ChangeLog:

	PR target/99881
	* gcc.target/i386/pr99881.c: New test.
2021-07-28 10:48:39 +08:00
GCC Administrator
af3f12e6e8 Daily bump. 2021-07-28 00:16:25 +00:00
Bill Schmidt
7590016ba9 rs6000: Write static initializations for overload tables
2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-gen-builtins.c (write_ovld_static_init): New
	function.
	(write_init_file): Call write_ovld_static_init.
2021-07-27 18:55:33 -04:00
Bill Schmidt
bb4d8febb3 rs6000: Write static initializations for built-in table
2021-07-26  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-gen-builtins.c (write_bif_static_init): New
	function.
	(write_init_file): Call write_bif_static_init.
2021-07-27 18:55:25 -04:00
Bill Schmidt
5b58057b6e rs6000: Write output to the builtins init file, part 3 of 3
2021-07-27  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-gen-builtins.c (typemap): New struct.
	(TYPE_MAP_SIZE): New macro.
	(type_map): New initialized variable.
	(typemap_cmp): New function.
	(write_type_node): Likewise.
	(write_fntype_init): Implement.
2021-07-27 18:55:14 -04:00
Martin Sebor
6aacd901b8 Let -Wuninitialized assume built-ins don't change const arguments [PR101584].
PR tree-optimization/101584 - missing -Wuninitialized with an allocated object after a built-in call

gcc/ChangeLog:

	PR tree-optimization/101584
	* tree-ssa-uninit.c (builtin_call_nomodifying_p): New function.
	(check_defs): Call it.

gcc/testsuite/ChangeLog:

	PR tree-optimization/101584
	* gcc.dg/uninit-38.c: Remove assertions.
	* gcc.dg/uninit-41.c: New test.
2021-07-27 16:02:54 -06:00
Jonathan Wakely
9360d6cd17 libstdc++: Simplify std::optional::value()
The structure of these functions likely dates from the time before G++
fully supported C++14 extended constexpr, so that the throw expression
had to be the operand of a conditional expression. That is not true now,
so we can use a more straightforward version of the code.

We can also simplify the declaration of __throw_bad_optional_access by
using the C++11-style [[noreturn]] attribute so that a separate
declaration isn't needed.

libstdc++-v3/ChangeLog:

	* include/experimental/optional (__throw_bad_optional_access):
	Replace GNU attribute with C++11 attribute.
	(optional::value, optional::value_or): Use if statements
	instead of conditional expressions.
	* include/std/optional (__throw_bad_optional_access)
	(optional::value, optional::value_or): Likewise.
2021-07-27 21:36:01 +01:00
Jonathan Wakely
b7195fb01f testsuite: Add missing C++ includes to tests [PR101646]
These tests stopped working after some libstdc++ refactoring, because
they aren't including what they use.

gcc/testsuite/ChangeLog:

	PR testsuite/101646
	* g++.dg/coroutines/pr99047.C:
	* g++.dg/pr71655.C:
2021-07-27 21:36:01 +01:00
Martin Sebor
a0f9a5dcc3 Use OEP_DECL_NAME when comparing VLA bounds [PR101585].
Resolves:
PR c/101585 - Bad interaction of -fsanitize=undefined and -Wvla-parameters

gcc/c-family:

	PR c/101585
	* c-warn.c (warn_parm_ptrarray_mismatch): Use OEP_DECL_NAME.

gcc/testsuite:
	PR c/101585
	* gcc.dg/Wvla-parameter-13.c: New test.
2021-07-27 13:51:55 -06:00
Ulrich Drepper
7123ae2455 Implement OpenMP 5.1 section 3.15: omp_display_env
This is a new interface which is easily implemented using the
already existing code for the handling of the OMP_DISPLAY_ENV
environment variable.

libgomp/
	* env.c (wait_policy, stacksize): New static variables,
	move out of handle_omp_display_env.
	(omp_display_env): New function.  The meat of the old
	handle_omp_display_env function.
	(handle_omp_display_env): Change to not take parameters
	and instead use the global variables.  Only perform
	parsing, defer to omp_display_env for the implementation.
	(initialize_env): Remove local variables wait_policy and
	stacksize.  Don't pass parameters to handle_omp_display_env.
	* fortran.c: Add ialias_redirect for omp_display_env.
	(omp_display_env_, omp_display_env_8_): New functions.
	* libgomp.map (OMP_5.1): New version.  Add omp_display_env,
	omp_display_env_, and omp_display_env_8_.
	* omp.h.in: Declare omp_display_env.
	* omp_lib.f90.in: Likewise.
	* omp_lib.h.in: Likewise.
2021-07-27 21:08:41 +02:00
Jeff Law
0853f392a2 Fix argument to pthread_join
gcc/testsuite
	* g++.dg/gcov/gcov-threads-1.C: Fix argument to pthread_join.
2021-07-27 14:15:07 -04:00
Aldy Hernandez
573e20aaca Abstract out (forward) jump threader state handling.
The *forward* jump threader has multiple places where it pushes and
pops state, and where it sets context up for the jump threading
simplifier callback.  Not only are the idioms repetitive, but the only
reason for passing const_and_copies, avail_exprs_stack, and the evrp
engine around are so we can set up context.

As part of my jump threading work, I will divorce the evrp engine from
the DOM jump threader, replacing it with a subset of the path solver I
have just contributed.  Since this will entail passing even more
context around, I've abstracted out the state handling so it can be
passed around in one object.  This cleans up the code, and also makes
it trivial to set up context with another engine in the future.

FWIW, I've used these cleanups and the path solver in a POC to improve
DOM's threaded edges by an additional 5%, and the overall threading
opportunities in the compiler by 1%.  This is in addition to the gains
I have documented in the backwards threader rewrite.

There are no functional changes with this patch.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-dom.c (dom_jump_threader_simplifier):
	Put avail_exprs_stack in the class, instead of passing it to
	jump_threader_simplifier.
	(dom_jump_threader_simplifier::simplify): Add state argument.
	(dom_opt_dom_walker): Add state.
	(pass_dominator::execute): Pass state to threader.
	(dom_opt_dom_walker::before_dom_children): Use state.
	* tree-ssa-threadedge.c (jump_threader::jump_threader): Replace
	arguments by state.
	(jump_threader::record_temporary_equivalences_from_phis):
	Register equivalences through the state variable.
	(jump_threader::record_temporary_equivalences_from_stmts_at_dest):
	Record ranges in a statement through the state variable.
	(jump_threader::simplify_control_stmt_condition): Pass state to
	simplify.
	(jump_threader::simplify_control_stmt_condition_1): Same.
	(jump_threader::thread_around_empty_blocks): Remove obsolete
	comment.
	(jump_threader::thread_through_normal_block): Record equivalences
	on edge through the state variable.
	(jump_threader::thread_across_edge): Abstract state pushing.
	(jt_state::jt_state): New.
	(jt_state::push): New.
	(jt_state::pop): New.
	(jt_state::register_equiv): New.
	(jt_state::record_ranges_from_stmt): New.
	(jt_state::register_equivs_on_edge): New.
	(jump_threader_simplifier::jump_threader_simplifier): Move from
	header.
	(jump_threader_simplifier::simplify): Add state argument.
	* tree-ssa-threadedge.h (class jt_state): New.
	(class jump_threader): Add state to constructor.
	(class jump_threader_simplifier): Add state to simplify.  Remove
	avail_exprs_stack from class.
	* tree-vrp.c (vrp_jump_threader_simplifier::simplify): Add state
	argument.
	(vrp_jump_threader::vrp_jump_threader): Add state.
	(vrp_jump_threader::~vrp_jump_threader): Cleanup state.
2021-07-27 17:58:14 +02:00
Marek Polacek
bee2f80b90 c++: Reject ordered comparison of null pointers [PR99701]
When implementing DR 1512 in r11-467 I neglected to reject ordered
comparison of two null pointers, like nullptr < nullptr.  This patch
fixes that omission.

	DR 1512
	PR c++/99701

gcc/cp/ChangeLog:

	* cp-gimplify.c (cp_fold): Remove {LE,LT,GE,GT_EXPR} from
	a switch.
	* typeck.c (cp_build_binary_op): Reject ordered comparison
	of two null pointers.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/nullptr11.C: Remove invalid tests.
	* g++.dg/cpp0x/nullptr46.C: Add dg-error.
	* g++.dg/cpp2a/spaceship-err7.C: New test.
	* g++.dg/expr/ptr-comp4.C: New test.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
	Move a line...
	* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
	...here.  New test.
2021-07-27 11:38:59 -04:00
Jonathan Wakely
7ffba77d01 libstdc++: Adjust whitespace in <bits/cow_string.h>
libstdc++-v3/ChangeLog:

	* include/bits/cow_string.h: Consistently use tab for
	indentation.
2021-07-27 12:13:42 +01:00
Jonathan Wakely
7b527614dd libstdc++: Move COW string definitions to separate header
This moves the definitions of the COW string to a separate file, so that
they don't need to be preprocessed for the common case. We could also
move the SSO string definitions to a new file, so that they don't need
to be preprocessed for the old ABI case, but that would require more
shovel work because there are some parts of <bits/basic_string.h> and
<bits/basic_string.tcc> that are common to both definitions.

libstdc++-v3/ChangeLog:

	* include/Makefile.am: Add new header.
	* include/Makefile.in: Regenerate.
	* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
	(basic_string): Move definition of Copy-on-Write string to
	new file.
	* include/bits/basic_string.tcc: Likewise.
	* include/bits/cow_string.h: New file.
2021-07-27 12:04:18 +01:00
Jonathan Wakely
16158c9649 libstdc++: Remove unnecessary uses of <utility>
The <algorithm> header includes <utility>, with a comment referring to
UK-300, a National Body comment on the C++11 draft. That comment
proposed to move std::swap to <utility> and then require <algorithm> to
include <utility>. The comment was rejected, so we do not need to
implement the suggestion. For backwards compatibility with C++03 we do
want <algorithm> to define std::swap, but it does so anyway via
<bits/move.h>. We don't need the whole of <utility> to do that.

A few other headers that need std::swap can include <bits/move.h> to
get it, instead of <utility>.

There are several headers that include <utility> to get std::pair, but
they can use <bits/stl_pair.h> to get it without also including the
rel_ops namespace and other contents of <utility>.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/std/algorithm: Do not include <utility>.
	* include/std/functional: Likewise.
	* include/std/regex: Include <bits/stl_pair.h> instead of
	<utility>.
	* include/debug/map.h: Likewise.
	* include/debug/multimap.h: Likewise.
	* include/debug/multiset.h: Likewise.
	* include/debug/set.h: Likewise.
	* include/debug/vector: Likewise.
	* include/bits/fs_path.h: Likewise.
	* include/bits/unique_ptr.h: Do not include <utility>.
	* include/experimental/any: Likewise.
	* include/experimental/executor: Likewise.
	* include/experimental/memory: Likewise.
	* include/experimental/optional: Likewise.
	* include/experimental/socket: Use __exchange instead
	of std::exchange.
	* src/filesystem/ops-common.h: Likewise.
	* testsuite/20_util/default_delete/48631_neg.cc: Adjust expected
	errors to not use a hardcoded line number.
	* testsuite/20_util/default_delete/void_neg.cc: Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/constrained.cc:
	Include <utility> for std::as_const.
	* testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constrained.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_move/constrained.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constrained.cc:
	Likewise.
	* testsuite/23_containers/vector/cons/destructible_debug_neg.cc:
	Adjust dg-error line number.
2021-07-27 12:04:18 +01:00
Jonathan Wakely
261d5a4a45 libstdc++: Reduce header dependencies on <array> and <utility>
This refactoring reduces the memory usage and compilation time to parse
a number of headers that depend on std::pair, std::tuple or std::array.
Previously the headers for these class templates were all intertwined,
due to the common dependency on std::tuple_size, std::tuple_element and
their std::get overloads. This decouples the headers by moving some
parts of <utility> into a new <bits/utility.h> header. This means that
<array> and <tuple> no longer need to include the whole of <utility>,
and <tuple> no longer needs to include <array>.

This decoupling benefits headers such as <thread> and <scoped_allocator>
which only need std::tuple, and so no longer have to parse std::array.

Some other headers such as <any>, <optional> and <variant> no longer
need to include <utility> just for the std::in_place tag types, so
do not have to parse the std::pair definitions.

Removing direct uses of <utility> also means that the std::rel_ops
namespace is not transitively declared by other headers.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/Makefile.am: Add bits/utility.h header.
	* include/Makefile.in: Regenerate.
	* include/bits/utility.h: New file.
	* include/std/utility (tuple_size, tuple_element): Move
	to new header.
	* include/std/type_traits (__is_tuple_like_impl<tuple<T...>>):
	Move to <tuple>.
	(_Index_tuple, _Build_index_tuple, integer_sequence): Likewise.
	(in_place_t, in_place_index_t, in_place_type_t): Likewise.
	* include/bits/ranges_util.h: Include new header instead of
	<utility>.
	* include/bits/stl_pair.h (tuple_size, tuple_element): Move
	partial specializations for std::pair here.
	(get): Move overloads for std::pair here.
	* include/std/any: Include new header instead of <utility>.
	* include/std/array: Likewise.
	* include/std/memory_resource: Likewise.
	* include/std/optional: Likewise.
	* include/std/variant: Likewise.
	* include/std/tuple: Likewise.
	(__is_tuple_like_impl<tuple<T...>>): Move here.
	(get) Declare overloads for std::array.
	* include/std/version (__cpp_lib_tuples_by_type): Change type
	to long.
	* testsuite/20_util/optional/84601.cc: Include <utility>.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill/constrained.cc:
	Likewise.
	* testsuite/23_containers/array/tuple_interface/get_neg.cc:
	Adjust dg-error line numbers.
	* testsuite/std/ranges/access/cbegin.cc: Include <utility>.
	* testsuite/std/ranges/access/cend.cc: Likewise.
	* testsuite/std/ranges/access/end.cc: Likewise.
	* testsuite/std/ranges/single_view.cc: Likewise.
2021-07-27 12:04:18 +01:00
Aldy Hernandez
fcc7c6369f Implement basic block path solver.
This is is the main basic block path solver for use in the ranger-based
backwards threader.  Given a path of BBs, the class can solve the final
conditional or any SSA name used in calculating the final conditional.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add gimple-range-path.o.
	* gimple-range-path.cc: New file.
	* gimple-range-path.h: New file.
2021-07-27 12:01:37 +02:00
Jonathan Wright
3bc9db6a98 simplify-rtx: Push sign/zero-extension inside vec_duplicate
As a general principle, vec_duplicate should be as close to the root
of an expression as possible. Where unary operations have
vec_duplicate as an argument, these operations should be pushed
inside the vec_duplicate.

This patch modifies unary operation simplification to push
sign/zero-extension of a scalar inside vec_duplicate.

This patch also updates all RTL patterns in aarch64-simd.md to use
the new canonical form.

gcc/ChangeLog:

2021-07-19  Jonathan Wright  <jonathan.wright@arm.com>

	* config/aarch64/aarch64-simd.md: Push sign/zero-extension
	inside vec_duplicate for all patterns.
	* simplify-rtx.c (simplify_context::simplify_unary_operation_1):
	Push sign/zero-extension inside vec_duplicate.
2021-07-27 10:42:33 +01:00
Thomas Schwinge
d88a695158 Don't use libgomp 'cbuf' buffering with OpenACC 'async'
The host data might not be computed yet (by an earlier asynchronous compute
region, for example.

	libgomp/
	* target.c (gomp_coalesce_buf_add): Update comment.
	(gomp_copy_host2dev, gomp_map_vars_internal): Don't expect to see
	'aq && cbuf'.
	(gomp_map_vars_internal): Only 'if (!aq)', do
	'gomp_coalesce_buf_add'.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Remove
	XFAIL.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
2021-07-27 11:16:37 +02:00
Julian Brown
9c41f5b9cd Fix OpenACC "ephemeral" asynchronous host-to-device copies
This patch fixes several places in libgomp/target.c where "ephemeral" data
(on the stack or in temporary heap locations) may be used as the source of
an asynchronous host-to-device copy that may not complete before the host
data disappears.

An existing, but flawed, workaround for this problem in the AMD GCN
libgomp offloading plugin is currently present on mainline, and was
posted for the og9 branch here:

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-08/msg00901.html

and previous versions of this patch were posted here (for mainline/og9):

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01482.html
  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01026.html

libgomp/
	* libgomp.h (gomp_copy_host2dev): Update prototype.
	* oacc-mem.c (memcpy_tofrom_device, update_dev_host): Add new
	argument to gomp_copy_host2dev (false).
	* plugin/plugin-gcn.c (struct copy_data): Remove free_src field.
	(copy_data): Don't free src.
	(queue_push_copy): Remove free_src handling.
	(GOMP_OFFLOAD_dev2dev): Update call to queue_push_copy.
	(GOMP_OFFLOAD_openacc_async_host2dev): Remove source-data
	snapshotting.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call to
	queue_push_copy.
	* target.c (goacc_device_copy_async): Add SRCADDR_ORIG parameter.
	(gomp_copy_host2dev): Add EPHEMERAL parameter.  Snapshot source
	data when true, and set up deferred freeing of temporary buffer.
	(gomp_copy_dev2host): Update call to goacc_device_copy_async.
	(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer)
	(gomp_detach_pointer, gomp_map_vars_internal, gomp_update): Update
	calls to gomp_copy_host2dev with appropriate ephemeral argument.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: Remove
	XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-07-27 11:16:27 +02:00
Thomas Schwinge
88c40c36db Add 'libgomp.oacc-c-c++-common/async-data-1-{1,2}.c'
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: New file.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Likewise.

Co-Authored-By: Tom de Vries <tom@codesourcery.com>
2021-07-27 11:16:26 +02:00
Thomas Schwinge
29ddaf43f7 [OpenACC] Clarify sequencing of 'async' data copying vs. profiling events in 'libgomp.oacc-c-c++-common/acc_prof-{init,parallel}-1.c'
... as noticed with GCN offloading.

Fix-up for r271346 (commit 5fae049dc2)
"OpenACC Profiling Interface (incomplete)".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Clarify
	sequencing of 'async' data copying vs. profiling events.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Likewise.
2021-07-27 11:16:25 +02:00
Thomas Schwinge
599e275d7e Fix OpenACC 'async'/'wait' issues in 'libgomp.oacc-c-c++-common/lib-{94,95}.c', 'libgomp.oacc-fortran/lib-16{,-2}.f90'
Fix-up for r265842 (commit 58168bbf6f)
"[OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/lib-94.c: Fix OpenACC
	'async'/'wait' issue.
	* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
2021-07-27 11:16:24 +02:00
Richard Biener
66030d68a7 tree-optimization/101573 - improve uninit warning at -O0
We can improve uninit warnings from the early pass by looking
at PHI arguments on fallthru edges that are uninitialized and
have uses that are before a possible loop exit.  This catches
some cases earlier that we'd only warn in a more confusing
way after early inlining as seen by testcase adjustments.

It introduces

FAIL: gcc.dg/uninit-23.c (test for excess errors)

where we additionally warn

gcc.dg/uninit-23.c:21:13: warning: 't4' is used uninitialized [-Wuninitialized]

which I think is OK even if it's not obvious that the new
warning is an improvement when you look at the obvious source.

Somehow for all cases I never get the `'foo' was declared here`
notes, I didn't dig why that happens but it's odd.

2021-07-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101573
	* tree-ssa-uninit.c (warn_uninit_phi_uses): New function
	looking at uninitialized PHI arg defs in some constrained cases.
	(warn_uninitialized_vars): Call it.
	(execute_early_warn_uninitialized): Calculate dominators.

	* gcc.dg/uninit-pr101573.c: New testcase.
	* gcc.dg/uninit-15-O0.c: Adjust.
	* gcc.dg/uninit-15.c: Likewise.
	* gcc.dg/uninit-23.c: Likewise.
	* c-c++-common/uninit-17.c: Likewise.
2021-07-27 10:46:22 +02:00
Richard Biener
c8ce54c6e6 tree-optimization/39821 - fix cost classification for widening arith
This adjusts the vectorizer to cost vector_stmt for widening
arithmetic instead of vec_promote_demote in the line of telling
the target that stmt_info->stmt is the meaningful piece we cost.

2021-07-27  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/39821
	* tree-vect-stmts.c (vect_model_promotion_demotion_cost): Use
	vector_stmt for widening arithmetic.
	(vectorizable_conversion): Adjust.
2021-07-27 10:41:43 +02:00