mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-12-19 00:55:13 +08:00

Author	SHA1	Message	Date
François Dumont	e7fac1e1a5	libstdc++: [_GLIBCXX_DEBUG] Enhance std::erase_if for vector/deque libstdc++-v3/ChangeLog: * include/std/deque (erase_if): Use _GLIBCXX_STD_C container reference and __niter_wrap to limit _GLIBCXX_DEBUG mode impact. * include/std/vector (erase_if): Likewise.	2021-12-08 19:09:47 +01:00
Hans-Peter Nilsson	60147c2b7d	testsuite: Use attribute "noipa" in sibcall tests ...instead of attribute "noinline". For cris-elf, testsuite/gcc.dg/sibcall-3.c and sibcall-4.c "XPASS", without sibcalls being implemented. On inspection, recurser_void2 is set to be an assembly-level alias for recurser_void1 as in ".set _recurser_void2,_recurser_void1" for both these cases. IOW, those "__attribute__((noinline))" should be "__attribute__((noipa))". The astute reader will notice that I also adjust test-cases where self-recursion should occur: as mentioned in sibcall-1.c "self-recursion tail calls are optimized for all targets, regardless of presence of sibcall patterns". But, that optimization happens even with "noipa", as observed by the test-cases still passing for cris-elf after patching. Being of a small mind, I like consistency, but not all the time, so there's hope. testsuite: * gcc.dg/sibcall-1.c, gcc.dg/sibcall-10.c, gcc.dg/sibcall-2.c, gcc.dg/sibcall-3.c, gcc.dg/sibcall-4.c, gcc.dg/sibcall-9.c: Replace attribute "noinline" with "noipa".	2021-12-08 18:50:54 +01:00
Chung-Lin Tang	6c0399378e	OpenMP 5.0: Remove array section base-pointer mapping semantics and other front-end adjustments This patch implements three pieces of functionality: (1) Adjust array section mapping to have standards conforming behavior, mapping array sections should NOT also map the base-pointer: struct S { int ptr; ... }; struct S s; Instead of generating this during gimplify: map(to:_1 [len: 400]) map(attach:s.ptr [bias: 0]) Now, adjust to: (i.e. do not map the base-pointer together. The attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: (A small Fortran front-end patch to trans-openmp.c:gfc_trans_omp_array_section is also included, which removes generation of a GOMP_MAP_ALWAYS_POINTER for array types, which appears incorrect and causes a regression in libgomp.fortranlibgomp.fortran/struct-elem-map-1.f90) (2) Related to the first item above, are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. These changes are enabled for both OpenACC and OpenMP. gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. (cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree offsetp' parameter, accomodate case where 'offset' return of get_inner_reference is non-NULL. (is_or_contains_p): Further robustify conditions. (omp_target_reorder_clauses): In alloc/to/from sorting phase, also move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting phase where we make sure pointers with an attach/detach map are ordered correctly. (gimplify_scan_omp_clauses): Add modifications to avoid creating GOMP_MAP_STRUCT and associated alloc map for attach/detach maps. gcc/testsuite/ChangeLog: c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase. * c-c++-common/gomp/target-enter-data-1.c: New testcase. * c-c++-common/gomp/target-implicit-map-2.c: New testcase. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Make sure attached pointer is not overwritten during cross-host/device copying. (gomp_update): Likewise. (gomp_exit_data): Likewise. * testsuite/libgomp.c++/target-11.C: Adjust testcase. * testsuite/libgomp.c++/target-12.C: Likewise. * testsuite/libgomp.c++/target-15.C: Likewise. * testsuite/libgomp.c++/target-16.C: Likewise. * testsuite/libgomp.c++/target-17.C: Likewise. * testsuite/libgomp.c++/target-21.C: Likewise. * testsuite/libgomp.c++/target-23.C: Likewise. * testsuite/libgomp.c/target-23.c: Likewise. * testsuite/libgomp.c/target-29.c: Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.	2021-12-09 00:01:10 +08:00
Roger Sayle	6b49d50a27	nvptx: Use cvt to perform sign-extension of truncation This patch introduces some new define_insn rules to the nvptx backend, to perform sign-extension of a truncation (from and to the same mode), using a single cvt instruction. As an example, the following function int foo(int x) { return (char)x; } with -O2 currently generates: mov.u32 %r24, %ar0; mov.u32 %r26, %r24; cvt.s32.s8 %value, %r26; and with this patch, now generates: mov.u32 %r24, %ar0; cvt.s32.s8 %value, %r24; This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu with a top-level "make" (including newlib) and a "make check" with no new regressions. gcc/ChangeLog: * config/nvptx/nvptx.md (extend_trunc_<mode>2_qi, extend_trunc_<mode>2_hi, extend_trunc_di2_si): New insns. Use cvt to perform sign-extension of truncation in one step. gcc/testsuite/ChangeLog: gcc.target/nvptx/exttrunc-2.c: New test case. * gcc.target/nvptx/exttrunc-3.c: New test case. * gcc.target/nvptx/exttrunc-4.c: New test case. * gcc.target/nvptx/exttrunc-5.c: New test case. * gcc.target/nvptx/exttrunc-6.c: New test case.	2021-12-08 16:42:12 +01:00
Roger Sayle	d3d44a00e5	nvptx: Add test-case gcc.target/nvptx/exttrunc-1.c Add new test-case converting short to char and back to short. Tested on nvptx. gcc/testsuite/ChangeLog: * gcc.target/nvptx/exttrunc-1.c: New test case.	2021-12-08 16:42:04 +01:00
Chung-Lin Tang	0ab29cf0bb	openmp: Improve OpenMP target support for C++ (PR92120) This patch implements several C++ specific mapping capabilities introduced for OpenMP 5.0, including implicit mapping of this[:1] for non-static member functions, zero-length array section mapping of pointer-typed members, lambda captured variable access in target regions, and use of lambda objects inside target regions. Several adjustments to the C/C++ front-ends to allow more member-access syntax as valid is also included. PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (finish_omp_target_clauses): Likewise. * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. (cp_parser_omp_target): Factor out code, adjust into calls to new function finish_omp_target. * pt.c (tsubst_expr): Add call to finish_omp_target_clauses for OMP_TARGET case. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. Remove case of rejecting 'this' when not in declare simd. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. Handle 'A->member' case in map clauses. Remove case of rejecting 'this' when not in declare simd. (struct omp_target_walk_data): New struct for walking over target-directive tree body. (finish_omp_target_clauses_r): New function for tree walk. (finish_omp_target_clauses): New function. (finish_omp_target): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in call to c_parser_omp_variable_list to 'true'. * c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in array base handling. (c_finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> . Adjust struct map handling to handle cases of A and A->B expressions. Under !DECL_P case of GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to struct_deref_set for map(ptr_to_struct) cases. Add MEM_REF case when handling component_ref_p case. Add unshare_expr and gimplification when created GOMP_MAP_STRUCT is not a DECL. Add code to add firstprivate pointer for pointer-to-struct case. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/gomp/target-3.c: New testcase. * g++.dg/gomp/target-3.C: New testcase. * g++.dg/gomp/target-lambda-1.C: New testcase. * g++.dg/gomp/target-lambda-2.C: New testcase. * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. * g++.dg/gomp/target-this-5.C: New testcase. * g++.dg/gomp/this-2.C: Adjust testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. (gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-23.C: New testcase. * testsuite/libgomp.c++/target-lambda-1.C: New testcase. * testsuite/libgomp.c++/target-lambda-2.C: New testcase. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase. * testsuite/libgomp.c++/target-this-5.C: New testcase.	2021-12-08 22:29:06 +08:00
Maged Michael	dbf8bd3c2f	libstdc++: Skip atomic instructions in shared_ptr when both counts are 1 This rewrites _Sp_counted_base::_M_release to skip the two atomic instructions that decrement each of the use count and the weak count when both are 1. Benefits: Save the cost of the last atomic decrements of each of the use count and the weak count in _Sp_counted_base. Atomic instructions are significantly slower than regular loads and stores across major architectures. How current code works: _M_release() atomically decrements the use count, checks if it was 1, if so calls _M_dispose(), atomically decrements the weak count, checks if it was 1, and if so calls _M_destroy(). How the proposed algorithm works: _M_release() loads both use count and weak count together atomically (assuming suitable alignment, discussed later), checks if the value corresponds to a 0x1 value in the individual count members, and if so calls _M_dispose() and _M_destroy(). Otherwise, it follows the original algorithm. Why it works: When the current thread executing _M_release() finds each of the counts is equal to 1, then no other threads could possibly hold use or weak references to this control block. That is, no other threads could possibly access the counts or the protected object. There are two crucial high-level issues that I'd like to point out first: - Atomicity of access to the counts together - Proper alignment of the counts together The patch is intended to apply the proposed algorithm only to the case of 64-bit mode, 4-byte counts, and 8-byte aligned _Sp_counted_base. Atomicity - The proposed algorithm depends on the mutual atomicity among 8-byte atomic operations and 4-byte atomic operations on each of the 4-byte halves of the 8-byte aligned 8-byte block. - The standard does not guarantee atomicity of 8-byte operations on a pair of 8-byte aligned 4-byte objects. - To my knowledge this works in practice on systems that guarantee native implementation of 4-byte and 8-byte atomic operations. - __atomic_always_lock_free is used to check for native atomic operations. Alignment - _Sp_counted_base is an internal base class, with a virtual destructor, so it has a vptr at the beginning of the class, and will be aligned to alignof(void) i.e. 8 bytes. - The first members of the class are the 4-byte use count and 4-byte weak count, which will occupy 8 contiguous bytes immediately after the vptr, i.e. they form an 8-byte aligned 8 byte range. Other points: - The proposed algorithm can interact correctly with the current algorithm. That is, multiple threads using different versions of the code with and without the patch operating on the same objects should always interact correctly. The intent for the patch is to be ABI compatible with the current implementation. - The proposed patch involves a performance trade-off between saving the costs of atomic instructions when the counts are both 1 vs adding the cost of loading the 8-byte combined counts and comparison with {0x1, 0x1}. - I noticed a big difference between the code generated by GCC vs LLVM. GCC seems to generate noticeably more code and what seems to be redundant null checks and branches. - The patch has been in use (built using LLVM) in a large environment for many months. The performance gains outweigh the losses (roughly 10 to 1) across a large variety of workloads. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> Co-authored-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: include/bits/c++config (_GLIBCXX_TSAN): Define macro indicating that TSan is in use. * include/bits/shared_ptr_base.h (_Sp_counted_base::_M_release): Replace definition in primary template with explicit specializations for _S_mutex and _S_atomic policies. (_Sp_counted_base<_S_mutex>::_M_release): New specialization. (_Sp_counted_base<_S_atomic>::_M_release): New specialization, using a single atomic load to access both reference counts at once. (_Sp_counted_base::_M_release_last_use): New member function.	2021-12-08 11:39:34 +00:00
Andrew Stubbs	13b6c7639c	dwarf: Multi-register CFI address support. Add support for architectures such as AMD GCN, in which the pointer size is larger than the register size. This allows the CFI information to include multi-register locations for the stack pointer, frame pointer, and return address. This patch was originally posted by Andrew Stubbs in https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552873.html It has now been re-worked according to the review comments. It does not use DW_OP_piece or DW_OP_LLVM_piece_end. Instead it uses DW_OP_bregx/DW_OP_shl/DW_OP_bregx/DW_OP_plus to build the CFA from multiple consecutive registers. Here is how .debug_frame looks before and after this patch: $ cat factorial.c int factorial(int n) { if (n == 0) return 1; return n * factorial (n - 1); } $ amdgcn-amdhsa-gcc -g factorial.c -O0 -c -o fac.o $ llvm-dwarfdump -debug-frame fac.o * without this patch (edited for brevity)* 00000000 00000014 ffffffff CIE DW_CFA_def_cfa: reg48 +0 DW_CFA_register: reg16 reg50 00000018 0000002c 00000000 FDE cie=00000000 pc=00000000...000001ac DW_CFA_advance_loc4: 96 DW_CFA_offset: reg46 0 DW_CFA_offset: reg47 4 DW_CFA_offset: reg50 8 DW_CFA_offset: reg51 12 DW_CFA_offset: reg16 8 DW_CFA_advance_loc4: 4 DW_CFA_def_cfa_sf: reg46 -16 * with this patch (edited for brevity)* 00000000 00000024 ffffffff CIE DW_CFA_def_cfa_expression: DW_OP_bregx SGPR49+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR48+0, DW_OP_plus DW_CFA_expression: reg16 DW_OP_bregx SGPR51+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR50+0, DW_OP_plus 00000028 0000003c 00000000 FDE cie=00000000 pc=00000000...000001ac DW_CFA_advance_loc4: 96 DW_CFA_offset: reg46 0 DW_CFA_offset: reg47 4 DW_CFA_offset: reg50 8 DW_CFA_offset: reg51 12 DW_CFA_offset: reg16 8 DW_CFA_advance_loc4: 4 DW_CFA_def_cfa_expression: DW_OP_bregx SGPR47+0, DW_OP_const1u 0x20, DW_OP_shl, DW_OP_bregx SGPR46+0, DW_OP_plus, DW_OP_lit16, DW_OP_minus gcc/ChangeLog: * dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg. (dw_frame_pointer_regnum): Likewise. (new_cfi_row): Use set_by_dwreg. (get_cfa_from_loc_descr): Use set_by_dwreg. Support register spans. handle DW_OP_bregx with DW_OP_breg{0-31}. Support DW_OP_lit, DW_OP_const, DW_OP_minus, DW_OP_shl and DW_OP_plus. (lookup_cfa_1): Use set_by_dwreg. (def_cfa_0): Update for cfa_reg and support register spans. (reg_save): Change sreg parameter to struct cfa_reg. Support register spans. (dwf_cfa_reg): New function. (dwarf2out_flush_queued_reg_saves): Use dwf_cfa_reg instead of dwf_regno. (dwarf2out_frame_debug_def_cfa): Likewise. (dwarf2out_frame_debug_adjust_cfa): Likewise. (dwarf2out_frame_debug_cfa_offset): Likewise. Update reg_save usage. (dwarf2out_frame_debug_cfa_register): Likewise. (dwarf2out_frame_debug_expr): Likewise. (create_pseudo_cfg): Use set_by_dwreg. (initial_return_save): Use set_by_dwreg and dwf_cfa_reg, (create_cie_data): Use dwf_cfa_reg. (execute_dwarf2_frame): Use dwf_cfa_reg. (dump_cfi_row): Use set_by_dwreg. * dwarf2out.c (build_span_loc, build_breg_loc): New function. (build_cfa_loc): Support register spans. (build_cfa_aligned_loc): Update cfa_reg usage. (convert_cfa_to_fb_loc_list): Use set_by_dwreg. * dwarf2out.h (struct cfa_reg): New type. (struct dw_cfa_location): Use struct cfa_reg. (build_span_loc): New prototype. co-authored-By: Hafiz Abid Qadeer <abidh@codesourcery.com>	2021-12-08 09:47:16 +00:00
Haochen Jiang	691f05c219	Add combine splitter to transform vpcmpeqd/vpxor/vblendvps to vblendvps for ~op0 gcc/ChangeLog: PR target/100738 * config/i386/sse.md (<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_not_ltint): Add new define_insn_and_split. gcc/testsuite/ChangeLog: PR target/100738 g++.target/i386/pr100738-1.C: New test.	2021-12-08 14:12:07 +08:00
Alexandre Oliva	0485ce9128	[PR103149] detach values through mem only if general regs won't do When hardening compares or conditional branches, we perform redundant tests, and to prevent them from being optimized out, we use asm statements that preserve a value used in a compare, but in a way that the compiler can no longer assume it's the same value, so it can't optimize the redundant test away. We used to use +g, but that requires general regs or mem. You might think that, if a reg constraint can't be satisfied, the register allocator will fall back to memory, but that's not so: we decide on matching MEMs very early on, by using the same addressable operand on both input and output, and only if the constraint does not allow registers. If it does, we use gimple registers and then pseudos as inputs and outputs, and then inputs can be substituted by equivalent expressions, and then, if no register contraint fits (e.g. because that mode won't fit in general regs, or won't fit in regs at all), the register allocator will give up before even trying to allocate some temporary memory to unify input and output. This patch arranges for us to create and use the temporary stack slot if we can tell the mode requires memory, or won't otherwise fit in general regs, and thus to use +m for that asm. for gcc/ChangeLog PR middle-end/103149 * gimple-harden-conditionals.cc (detach_value): Use memory if general regs won't do. for gcc/testsuite/ChangeLog PR middle-end/103149 * gcc.target/aarch64/pr103149.c: New.	2021-12-07 22:14:02 -03:00
GCC Administrator	1f6b0003b6	Daily bump.	2021-12-08 00:16:23 +00:00
Harald Anlauf	9eec77c0df	Fortran: perform array subscript checks only for valid INTEGER bounds gcc/fortran/ChangeLog: PR fortran/103607 * frontend-passes.c (do_subscript): Ensure that array bounds are of type INTEGER before performing checks on array subscripts. gcc/testsuite/ChangeLog: PR fortran/103607 * gfortran.dg/pr103607.f90: New test.	2021-12-07 23:16:18 +01:00
Marek Polacek	cf2cd61dce	c++: Fix decltype-bitfield1.C on i?86 This test was failing on i?86 because of: warning: width of 'A::l' exceeds its type so change the type to 'long long' and make the test run only on arches where sizeof(long long) == 8 to avoid failing like this again. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype-bitfield1.C: Change a type to unsigned long long. Only run on longlong64 targets.	2021-12-07 16:08:20 -05:00
Peter Bergner	4394fd6726	testsuite: Fix check_effective_target_rop_ok [PR103556, PR103586] The new rop_ok effective target test doesn't correctly compute its expression result because a new line starts a new statement. Solution is to remove the new line. 2021-12-07 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ PR testsuite/103556 PR testsuite/103586 * lib/target-supports.exp (check_effective_target_rop_ok): Remove '\n'.	2021-12-07 14:43:30 -06:00
Harald Anlauf	652c287362	Fortran: catch failed simplification of bad stride expression gcc/fortran/ChangeLog: PR fortran/103588 * array.c (gfc_ref_dimen_size): Do not generate internal error on failed simplification of stride expression; just return failure. gcc/testsuite/ChangeLog: PR fortran/103588 * gfortran.dg/pr103588.f90: New test.	2021-12-07 18:46:52 +01:00
Harald Anlauf	f47662204d	Fortran: add check for type of upper bound in case range gcc/fortran/ChangeLog: PR fortran/103591 * match.c (match_case_selector): Check type of upper bound in case range. gcc/testsuite/ChangeLog: PR fortran/103591 * gfortran.dg/select_9.f90: New test.	2021-12-07 18:23:25 +01:00
Martin Liska	8e836af61b	Fix --help -Q output PR middle-end/103438 gcc/ChangeLog: * config/s390/s390.c (s390_valid_target_attribute_inner_p): Use new enum CLVC_INTEGER. * opt-functions.awk: Use new CLVC_INTEGER. * opts-common.c (set_option): Likewise. (option_enabled): Return -1,0,1 for CLVC_INTEGER. (get_option_state): Use new CLVC_INTEGER. (control_warning_option): Likewise. * opts.h (enum cl_var_type): Likewise.	2021-12-07 14:37:02 +01:00
Marek Polacek	3a2257e6b3	c++: Fix for decltype and bit-fields [PR95009] Here, decltype deduces the wrong type for certain expressions involving bit-fields. Unlike in C, in C++ bit-field width is explicitly not part of the type, so I think decltype should never deduce to 'int:N'. The problem isn't that we're not calling unlowered_expr_type--we are--it's that is_bitfield_expr_with_lowered_type only handles certain codes, but not others. For example, += works fine but ++ does not. This also fixes decltype-bitfield2.C where we were crashing (!), but unfortunately it does not fix 84516 or 70733 where the problem is likely a missing call to unlowered_expr_type. It occurs to me now that typeof likely has had the same issue, but this patch should fix that too. PR c++/95009 gcc/cp/ChangeLog: * typeck.c (is_bitfield_expr_with_lowered_type) <case MODIFY_EXPR>: Handle UNARY_PLUS_EXPR, NEGATE_EXPR, NON_LVALUE_EXPR, BIT_NOT_EXPR, PCREMENT_EXPR too. gcc/testsuite/ChangeLog: g++.dg/cpp0x/decltype-bitfield1.C: New test. * g++.dg/cpp0x/decltype-bitfield2.C: New test.	2021-12-07 08:26:25 -05:00
H.J. Lu	7ef68c37b3	x86: Check FUNCTION_DECL before calling cgraph_node::get gcc/ PR target/103594 * config/i386/i386.c (ix86_call_use_plt_p): Check FUNCTION_DECL before calling cgraph_node::get. gcc/testsuite/ PR target/103594 * gcc.dg/pr103594.c: New test.	2021-12-07 05:24:47 -08:00
Richard Biener	6e8a31275f	tree-optimization/103596 - fix missed propagation into switches may_propagate_copy unnecessarily restricts propagating non-abnormals into places that currently contain an abnormal SSA name but are not the PHI argument for an abnormal edge. This causes VN to not elide a CFG path that it assumes is elided, resulting in released SSA names in the IL. The fix is to enhance the may_propagate_copy API to specify the destination is _not_ a PHI argument. I chose to not update only the relevant caller in VN and the may_propagate_copy_into_stmt API at this point because this is a regression and needs backporting. 2021-12-07 Richard Biener <rguenther@suse.de> PR tree-optimization/103596 * tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt): Note we are not propagating into a PHI argument to may_propagate_copy. * tree-ssa-propagate.h (may_propagate_copy): Add argument specifying whether we propagate into a PHI arg. * tree-ssa-propagate.c (may_propagate_copy): Likewise. When not doing so we can replace an abnormal with something else. (may_propagate_into_stmt): Update may_propagate_copy calls. (replace_exp_1): Move propagation checking code to propagate_value and rename to ... (replace_exp): ... this and elide previous wrapper. (propagate_value): Perform checking with adjusted may_propagate_copy call and dispatch to replace_exp. * gcc.dg/torture/pr103596.c: New testcase.	2021-12-07 14:07:40 +01:00
Matthias Kretz	c93e704b9e	Fix hash_map::traverse overload The hash_map::traverse overload taking a non-const Value pointer breaks if the callback returns false. The other overload should behave the same. Signed-off-by: Matthias Kretz <m.kretz@gsi.de> gcc/ChangeLog: * hash-map.h (hash_map::traverse): Let both overloads behave the same. * predict.c (assert_is_empty): Return true, thus not changing behavior.	2021-12-07 13:11:47 +01:00
Tamar Christina	ba6bb287f0	Revert "libstdc++: Fix ctype changed after newlib update." Newlib has reverted the commit that caused us to require a workaround. As such we can now revert the workaround. This reverts commit `0e510ab534`. libstdc++-v3/ChangeLog: PR libstdc++/103305 * config/os/newlib/ctype_base.h (upper, lower, alpha, digit, xdigit, space, print, graph, cntrl, punct, alnum, blank): Revert.	2021-12-07 10:37:30 +00:00
YunQiang Su	30a08286e6	MIPS: R6: load/store can process unaligned address MIPS release 6 requires the lw/ld/sw/sd can work with unaligned address, while it can be implemented by full hardware or trap&emulate. Since it doesn't have to be fully done by hardware, we add a pair of options -m(no-)unaligned-access. Kernels may need them. gcc/ChangeLog: * config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS, STRICT_ALIGNMENT): R6 can unaligned access. * config/mips/mips.md (movmisalign<mode>): Likewise. * config/mips/mips.opt: add -m(no-)unaligned-access * doc/invoke.texi: Likewise. gcc/testsuite/ChangeLog: * gcc.target/mips/mips.exp: add unaligned-access * gcc.target/mips/unaligned-2.c: New test. * gcc.target/mips/unaligned-3.c: New test.	2021-12-07 10:01:23 +08:00
Eugene Rozenfeld	3d9e676793	Improve AutoFDO count propagation algorithm When a basic block A has been annotated with a count and it has only one successor (or predecessor) B, we can propagate the A's count to B. The algoritm without this change could leave B without an annotation if B had other unannotated predecessors (or successors). For example, in the test case I added, the loop header block was left unannotated, which prevented loop unrolling. gcc/ChangeLog: * auto-profile.c (afdo_propagate_edge): Improve count propagation algorithm. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/init-array.c: New test for unrolling inner loops.	2021-12-06 16:59:31 -08:00
GCC Administrator	3a580f967e	Daily bump.	2021-12-07 00:16:23 +00:00
David Malcolm	c9543403c1	analyzer: fix equivalence class state purging [PR103533] Whilst debugging state explosions seen when enabling taint detection with -fanalyzer (PR analyzer/103533), I noticed that constraint manager instances could contain stray, redundant constants, such as this instance: constraint_manager: equiv classes: ec0: {(int)0 == [m_constant]‘0’} ec1: {(size_t)4 == [m_constant]‘4’} constraints: where there are two equivalence classes, each just containing a constant, with no constraints using them. This patch makes constraint_manager::canonicalize more aggressive about purging state, handling the case of purging a redundant EC containing just a constant. gcc/analyzer/ChangeLog: PR analyzer/103533 * constraint-manager.cc (equiv_class::contains_non_constant_p): New. (constraint_manager::canonicalize): Call it when determining redundant ECs. (selftest::test_purging): New selftest. (selftest::run_constraint_manager_tests): Likewise. * constraint-manager.h (equiv_class::contains_non_constant_p): New decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-12-06 18:36:33 -05:00
Paul A. Clarke	325c6163a3	rs6000: Fix errant "vector" instead of "__vector" Fixes `85289ba36c`. 2021-12-06 Paul A. Clarke <pc@us.ibm.com> gcc PR target/103545 * config/rs6000/xmmintrin.h (_mm_movemask_ps): Replace "vector" with "__vector".	2021-12-06 16:17:16 -06:00
Navid Rahimi	7754fddd01	MAINTAINERS: Add myself to write after approval and DCO sections. * MAINTAINERS: Adding myself.	2021-12-06 13:57:54 -08:00
Jose E. Marchesi	bd0a61befc	bpf: mark/remove unused arguments and remove an unused function This patch does a little bit of cleanup by removing some unused arguments, or marking them as unused. It also removes the function ctfc_debuginfo_early_finish_p and the corresponding hook macro definition, which are not used by GCC. gcc/ * config/bpf/bpf.c (bpf_handle_preserve_access_index_attribute): Mark arguments `args' and flags' as unused. (bpf_core_newdecl): Remove unused local `newdecl'. (bpf_core_newdecl): Remove unused argument `loc'. (ctfc_debuginfo_early_finish_p): Remove unused function. (TARGET_CTFC_DEBUGINFO_EARLY_FINISH_P): Remove definition. (bpf_core_walk): Do not pass a location to bpf_core_newdecl.	2021-12-06 21:57:53 +01:00
Richard Sandiford	63c59f054a	ranger: Add shortcuts for single-successor blocks When compiling an optabs.ii at -O2 with a release-checking build, there were 6,643,575 calls to gimple_outgoing_range_stmt_p. 96.8% of them were for blocks with a single successor, which never have a control statement that generates new range info. This patch therefore adds a shortcut for that case. This gives a ~1% compile-time improvement for the test. I tried making the function inline (in the header) so that the single_succ_p didn't need to be repeated, but it seemed to make things slightly worse. gcc/ * gimple-range-edge.cc (gimple_outgoing_range::edge_range_p): Add a shortcut for blocks with single successors. * gimple-range-gori.cc (gori_map::calculate_gori): Likewise.	2021-12-06 18:29:31 +00:00
Richard Sandiford	d27b7e6987	ranger: Optimise irange_union When compiling an optabs.ii at -O2 with a release-checking build, the hottest function in the profile was irange_union. This patch tries to optimise it a bit. The specific changes are: - Use quick_push rather than safe_push, since the final number of entries is known in advance. - Avoid assigning wi::to_wide & co. to a temporary wide_int, such as in: wide_int val_j = wi::to_wide (res[j]); wi::to_wide returns a wide_int "view" of the in-place INTEGER_CST storage. Assigning the result to wide_int forces an unnecessary copy to temporary storage. This is one area where "auto" helps a lot. In the end though, it seemed more readable to inline the wi::to_s rather than use auto. - Use to_widest_int rather than to_wide_int. Both are functionally correct, but to_widest_int is more efficient, for three reasons: - to_wide returns a wide-int representation in which the most significant element might not be canonically sign-extended. This is because we want to allow the storage of an INTEGER_CST like 0x1U << 31 to be accessed directly with both a wide_int view (where only 32 bits matter) and a widest_int view (where many more bits matter, and where the 32 bits are zero-extended to match the unsigned type). However, operating on uncanonicalised wide_int forms is less efficient than operating on canonicalised forms. - to_widest_int has a constant rather than variable precision and there are never any redundant upper bits to worry about. - Using widest_int avoids the need for an overflow check, since there is enough precision to add 1 to any IL constant without wrap-around. This gives a ~2% compile-time speed up with the test above. I also tried adding a path for two single-pair ranges, but it wasn't a win. gcc/ value-range.cc (irange::irange_union): Use quick_push rather than safe_push. Use widest_int rather than wide_int. Avoid assigning wi::to_* results to wide*_int temporaries.	2021-12-06 18:29:30 +00:00
Andrew MacLeod	14dc5b71d7	Use dominators to reduce cache-flling. Before walking the CFG and filling all cache entries, check if the same information is available in a dominator. * gimple-range-cache.cc (ranger_cache::fill_block_cache): Check for a range from dominators before filling the cache. (ranger_cache::range_from_dom): New. * gimple-range-cache.h (ranger_cache::range_from_dom): Add prototype.	2021-12-06 13:27:10 -05:00
Andrew MacLeod	ed4a5f571b	Add BB option for outgoing_edge_range_p and may_reocmpute_p. There are times we only need to know if any edge from a block can calculate a range. * gimple-range-gori.h (class gori_compute):: Add prototypes. * gimple-range-gori.cc (gori_compute::has_edge_range_p): Add alternate API for basic block. Call for edge alterantive. (gori_compute::may_recompute_p): Ditto.	2021-12-06 13:14:53 -05:00
H.J. Lu	2a20407bac	libsanitizer: Update LOCAL_PATCHES Add commit `70b043845d` Author: H.J. Lu <hjl.tools@gmail.com> Date: Tue Nov 30 05:31:26 2021 -0800 libsanitizer: Use SSE to save and restore XMM registers to LOCAL_PATCHES. * LOCAL_PATCHES: Add commit `70b043845d`.	2021-12-06 08:17:49 -08:00
H.J. Lu	70b043845d	libsanitizer: Use SSE to save and restore XMM registers Use SSE, instead of AVX, to save and restore XMM registers to support processors without AVX. The affected codes are unused in upstream since https://github.com/llvm/llvm-project/commit/66d4ce7e26a5 and will be removed in https://reviews.llvm.org/D112604 This fixed FAIL: g++.dg/tsan/pthread_cond_clockwait.C -O0 execution test FAIL: g++.dg/tsan/pthread_cond_clockwait.C -O2 execution test on machines without AVX. PR sanitizer/103466 * tsan/tsan_rtl_amd64.S (__tsan_trace_switch_thunk): Replace vmovdqu with movdqu. (__tsan_report_race_thunk): Likewise.	2021-12-06 08:16:49 -08:00
Richard Biener	0dc77a0c49	tree-optimization/103581 - fix masked gather on x86 The recent fix to PR103527 exposed an issue with how the various special casing for AVX512 masks in vect_build_gather_load_calls are handled. The following makes that more obvious, fixing the miscompile of 403.gcc. 2021-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/103581 * tree-vect-stmts.c (vect_build_gather_load_calls): Properly guard all the AVX512 mask cases. * gcc.dg/vect/pr103581.c: New testcase.	2021-12-06 16:17:05 +01:00
Martin Liska	11013814fc	contrib: Filter out -Wreturn-type in fold-const-call.c. contrib/ChangeLog: * filter-clang-warnings.py: Filter out one warning.	2021-12-06 14:09:22 +01:00
Richard Biener	ee01694151	tree-optimization/103544 - SLP reduction chain as SLP reduction issue When SLP reduction chain vectorization support added handling of an outer conversion in the chain picking a failed reduction up as SLP reduction that broke the invariant that the whole reduction was forward reachable. The following plugs that hole noting a future enhancement possibility. 2021-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/103544 * tree-vect-slp.c (vect_analyze_slp): Only add a SLP reduction opportunity if the stmt in question is the reduction root. (dot_slp_tree): Add missing check for NULL child. * gcc.dg/vect/pr103544.c: New testcase.	2021-12-06 12:53:43 +01:00
Jakub Jelinek	4dc6d19222	avr: Fix AVR build [PR71934] On Mon, Dec 06, 2021 at 11:00:30AM +0100, Martin Liška wrote: > Jakub, I think the patch broke avr-linux target: > > g++ -fno-PIE -c -g -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-erro > /home/marxin/Programming/gcc/gcc/config/avr/avr.c: In function ‘void avr_output_data_section_asm_op(const void)’: > /home/marxin/Programming/gcc/gcc/config/avr/avr.c:10097:26: error: invalid conversion from ‘const void’ to ‘const char’ [-fpermissive] This patch fixes that. 2021-12-06 Jakub Jelinek <jakub@redhat.com> PR pch/71934 config/avr/avr.c (avr_output_data_section_asm_op, avr_output_bss_section_asm_op): Change argument type from const void * to const char *.	2021-12-06 11:20:59 +01:00
Tamar Christina	c2c843849a	cse: Make sure duplicate elements are not entered into the equivalence set [PR103404] CSE uses equivalence classes to keep track of expressions that all have the same values at the current point in the program. Normal equivalences through SETs only insert and perform lookups in this set but equivalence determined from comparisons, e.g. (insn 46 44 47 7 (set (reg:CCZ 17 flags) (compare:CCZ (reg:SI 105 [ iD.2893 ]) (const_int 0 [0]))) "cse.c":18:22 7 {cmpsi_ccno_1} (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ]) (nil))) creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]). This causes a merge to happen between the two equivalence sets denoted by (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively. The operation happens through merge_equiv_classes however this function has an invariant that the classes to be merge not contain any duplicates. This is because it frees entries before merging. The given testcase when using the supplied flags trigger an ICE due to the equivalence set being (rr) p dump_class (class1) Equivalence chain for (reg:SI 105 [ iD.2893 ]): (reg:SI 105 [ iD.2893 ]) $3 = void (rr) p dump_class (class2) Equivalence chain for (const_int 0 [0]): (const_int 0 [0]) (reg:SI 97 [ _10 ]) (reg:SI 97 [ _10 ]) $4 = void This happens because the original INSN being recorded is (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0) (const_vector:V1SI [ (const_int 0 [0]) ])) "cse.c":11:9 1363 {movv1si_internal} (expr_list:REG_UNUSED (reg:SI 97 [ _10 ]) (nil))) and we end up generating two equivalences. the first one is simply that reg:SI 97 is 0. The second one is that 0 can be extracted from the V1SI, so subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0. This nested subreg gets folded away to just reg:SI 97 and we re-insert the same equivalence. This patch changes it so that if the nunits of a subreg is 1 then don't generate a vec_select from the subreg as the subreg will be folded away and we get a dup. gcc/ChangeLog: PR rtl-optimization/103404 * cse.c (find_sets_in_insn): Don't select elements out of a V1 mode subreg. gcc/testsuite/ChangeLog: PR rtl-optimization/103404 * gcc.target/i386/pr103404.c: New test.	2021-12-06 10:16:22 +00:00
liuhongt	d1011a41ef	Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class. When moves between integer and sse registers are cheap. 2021-12-06 Hongtao Liu <Hongtao.liu@intel.com> Uroš Bizjak <ubizjak@gmail.com> gcc/ChangeLog: PR target/95740 * config/i386/i386.c (ix86_preferred_reload_class): Allow integer regs when moves between register units are cheap. * config/i386/i386.h (INT_SSE_CLASS_P): New. gcc/testsuite/ChangeLog: * gcc.target/i386/pr95740.c: New test.	2021-12-06 18:15:42 +08:00
Nelson Chu	45116f3420	RISC-V: jal cannot refer to a default visibility symbol for shared object. This is the original binutils bugzilla report, https://sourceware.org/bugzilla/show_bug.cgi?id=28509 And this is the first version of the proposed binutils patch, https://sourceware.org/pipermail/binutils/2021-November/118398.html After applying the binutils patch, I get the the unexpected error when building libgcc, /scratch/nelsonc/riscv-gnu-toolchain/riscv-gcc/libgcc/config/riscv/div.S:42: /scratch/nelsonc/build-upstream/rv64gc-linux/build-install/riscv64-unknown-linux-gnu/bin/ld: relocation R_RISCV_JAL against `__udivdi3' which may bind externally can not be used when making a shared object; recompile with -fPIC Therefore, this patch add an extra hidden alias symbol for __udivdi3, and then use HIDDEN_JUMPTARGET to target a non-preemptible symbol instead. The solution is similar to glibc as follows, https://sourceware.org/git/?p=glibc.git;a=commit;h=68389203832ab39dd0dbaabbc4059e7fff51c29b libgcc/ChangeLog: * config/riscv/div.S: Add the hidden alias symbol for __udivdi3, and then use HIDDEN_JUMPTARGET to target it since it is non-preemptible. * config/riscv/riscv-asm.h: Added new macros HIDDEN_JUMPTARGET and HIDDEN_DEF.	2021-12-06 10:55:01 +08:00
GCC Administrator	b880d1514c	Daily bump.	2021-12-06 00:16:21 +00:00
Iain Sandoe	c9419faef0	Objective-C, NeXT: Reorganise meta-data declarations. This moves the GTY declaration of the meta-data indentifier array into the header that enumerates these and provides shorthand defines for them. This avoids a problem seen with a relocatable PCH implementation. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/objc/ChangeLog: * objc-next-metadata-tags.h (objc_rt_trees): Declare here. * objc-next-runtime-abi-01.c: Remove from here. * objc-next-runtime-abi-02.c: Likewise. * objc-runtime-shared-support.c: Reorder headers, provide a GTY declaration the definition of objc_rt_trees.	2021-12-05 20:22:16 +00:00
David Edelsohn	8d4ef2299c	aix: Move AIX math builtins before new builtin machinery. The new builtin machinery has an early exit, so move the AIX-specific builtins before the new machinery. gcc/ChangeLog: * config/rs6000/rs6000-call.c (rs6000_init_builtins): Move AIX math builtin initialization before new_builtins_are_live.	2021-12-04 20:28:17 -05:00
GCC Administrator	70e4cb66c1	Daily bump.	2021-12-05 00:16:28 +00:00
Marek Polacek	066b3258bb	c++: Add fixed test [PR93614] This was fixed by r11-86. PR c++/93614 gcc/testsuite/ChangeLog: * g++.dg/template/lookup18.C: New test.	2021-12-04 15:29:46 -05:00
Tobias Burnus	689407ef91	Fortran/OpenMP: Support most of 5.1 atomic extensions Implements moste of OpenMP 5.1 atomic extensions, except that 'compare' is parsed but rejected during resolution. (As the trans-openmp.c handling is missing.) gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle weak/compare/fail clause. * gfortran.h (gfc_omp_clauses): Add weak, compare, fail. * openmp.c (enum omp_mask1, gfc_match_omp_clauses, OMP_ATOMIC_CLAUSES): Update for new clauses. (gfc_match_omp_atomic): Update for 5.1 atomic changes. (is_conversion): Support widening in one go. (is_scalar_intrinsic_expr): New. (resolve_omp_atomic): Update for 5.1 atomic changes. * parse.c (parse_omp_oacc_atomic): Update for compare. * resolve.c (gfc_resolve_blocks): Update asserts. * trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/atomic-2.f90: Move now supported code to ... * gfortran.dg/gomp/atomic.f90: here. * gfortran.dg/gomp/atomic-10.f90: New test. * gfortran.dg/gomp/atomic-12.f90: New test. * gfortran.dg/gomp/atomic-15.f90: New test. * gfortran.dg/gomp/atomic-16.f90: New test. * gfortran.dg/gomp/atomic-17.f90: New test. * gfortran.dg/gomp/atomic-18.f90: New test. * gfortran.dg/gomp/atomic-19.f90: New test. * gfortran.dg/gomp/atomic-20.f90: New test. * gfortran.dg/gomp/atomic-22.f90: New test. * gfortran.dg/gomp/atomic-24.f90: New test. * gfortran.dg/gomp/atomic-25.f90: New test. * gfortran.dg/gomp/atomic-26.f90: New test. libgomp/ChangeLog * libgomp.texi (OpenMP 5.1): Update status.	2021-12-04 19:43:46 +01:00
Jonathan Wakely	87710ec7b2	libstdc++: Initialize member in std::match_results [PR103549] This fixes a -Wuninitialized warning for std::cmatch m1, m2; m1=m2; Also name the template parameters in the forward declaration, to get rid of the <template-parameter-1-1> noise in diagnostics. libstdc++-v3/ChangeLog: PR libstdc++/103549 * include/bits/regex.h (match_results): Give names to template parameters in first declaration. (match_results::_M_begin): Add default member-initializer.	2021-12-04 15:55:01 +00:00
Tobias Burnus	b09af56214	libgomp.texi: Update OMP_PLACES libgomp/ChangeLog: * libgomp.texi (OMP_PLACES): Extend description for OMP 5.1 changes.	2021-12-04 13:28:03 +01:00

1 2 3 4 5 ...

190249 Commits