mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-12-21 10:05:27 +08:00

Author	SHA1	Message	Date
David Malcolm	e1c0c908f8	analyzer: fix overeager sharing of bounded_range instances [PR102662] This was leading to an assertion failure ICE on a switch stmt when using -fstrict-enums, due to erroneously reusing a range involving one enum with a range involving a different enum. gcc/analyzer/ChangeLog: PR analyzer/102662 * constraint-manager.cc (bounded_range::operator==): Require the types to be the same for equality. gcc/testsuite/ChangeLog: PR analyzer/102662 * g++.dg/analyzer/pr102662.C: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-16 10:23:04 -05:00
Jason Merrill	132f1c2777	c++: improve print_node of PTRMEM_CST It's been inconvenient that pretty-printing of PTRMEM_CST didn't display what member the constant refers to. Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P nodes; the simplest fix for that is to use the tcc_exceptional hook for tcc_constant as well. gcc/cp/ChangeLog: * ptree.c (cxx_print_xnode): Handle PTRMEM_CST. gcc/ChangeLog: * langhooks.h (struct lang_hooks): Adjust comment. * print-tree.c (print_node): Also call print_xnode hook for tcc_constant class.	2021-11-16 10:20:30 -05:00
Andrew Pinski	11c4a06a6c	tree-optimization: [PR103218] Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit This folds ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit inside match.pd. This was already handled in fold-const by: /* A < 0 ? <sign bit of A> : 0 is simply (A & <sign bit of A>). */ I have not removed it, as we only simplify "a ? POW2 : 0" at the gimple level to "a << CST1" and fold actually does the reverse of folding "(a<0)<<CST" into "(a<0) ? 1<<CST : 0". OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/103218 gcc/ChangeLog: * match.pd: New pattern for "((type)(a<0)) << SIGNBITOFA". gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr103218-1.c: New test.	2021-11-16 15:07:39 +00:00
Jonathan Wakely	8d8e8f3ad5	libstdc++: Fix out-of-bound array accesses in testsuite I fixed some undefined behaviour in string tests in r238609, but I only fixed the narrow char versions. This applies the same fixes to the wchar_t ones. These problems were found when testing a patch to make std::basic_string usable in constexpr. libstdc++-v3/ChangeLog: * testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc: Fix reads past the end of strings. * testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc: Likewise. * testsuite/experimental/string_view/operations/compare/wchar_t/1.cc: Likewise.	2021-11-16 14:09:00 +00:00
Jonathan Wakely	9719769471	libstdc++: Fix typos in tests libstdc++-v3/ChangeLog: * testsuite/21_strings/basic_string/allocator/71964.cc: Fix typo. * testsuite/23_containers/set/allocator/71964.cc: Likewise.	2021-11-16 14:08:42 +00:00
Claudiu Zissulescu	b796ab35d1	arc: Update (u)maddhisi4 patterns The (u)maddhisi4 patterns are using the ARC's VMAC2H(U) instruction with null destination, however, VMAC2H(U) doesn't rewrite the accumulator. This patch solves the destination issue of VMAC2H by replacing it with DMACH(U) instruction. gcc/ * config/arc/arc.md (maddhisi4): Use a single move to accumulator. (umaddhisi4): Likewise. (machi): Update pattern. (umachi): Likewise. gcc/testsuite/ * gcc.target/arc/tmac-4.c: New test. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>	2021-11-16 12:34:59 +02:00
Richard Biener	0452064503	tree-optimization/102880 - improve CD-DCE The PR shows a missed control-dependent DCE caused by CFG cleanup merging a forwarder resulting in a partially degenerate PHI node. With control-dependent DCE we need to mark control dependences of incoming edges into PHIs as necessary but that is unnecessarily conservative for the case when two edges have the same value. There is no easy way to mark only a subset of control dependences of both edges necessary so the fix is to produce forwarder blocks where then the control dependence captures the requirements more precisely. For gcc.dg/tree-ssa/ssa-dom-thread-7.c the number of edges in the CFG decreases as we have commonized PHI arguments which in turn results in different threadings. The testcase is too complex and the dump scanning too simple to do anything meaningful here but to adjust the number of expected threads. The same CFG massaging could be useful at RTL expansion time to reduce the number of copies we need to insert on edges. FAIL: gcc.dg/tree-ssa/ssa-hoist-4.c scan-tree-dump-times optimized "MAX_EXPR" 1 2021-11-12 Richard Biener <rguenther@suse.de> PR tree-optimization/102880 * tree-ssa-dce.c (sort_phi_args): New function. (make_forwarders_with_degenerate_phis): Likewise. (perform_tree_ssa_dce): Call make_forwarders_with_degenerate_phis. * gcc.dg/tree-ssa/pr102880.c: New testcase. * gcc.dg/tree-ssa/pr69270-3.c: Robustify. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Change the number of expected threadings.	2021-11-16 11:31:56 +01:00
Richard Biener	f98f373dd8	tree-optimization/102880 - make PHI-OPT recognize more CFGs This allows extra edges into the middle BB for the PHI-OPT transforms using replace_phi_edge_with_variable that do not end up moving stmts from that middle BB. This avoids regressing gcc.dg/tree-ssa/ssa-hoist-4.c with the actual fix for PR102880 where CFG cleanup has the choice to remove two forwarders and picks "the wrong" leading to if (a > b) / /\ / / <BB> / | # PHI <a, b> rather than if (a > b) | /\ | <BB> \ | / \ | # PHI <a, b, b> but it's relatively straight-forward to support extra edges into the middle-BB in paths ending in replace_phi_edge_with_variable and that do not require moving stmts. That's because we really only want to remove the edge from the condition to the middle BB. Of course actually doing that means updating dominators in non-trivial ways which is why I kept the original code for the single edge case and simply defer to CFG cleanup by adjusting the condition for the complicated case. The testcase needs to be a GIMPLE one since it's quite unreliable to produce the desired CFG. 2021-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/102880 * tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Push single_pred (bb1) condition to places that really need it. (match_simplify_replacement): Likewise. (value_replacement): Likewise. (replace_phi_edge_with_variable): Deal with extra edges into the middle BB. * gcc.dg/tree-ssa/phi-opt-26.c: New testcase.	2021-11-16 11:31:05 +01:00
Claudiu Zissulescu	d699f03720	arc: Update arc specific tests Update assembly output test pattern. Take into consideration also for which platform we do execute the test (baremetal or linux). gcc/testsuite/ChangeLog: * gcc.target/arc/add_n-combine.c: Update test patterns. * gcc.target/arc/builtin_eh.c: Update test for linux platforms. * gcc.target/arc/mul64-1.c: Disable this test while running on linux. * gcc.target/arc/tls-gd.c: Update matching patterns. * gcc.target/arc/tls-ie.c: Likewise. * gcc.target/arc/tls-ld.c: Likewise. * gcc.target/arc/uncached-8.c: Likewise. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>	2021-11-16 11:58:24 +02:00
Martin Jambor	23125fab7b	Replace more DEBUG_EXPR_DECL creations with build_debug_expr_decl As discussed on the mailing list, this patch replaces all but one remaining open coded constructions of DEBUG_EXPR_DECL with calls to build_debug_expr_decl, even if - in order not to introduce any functional change - the mode of the constructed decl is then overwritten. It is not clear if changing the mode has any effect in practice and therefore I have added a FIXME note to code which does it, as requested. After this patch, DEBUG_EXPR_DECLs are created only by build_debug_expr_decl and make_debug_expr_from_rtl which looks like it should be left alone. gcc/ChangeLog: 2021-11-11 Martin Jambor <mjambor@suse.cz> * cfgexpand.c (expand_gimple_basic_block): Use build_debug_expr_decl, add a fixme note about the mode assignment perhaps being unnecessary. * ipa-param-manipulation.c (ipa_param_adjustments::modify_call): Likewise. (ipa_param_body_adjustments::mark_dead_statements): Likewise. (ipa_param_body_adjustments::reset_debug_stmts): Likewise. * tree-inline.c (remap_ssa_name): Likewise. (tree_function_versioning): Likewise. * tree-into-ssa.c (rewrite_debug_stmt_uses): Likewise. * tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise. * tree-ssa.c (insert_debug_temp_for_var_def): Likewise.	2021-11-16 10:45:46 +01:00
Martin Jambor	9f7fc82014	ipa-sra: Testcase that removing a "returns_nonnull" retval works Since we can now remove return values of functions with returns_nonnull type attribute, I'll feel a bit safer if we can test this does not ICE when someone attempts to access a non-existent call LHS. Eventually we should probably drop the attribute when this happens. gcc/testsuite/ChangeLog: 2021-11-15 Martin Jambor <mjambor@suse.cz> * gcc.dg/ipa/ipa-sra-ret-nonull.c: New test.	2021-11-16 10:45:32 +01:00
Jakub Jelinek	9ceaf0fee3	libgomp: Mark thread_limit clause to target construct as implemented After the Fortran changes we can mark it as implemented... 2021-11-16 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (OpenMP 5.1): Mark thread_limit clause to target construct as implemented.	2021-11-16 10:21:56 +01:00
Jakub Jelinek	47de0b56ee	openmp: Regimplify operands of GIMPLE_COND in a few more places [PR103208] As the testcase shows, the non-rectangular loop expansion code didn't try to regimplify operands of GIMPLE_CONDs it built in some cases. I have added a helper function which does that and used it in some places that were regimplifying already to simplify those spots, plus added it in a couple of other places where it was needed. 2021-11-16 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/103208 * omp-expand.c (expand_omp_build_cond): New function. (expand_omp_for_init_counts, expand_omp_for_init_vars, expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Use it. * c-c++-common/gomp/loop-11.c: New test.	2021-11-16 10:19:22 +01:00
Jakub Jelinek	eacdfaf7ca	waccess: Fix up pass_waccess::check_alloc_size_call [PR102009] This function punts if the builtins have no arguments, but as can be seen on the testcase, even if it has some arguments but alloc_size attribute's arguments point to arguments that aren't passed, we get a warning earlier from the FE but should punt rather than ICE on it. Other users of alloc_size attribute e.g. in tree-object-size.c (alloc_object_size) punt similarly and similarly even in the same TU maybe_warn_nonstring_arg correctly verifies calls have enough arguments. 2021-11-16 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102009 * gimple-ssa-warn-access.cc (pass_waccess::check_alloc_size_call): Punt if any of alloc_size arguments is out of bounds vs. number of call arguments. * gcc.dg/pr102009.c: New test.	2021-11-16 10:18:25 +01:00
Roger Sayle	473b5e8734	x86_64: Avoid rorx rotation instructions with -Os. This patch teaches the i386 backend to avoid using BMI2's rorx instructions when optimizing for size. The benefits are shown with the following example: unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); } unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); } unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); } unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); } which currently with -Os -march=cascadelake generates: ror1: rorx $1, %edi, %eax // 6 bytes ret ror2: rorx $2, %edi, %eax // 6 bytes ret rol2: rorx $30, %edi, %eax // 6 bytes ret rol1: rorx $31, %edi, %eax // 6 bytes ret but with this patch now generates: ror1: movl %edi, %eax // 2 bytes rorl %eax // 2 bytes ret ror2: movl %edi, %eax // 2 bytes rorl $2, %eax // 3 bytes ret rol2: movl %edi, %eax // 2 bytes roll $2, %eax // 3 bytes ret rol1: movl %edi, %eax // 2 bytes roll %eax // 2 bytes ret I've confirmed that this patch is a win on the CSiBE benchmark, even though rotations are rare, where for example libmspack/test/md5.o shrinks from 5824 bytes to 5632 bytes. 2021-11-16 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.md (bmi2_rorx<mode3>_1): Make conditional on !optimize_function_for_size_p. (<any_rotate><mode>3_1): Add preferred_for_size attribute. (define_splits): Conditionalize on !optimize_function_for_size_p. (bmi2_rorxsi3_1_zext): Likewise. (<any_rotate>si2_1_zext): Add preferred_for_size attribute. (define_splits): Conditionalize on !optimize_function_for_size_p.	2021-11-16 08:55:21 +00:00
Jan Hubicka	e69b7c5779	Fix uninitialized access in merge_call_side_effects gcc/ChangeLog: PR ipa/103262 * ipa-modref.c (merge_call_side_effects): Fix uninitialized access. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/modref-dse-5.c: New test.	2021-11-16 09:15:39 +01:00
Andrew Pinski	3200de91bc	tree-optimization: [PR103245] Improve detection of abs pattern using multiplication So while working on PR 103228 (and a few others), I noticed the testcase for PR 94785 was failing. The problem is that the nop_convert moved from being inside the IOR to be outside of it. I also noticed the patch for PR 103228 was not needed to reproduce the issue either. This patch combines the two patterns together for the abs match when using multiplication and adds a few places where nop_convert are optional. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/103245 gcc/ChangeLog: * match.pd: Combine the abs pattern matching using multiplication. Adding optional nop_convert too. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr103245-1.c: New test.	2021-11-16 03:31:57 +00:00
H.J. Lu	074ee8d9a9	Add a missing return when transforming atomic bit test and operations When failing to transform equivalent, but slightly different cases of atomic bit test and operations to their canonical forms, return immediately. gcc/ PR middle-end/103268 * tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing return. gcc/testsuite/ PR middle-end/103268 * gcc.dg/pr103268-1.c: New test. * gcc.dg/pr103268-2.c: Likewise.	2021-11-15 19:23:58 -08:00
Jim Wilson	a031aaa2ac	Update my email address. * MAINTAINERS: Update my address.	2021-11-15 17:02:40 -08:00
GCC Administrator	e2b57363fc	Daily bump.	2021-11-16 00:16:31 +00:00
Jason Merrill	87c2080b05	c++: Add -fimplicit-constexpr With each successive C++ standard the restrictions on the use of the constexpr keyword for functions get weaker and weaker; it recently occurred to me that it is heading toward the same fate as the C register keyword, which was once useful for optimization but became obsolete. Similarly, it seems to me that we should be able to just treat inlines as constexpr functions and not make people add the extra keyword everywhere. There were a lot of testcase changes needed; many disabling errors about non-constexpr functions that are now constexpr, and many disabling implicit constexpr so that the tests can check the same thing as before, whether that's mangling or whatever. gcc/c-family/ChangeLog: * c.opt: Add -fimplicit-constexpr. * c-cppbuiltin.c: Define __cpp_implicit_constexpr. * c-opts.c (c_common_post_options): Disable below C++14. gcc/cp/ChangeLog: * cp-tree.h (struct lang_decl_fn): Add implicit_constexpr. (decl_implicit_constexpr_p): New. * class.c (type_maybe_constexpr_destructor): Use TYPE_HAS_TRIVIAL_DESTRUCTOR and maybe_constexpr_fn. (finalize_literal_type_property): Simplify. * constexpr.c (is_valid_constexpr_fn): Check for dtor. (maybe_save_constexpr_fundef): Try to set DECL_DECLARED_CONSTEXPR_P on inlines. (cxx_eval_call_expression): Use maybe_constexpr_fn. (maybe_constexpr_fn): Handle flag_implicit_constexpr. (var_in_maybe_constexpr_fn): Use maybe_constexpr_fn. (potential_constant_expression_1): Likewise. (decl_implicit_constexpr_p): New. * decl.c (validate_constexpr_redeclaration): Allow change with -fimplicit-constexpr. (grok_special_member_properties): Use maybe_constexpr_fn. * error.c (dump_function_decl): Don't print 'constexpr' if it's implicit. * Make-lang.in (check-c++-all): Update. libstdc++-v3/ChangeLog: * testsuite/20_util/to_address/1_neg.cc: Adjust error. * testsuite/26_numerics/random/concept.cc: Adjust asserts. gcc/testsuite/ChangeLog: * lib/g++-dg.exp: Handle "impcx". 
* lib/target-supports.exp (check_effective_target_implicit_constexpr): New. * g++.dg/abi/abi-tag16.C: * g++.dg/abi/abi-tag18a.C: * g++.dg/abi/guard4.C: * g++.dg/abi/lambda-defarg1.C: * g++.dg/abi/mangle26.C: * g++.dg/cpp0x/constexpr-diag3.C: * g++.dg/cpp0x/constexpr-ex1.C: * g++.dg/cpp0x/constexpr-ice5.C: * g++.dg/cpp0x/constexpr-incomplete2.C: * g++.dg/cpp0x/constexpr-memfn1.C: * g++.dg/cpp0x/constexpr-neg3.C: * g++.dg/cpp0x/constexpr-specialization.C: * g++.dg/cpp0x/inh-ctor19.C: * g++.dg/cpp0x/inh-ctor30.C: * g++.dg/cpp0x/lambda/lambda-mangle3.C: * g++.dg/cpp0x/lambda/lambda-mangle5.C: * g++.dg/cpp1y/auto-fn12.C: * g++.dg/cpp1y/constexpr-loop5.C: * g++.dg/cpp1z/constexpr-lambda7.C: * g++.dg/cpp2a/constexpr-dtor3.C: * g++.dg/cpp2a/constexpr-new13.C: * g++.dg/cpp2a/constinit11.C: * g++.dg/cpp2a/constinit12.C: * g++.dg/cpp2a/constinit14.C: * g++.dg/cpp2a/constinit15.C: * g++.dg/cpp2a/spaceship-constexpr1.C: * g++.dg/cpp2a/spaceship-eq3.C: * g++.dg/cpp2a/udlit-class-nttp-neg2.C: * g++.dg/debug/dwarf2/auto1.C: * g++.dg/debug/dwarf2/cdtor-1.C: * g++.dg/debug/dwarf2/lambda1.C: * g++.dg/debug/dwarf2/pr54508.C: * g++.dg/debug/dwarf2/pubnames-2.C: * g++.dg/debug/dwarf2/pubnames-3.C: * g++.dg/ext/is_literal_type3.C: * g++.dg/ext/visibility/template7.C: * g++.dg/gcov/gcov-12.C: * g++.dg/gcov/gcov-2.C: * g++.dg/ipa/devirt-35.C: * g++.dg/ipa/devirt-36.C: * g++.dg/ipa/devirt-37.C: * g++.dg/ipa/devirt-44.C: * g++.dg/ipa/imm-devirt-1.C: * g++.dg/lookup/builtin5.C: * g++.dg/lto/inline-crossmodule-1_0.C: * g++.dg/modules/enum-1_a.C: * g++.dg/modules/fn-inline-1_c.C: * g++.dg/modules/pmf-1_b.C: * g++.dg/modules/used-1_c.C: * g++.dg/tls/thread_local11.C: * g++.dg/tls/thread_local11a.C: * g++.dg/tm/pr46653.C: * g++.dg/ubsan/pr70035.C: * g++.old-deja/g++.other/delete6.C: * g++.dg/modules/pmf-1_a.H: Adjust for implicit constexpr.	2021-11-15 18:50:07 -05:00
Jason Merrill	29e4163a09	c++: split_nonconstant_init and flexarrays split_nonconstant_init was doing the wrong thing for both the initialization and cleanup here; we know the size from the initializer, and we can pass it along. This doesn't make the testcase work, since the y destructor is still broken, but it removes the wrong error for the aggregate initialization. gcc/cp/ChangeLog: * typeck2.c (split_nonconstant_init_1): Handle flexarrays better. gcc/testsuite/ChangeLog: * g++.dg/ext/flexary37.C: Remove expected error.	2021-11-15 18:48:04 -05:00
Siddhesh Poyarekar	323026c7df	gimple-fold: Use ranges to simplify strncat and snprintf Use ranges for lengths and object sizes in strncat and snprintf to determine if they can be transformed into simpler operations. gcc/ChangeLog: * gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to determine if it is safe to transform to strcat. (gimple_fold_builtin_snprintf): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/fold-stringops-2.c: Define size_t. (safe1): Adjust. (safe4): New test. * gcc.dg/fold-stringops-3.c: New test. Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>	2021-11-16 04:20:46 +05:30
Siddhesh Poyarekar	cea4dab861	gimple-fold: Use ranges to simplify _chk calls Instead of comparing LEN and SIZE only if they are constants, use their ranges to decide if LEN will always be lower than or same as SIZE. This change ends up putting the stringop-overflow warning line number against the strcpy implementation, so adjust the warning check to be line number agnostic. gcc/ChangeLog: * gimple-fold.c (known_lower): New function. (gimple_fold_builtin_strncat_chk, gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk, gimple_fold_builtin_stxncpy_chk, gimple_fold_builtin_snprintf_chk, gimple_fold_builtin_sprintf_chk): Use it. gcc/testsuite/ChangeLog: * gcc.dg/Wobjsize-1.c: Make warning change line agnostic. * gcc.dg/fold-stringops-2.c: New test. Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>	2021-11-16 04:20:31 +05:30
Siddhesh Poyarekar	d1753b4be9	gimple-fold: Transform stpcpy_chk to strcpy directly Avoid going through another folding cycle and use the ignore flag to directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set, likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STPNCPY. Dump the transformation in dump_file so that we can verify in tests that the direct transformation actually happened. gcc/ChangeLog: * gimple-fold.c (dump_transformation): New function. (gimple_fold_builtin_stxcpy_chk, gimple_fold_builtin_stxncpy_chk): Use it. Simplify to BUILT_IN_STRNCPY if return value is not used. gcc/testsuite/ChangeLog: * gcc.dg/fold-stringops-1.c: New test. Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>	2021-11-16 04:19:51 +05:30
H.J. Lu	4c19122bf5	Check optab before transforming atomic bit test and operations Check optab before transforming equivalent, but slightly different cases of atomic bit test and operations to their canonical forms. gcc/ PR middle-end/103184 * tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab before transforming equivalent, but slightly different cases to their canonical forms. gcc/testsuite/ PR middle-end/103184 * gcc.dg/pr103184-1.c: New test. * gcc.dg/pr103184-2.c: Likewise.	2021-11-15 12:58:56 -08:00
Iain Sandoe	fabe8cc41e	IPA: Provide a mechanism to register static DTORs via cxa_atexit. For at least one target (Darwin) the platform convention is to register static destructors (i.e. __attribute__((destructor))) with __cxa_atexit rather than placing them into a list that is run by some other mechanism. This patch provides a target hook that allows a target to opt into this and handling for the process in ipa_cdtor_merge (). When the mode is enabled (dtors_from_cxa_atexit is set) we: * Generate new CTORs to register static destructors with __cxa_atexit and add them to the existing list of CTORs; we then process the revised CTORs list. * We sort the DTORs into priority and then TU order, this means that they are registered in that order with __cxa_atexit () and therefore will be run in the reverse order. * Likewise, CTORs are sorted into priority and then TU order, which means that they will run in that order. This matches the behavior of using init/fini (or mod_init_func/mod_term_func) sections. This also fixes a bug where Fortran needs a DTOR to be run to close IO. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> PR fortran/102992 gcc/ChangeLog: * config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New. * doc/tm.texi: Regenerated. * doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook. * ipa.c (cgraph_build_static_cdtor_1): Return the built function decl. (build_cxa_atexit_decl): New. (build_dso_handle_decl): New. (build_cxa_dtor_registrations): New. (compare_cdtor_tu_order): New. (build_cxa_atexit_fns): New. (ipa_cdtor_merge): If dtors_from_cxa_atexit is set, process the DTORs/CTORs accordingly. (pass_ipa_cdtor_merge::gate): Also run if dtors_from_cxa_atexit is set. * target.def (dtors_from_cxa_atexit): New hook.	2021-11-15 19:48:56 +00:00
Iain Sandoe	d3cc82dc9c	configure, Darwin: Check ld64 support for -platform-version. Newer versions of ld64 allow specifying the OS target (e.g. macos or ios), the version and the SDK version all in a single command. This checks the availability of the command for the current toolchain. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config.in: Regenerate. * configure: Regenerate. * configure.ac: Test ld64 for -platform-version support.	2021-11-15 19:35:10 +00:00
Iain Sandoe	bd5159bdd4	testsuite, Darwin: In tsvc.h, use malloc for Darwin <= 9. Earlier Darwin versions do not have posix_memalign() but the malloc implementation is guaranteed to produce memory suitably aligned for the largest vector type. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/testsuite/ChangeLog: * gcc.dg/vect/tsvc/tsvc.h: Use malloc for Darwin 9 and earlier.	2021-11-15 19:28:07 +00:00
Iain Sandoe	b7f0147833	Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds. Most of the time we get away with using the dsymutil that is installed with the latest Xcode, however for some cross-compilation cases that does not work. We now have the ability to specify the correct dsymutil to use for the toolchain (--with-dsymutil=) and we should use that specified tool for debug link. Fixes cross-compilers from x86-64 to powerpc. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ada/ChangeLog: * gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in libgnat/libgnarl recipes.	2021-11-15 19:27:24 +00:00
François Dumont	d10b863fa3	libstdc++: Unordered containers merge re-use hash code When merging 2 unordered containers with same hasher we can re-use the hash code from the cache if any. Also in the context of the merge operation on multi-container use previous insert iterator as a hint for the next insert. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h: (_Hash_code_base<>::_M_hash_code(const _Hash&, const _Hash_node_value<_Value, true>&)): New. (_Hash_code_base<>::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): New. * include/bits/hashtable.h (_Hashtable<>::_M_merge_unique): Use latter. (_Hashtable<>::_M_merge_multi): Likewise. * testsuite/23_containers/unordered_multiset/modifiers/merge.cc (test05): New test. * testsuite/23_containers/unordered_set/modifiers/merge.cc (test04): New test.	2021-11-15 18:52:07 +01:00
Thomas Schwinge	f861ed8b29	Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map' Instead of hard-coded '0'/'UINT_MAX', we now use the 'RESERVED_LOCATION_P' values 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' as spare values for 'Empty'/'Deleted', and generally simplify the code. gcc/ * diagnostic-spec.h (typedef xint_hash_t) (typedef xint_hash_map_t): Replace with... (typedef nowarn_map_t): ... this. (nowarn_map): Adjust. * diagnostic-spec.c (nowarn_map, suppress_warning_at): Likewise.	2021-11-15 17:57:54 +01:00
Thomas Schwinge	bcebd05720	Use 'location_hash' for 'seen_locations' in 'gcc/profile.c:branch_prob' Follow-up to commit `102fcf94e6` "Fix GCOV CFG related issues": considering the current 'int_hash <location_t, 0, 2>', per 'libcpp/include/line-map.h': Actual | Value | Meaning -----------+-------------------------------+------------------------------- 0x00000000 | UNKNOWN_LOCATION (gcc/input.h)| Unknown/invalid location. -----------+-------------------------------+------------------------------- 0x00000001 | BUILTINS_LOCATION | The location for declarations | (gcc/input.h) | in "<built-in>" -----------+-------------------------------+------------------------------- 0x00000002 | RESERVED_LOCATION_COUNT | The first location to be | (also | handed out, and the | ordmap[0]->start_location) | first line in ordmap 0 ... this currently uses value '0' ('UNKNOWN_LOCATION') as spare values for 'Empty', and value '2' ('RESERVED_LOCATION_COUNT') as spare values for 'Deleted', which is questionable? What actually does get put into 'seen_locations' is (mostly...) restricted/gated by '!RESERVED_LOCATION_P' (which is true unless 'UNKNOWN_LOCATION' or 'BUILTINS_LOCATION'), thus we may simply use 'location_hash'. gcc/ * profile.c (branch_prob): Use 'location_hash' for 'seen_locations'.	2021-11-15 17:56:49 +01:00
Aldy Hernandez	6c29c9d6a7	Drop tree overflow in irange setter. Drop meaningless overflow that may creep into the IL. gcc/ChangeLog: PR tree-optimization/103207 * value-range.cc (irange::set): Drop overflow. gcc/testsuite/ChangeLog: * gcc.dg/pr103207.c: New test.	2021-11-15 17:31:50 +01:00
Tobias Burnus	82ec4cb3c4	Fortran: openmp: Add support for thread_limit clause on target gcc/fortran/ChangeLog: * openmp.c (OMP_TARGET_CLAUSES): Add thread_limit. * trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to teams. libgomp/ChangeLog: * testsuite/libgomp.fortran/thread-limit-1.f90: New test.	2021-11-15 15:44:11 +01:00
Jakub Jelinek	b2e1ac5485	testsuite: Add testcase for already fixed PR [PR100469] This bug introduced in r11-7448-gff92ede8d269375f800e1b347a48f4698874b4a3 has been fixed already by r12-1354-g2d2ed777b23ab6503027039e0adbfe1162f52b2f aka PR100852 fix. 2021-11-15 Jakub Jelinek <jakub@redhat.com> PR debug/100469 * g++.dg/opt/pr100469.C: New test.	2021-11-15 14:47:44 +01:00
H.J. Lu	650108971b	x86: Add gcc.target/i386/pr103205-2.c PR target/103205 * gcc.target/i386/pr103205-2.c: New test.	2021-11-15 05:41:54 -08:00
H.J. Lu	7d768a9d6f	libffi: Update LOCAL_PATCHES Add commit `a91f844ef4` Author: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> Date: Mon Nov 15 10:24:27 2021 +0100 libffi: Use #define instead of .macro in src/x86/win64.S [PR102874] to LOCAL_PATCHES. * LOCAL_PATCHES: Add commit `a91f844ef4`.	2021-11-15 04:56:05 -08:00
Jakub Jelinek	aea7238683	openmp: Add support for thread_limit clause on target OpenMP 5.1 says that thread_limit clause can also appear on target, and similarly to teams should affect the thread-limit-var ICV. On combined target teams, the clause goes to both. We actually passed thread_limit internally on target already before, but only used it for gcn/ptx offloading to hint how many threads should be created and for ptx didn't set thread_limit_var in that case. Similarly for host fallback. Also, I found that we weren't copying the args array that contains encoded thread_limit and num_teams clause for target (etc.) for async target. 2021-11-15 Jakub Jelinek <jakub@redhat.com> gcc/ * gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT to OMP_TARGET_CLAUSES if it isn't there already. gcc/c-family/ * c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>: Duplicate to both OMP_TARGET and OMP_TEAMS. gcc/c/ * c-parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. gcc/cp/ * parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. libgomp/ * task.c (gomp_create_target_task): Copy args array as well. * target.c (gomp_target_fallback): Add args argument. Set gomp_icv (true)->thread_limit_var if thread_limit is present. (GOMP_target): Adjust gomp_target_fallback caller. (GOMP_target_ext): Likewise. (gomp_target_task_fn): Likewise. * config/nvptx/team.c (gomp_nvptx_main): Set gomp_global_icv.thread_limit_var. * testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.	2021-11-15 13:20:53 +01:00
Aldy Hernandez	fcdf49a0ad	Fix PHI ordering problems in the path solver. After auditing the PHI range calculations, I'm not convinced we've caught all the corner cases. They haven't shown up in the wild (yet), but better safe than sorry. We shouldn't write anything to the cache or trigger additional lookups while calculating a PHI, as this may cause ordering problems. We should resolve the PHI with either the cache as it stands, or by asking for ranges on entry to the path. I've documented this. There was one dubious case where we called fold_range in ssa_range_in_phi, which mostly by luck wasn't triggering lookups, because fold_range solves a PHI by calling range_on_edge, which is set to pick up global ranges by default in path_range_query. This is fragile, so I've rewritten the call to explicitly use cached or global ranges. Also, the cache should be avoided in ssa_range_in_phi when the arg is defined in the PHI's block, as not doing so could create an ordering problem. We have a similar check when calculating relations in PHIs. Tested on x86-64 & ppc64le Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::internal_range_of_expr): Remove useless code. (path_range_query::ssa_defined_in_bb): New. (path_range_query::ssa_range_in_phi): Avoid fold_range call that could trigger additional lookups. Do not use the cache for ARGs defined in this block. (path_range_query::compute_ranges_in_block): Use ssa_defined_in_bb. (path_range_query::maybe_register_phi_relation): Same. (path_range_query::range_of_stmt): Adjust comment. * gimple-range-path.h (ssa_defined_in_bb): New.	2021-11-15 13:16:57 +01:00
Aldy Hernandez	540d92ae9b	path solver: Default to global range if nothing found. This has been a long time coming, but we weren't able to make the change because of some unrelated regressions. Tested on x86-64 & ppc64le Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::internal_range_of_expr): Default to global range if nothing found. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr31146-2.C: Add -fno-thread-jumps.	2021-11-15 13:16:56 +01:00
Richard Biener	220bd61874	tree-optimization/103237 - avoid vectorizing unhandled double reductions Double reductions which have multiple LC PHIs in the inner loop are not handled correctly during transformation since those PHIs are not properly classified as reduction. The following disables vectorizing them. 2021-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/103237 * tree-vect-loop.c (vect_is_simple_reduction): Fail for double reductions with multiple inner loop LC PHI nodes. * gcc.dg/torture/pr103237.c: New testcase.	2021-11-15 13:07:57 +01:00
Hongyu Wang	4d281ff7dd	PR target/103069: Relax cmpxchg loop for x86 target From the CPU's point of view, getting a cache line for writing is more expensive than reading. See Appendix A.2 Spinlock in: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf The full compare and swap will grab the cache line exclusive and causes excessive cache line bouncing. The atomic_fetch_{or,xor,and,nand} builtins generates cmpxchg loop under -march=x86-64 like: movl v(%rip), %eax .L2: movl %eax, %ecx movl %eax, %edx orl $1, %ecx lock cmpxchgl %ecx, v(%rip) jne .L2 movl %edx, %eax andl $1, %eax ret To relax above loop, GCC should first emit a normal load, check and jump to .L2 if cmpxchgl may fail. Before jump to .L2, PAUSE should be inserted to yield the CPU to another hyperthread and to save power, so the code is like .L84: movl (%rdi), %ecx movl %eax, %edx orl %esi, %edx cmpl %eax, %ecx jne .L82 lock cmpxchgl %edx, (%rdi) jne .L84 .L82: rep nop jmp .L84 This patch adds corresponding atomic_fetch_op expanders to insert load/compare and pause for all the atomic logic fetch builtins. Add flag -mrelax-cmpxchg-loop to control whether to generate relaxed loop. gcc/ChangeLog: PR target/103069 * config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop): New expand function. * config/i386/i386-options.c (ix86_target_string): Add -mrelax-cmpxchg-loop flag. (ix86_valid_target_attribute_inner_p): Likewise. * config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop): New expand function prototype. * config/i386/i386.opt: Add -mrelax-cmpxchg-loop. * config/i386/sync.md (atomic_fetch_<logic><mode>): New expander for SI,HI,QI modes. (atomic_<logic>_fetch<mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise. (atomic_fetch_<logic><mode>): New expander for DI,TI modes. (atomic_<logic>_fetch<mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise. * doc/invoke.texi: Document -mrelax-cmpxchg-loop. gcc/testsuite/ChangeLog: PR target/103069 * gcc.target/i386/pr103069-1.c: New test. * gcc.target/i386/pr103069-2.c: Ditto.	2021-11-15 19:09:38 +08:00
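The relaxed loop described in the PR target/103069 entry can be approximated at the C level. The following is a minimal sketch, not the code GCC emits: `relaxed_fetch_or` and `cpu_relax` are hypothetical names introduced here for illustration, and the memory orders are an assumption. It shows the same idea: do a plain load first, attempt the locked compare-and-swap only when it looks likely to succeed, and pause before retrying otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: spin-wait hint (PAUSE, i.e. "rep nop", on x86). */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

/* Hypothetical name: an atomic fetch-or written as a relaxed
   compare-exchange loop.  Returns the value seen before the OR.  */
static uint32_t relaxed_fetch_or(uint32_t *ptr, uint32_t mask)
{
    uint32_t expected = __atomic_load_n(ptr, __ATOMIC_RELAXED);

    for (;;) {
        uint32_t desired = expected | mask;

        /* Plain load first: a read does not need the cache line
           in exclusive state.  */
        uint32_t current = __atomic_load_n(ptr, __ATOMIC_RELAXED);
        if (current != expected) {
            /* The compare-exchange would fail, so skip it, pause,
               and retry with the freshly observed value.  */
            cpu_relax();
            expected = current;
            continue;
        }

        /* Only now issue the locked compare-and-swap.  On failure,
           'expected' is overwritten with the current memory value.  */
        if (__atomic_compare_exchange_n(ptr, &expected, desired, false,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
            return expected;
    }
}
```

A caller would use it like `__atomic_fetch_or`: `uint32_t old = relaxed_fetch_or(&flags, 1);`. In GCC itself the transformation is applied by the compiler when `-mrelax-cmpxchg-loop` is given, so hand-writing the loop is only useful for understanding the shape of the generated code.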
Richard Biener	d1ca8aeaf3	tree-optimization/103219 - avoid ICE in unroll-and-jam For no particularly good reason unroll-and-jam uses single_dom_exit to determine the exit for the region it wants to run VN on. That happens to ICE because of the dominance restriction. Use single_exit instead. 2021-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/103219 * gimple-loop-jam.c (tree_loop_unroll_and_jam): Use single_exit to determine the exit for the VN region. * gcc.dg/torture/pr103219.c: New testcase.	2021-11-15 11:10:16 +01:00
Prathamesh Kulkarni	2551cd4f9b	[tree-vectorizer.c] Merge pass_vectorize::execute with vectorize_loops and replace occurences of cfun with function param. gcc/ChangeLog: * tree-ssa-loop.c (pass_vectorize): Move to tree-vectorizer.c. (pass_data_vectorize): Likewise. (make_pass_vectorize): Likewise. * tree-vectorizer.c (vectorize_loops): Merge with pass_vectorize::execute and replace cfun occurences with fun param. (adjust_simduid_builtins): Add fun param, replace cfun occurences with fun, and adjust callers approrpiately. (note_simd_array_uses): Likewise. (vect_loop_dist_alias_call): Likewise. (set_uid_loop_bbs): Likewise. (vect_transform_loops): Likewise. (try_vectorize_loop_1): Likewise. (try_vectorize_loop): Likewise.	2021-11-15 15:37:36 +05:30
Rainer Orth	a91f844ef4	libffi: Use #define instead of .macro in src/x86/win64.S [PR102874] The libffi 3.4.2 import badly broke Solaris/x86 bootstrap with the native assembler: Assembler: "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 : Illegal mnemonic Near line: ".macro epilogue" "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 : Syntax error Near line: ".macro epilogue" "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 : Illegal mnemonic Near line: ".endm" "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 : Syntax error Near line: ".endm" "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 : Illegal mnemonic Near line: " epilogue" "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 : Syntax error Near line: "epilogue" Solaris as doesn't support .macro/.endm. Fixed by using #define instead of the unportable .macro. Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu. The bug has been reported upstream (https://github.com/libffi/libffi/issues/665); a corresponding pull request is also pending (https://github.com/libffi/libffi/pull/669). 2021-10-21 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> libffi: PR libffi/102874 * src/x86/win64.S (epilogue): Use #define instead of .macro.	2021-11-15 10:24:27 +01:00
Rainer Orth	a68933da01	testsuite: i386: Require dfp in gcc.target/i386/pr101346.c gcc.target/i386/pr101346.c currently FAILs on Solaris/x86: FAIL: gcc.target/i386/pr101346.c (test for excess errors) Excess errors: /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:6:1: error: decimal floating-point not supported for this target /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:7:6: error: decimal floating-point not supported for this target /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:9:12: warning: implicit declaration of function '__builtin_fabsd128'; did you mean '__builtin_fabsf128'? [-Wimplicit-function-declaration] Fixed by requiring dfp support. Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu. 2021-10-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.target/i386/pr101346.c: Require dfp support.	2021-11-15 10:00:14 +01:00
Jakub Jelinek	625eef42e3	i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205] With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop fail and so we ICE. We don't really care if they are done promoted in SImode instead. 2021-11-15 Jakub Jelinek <jakub@redhat.com> PR target/103205 * config/i386/sync.md (atomic_bit_test_and_set<mode>, atomic_bit_test_and_complement<mode>, atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of OPTAB_DIRECT. * gcc.target/i386/pr103205.c: New test.	2021-11-15 09:30:08 +01:00
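For context on the PR103205 entry, the sketch below shows the kind of source idiom that is assumed to reach the `atomic_bit_test_and_set<mode>` expander: an atomic fetch-or with a single-bit mask whose result is tested against the same mask. The function name `test_and_set_bit16` is hypothetical, and the actual testcase in `gcc.target/i386/pr103205.c` may look different. A 16-bit operand is used because the fix concerns HImode arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: atomically set bit 'bit' (0..15) in *word and
   report whether it was already set.  */
static bool test_and_set_bit16(uint16_t *word, unsigned bit)
{
    uint16_t mask = (uint16_t)(1u << bit);

    /* Fetch-or returns the old value; masking it yields the old bit.  */
    return (__atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}
```

The function behaves the same whether or not the compiler lowers it to a single bit-test instruction; the commit only changes how the operands are widened during expansion.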
Jakub Jelinek	9fa72756d9	libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound Here is a PTX implementation of what I was talking about, that for num_teams_upper 0 or whenever num_teams_lower <= num_blocks, the current implementation is fine but if the user explicitly asks for more teams than we can provide in hardware, we need to stop assuming that omp_get_team_num () is equal to the hw team id, but instead need to use some team specific memory (it is .shared for PTX), or if none is provided, array indexed by the hw team id and run some teams serially within the same hw thread. 2021-11-15 Jakub Jelinek <jakub@redhat.com> * config/nvptx/team.c (__gomp_team_num): Define as __attribute__((shared)) var. (gomp_nvptx_main): Initialize __gomp_team_num to 0. * config/nvptx/target.c (__gomp_team_num): Declare as extern __attribute__((shared)) var. (GOMP_teams4): Use __gomp_team_num as the team number instead of %ctaid.x. If first, initialize it to %ctaid.x. If num_teams_lower is bigger than num_blocks, use num_teams_lower teams and arrange for bumping of __gomp_team_num if !first and returning false once we run out of teams. * config/nvptx/teams.c (__gomp_team_num): Declare as extern __attribute__((shared)) var. (omp_get_team_num): Return __gomp_team_num value instead of %ctaid.x.	2021-11-15 09:20:52 +01:00
Jakub Jelinek	d294459720	libgomp: Add a testcase for omp_get_num_teams inside of target inside of host teams This is https://github.com/OpenMP/spec/issues/3183 There is an agreement that we should return 1 team inside of target, even if that target is inside of host teams. We were doing that when offloading and not during host fallback, r12-5151 should fix that even for host fallback. 2021-11-15 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/teams-5.c: New test.	2021-11-15 08:58:39 +01:00

189716 Commits