mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-11-25 03:44:04 +08:00

Author	SHA1	Message	Date
Tom de Vries	fc998c21c2	[omp, ftracer] Remove incorrect suggestion in ignore_bb_p In commit `ab3f4b27ab` "[omp, ftracer] Don't duplicate blocks in SIMT region" I added a comment in ignore_bb_p suggesting a reordering of SIMT_VOTE_ANY and SIMT_EXIT, which is not possible since VOTE_ANY may have data dependencies to storage that is deallocated by SIMT_EXIT. I've now opened a PR (PR97291) to describe the problem the reordering was intended to fix. Remove the incorrect suggestion. gcc/ChangeLog: 2020-10-05 Tom de Vries <tdevries@suse.de> * tracer.c (ignore_bb_p): Remove incorrect suggestion.	2020-10-05 14:19:22 +02:00
Mike Crowe	f33a43f9f7	libstdc++: Use correct duration for atomic_futex wait on custom clock [PR 91486] As Jonathan Wakely pointed out[1], my change in commit `f9ddb696a2` should have been rounding to the target clock duration type rather than the input clock duration type in __atomic_futex_unsigned::_M_load_when_equal_until just as (e.g.) condition_variable does. As well as fixing this, let's create a rather contrived test that fails with the previous code, but unfortunately only when run on a machine with an uptime of over 208.5 days, and even then not always. [1] https://gcc.gnu.org/pipermail/libstdc++/2020-September/051004.html libstdc++-v3/ChangeLog: PR libstdc++/91486 * include/bits/atomic_futex.h: (__atomic_futex_unsigned::_M_load_when_equal_until): Use target clock duration type when rounding. * testsuite/30_threads/async/async.cc (test_pr91486_wait_for): Rename from test_pr91486. (float_steady_clock): New class for test. (test_pr91486_wait_until): New test.	2020-10-05 11:32:10 +01:00
Mike Crowe	d5243c4626	libstdc++: Test C++11 implementation of std::chrono::__detail::ceil Commit `53ad6b1979` split the implementation of std::chrono::__detail::ceil so that when compiling for C++17 and later std::chrono::ceil is used but when compiling for earlier versions a separate implementation is used to comply with C++11's limited constexpr rules. Let's run the equivalent of the existing std::chrono::ceil test cases on std::chrono::__detail::ceil too to make sure that it doesn't get broken. libstdc++-v3/ChangeLog: * testsuite/20_util/duration_cast/rounding_c++11.cc: Copy rounding.cc and alter to support compilation for C++11 and to test std::chrono::__detail::ceil.	2020-10-05 11:09:03 +01:00
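The C++11 restriction mentioned in this commit can be pictured with a small sketch. The names `ceil_impl` and `ceil_cxx11` are hypothetical and are not the libstdc++ identifiers; the only point is that a helper consisting of a single `return` statement keeps the rounding within C++11's constexpr rules.

```cpp
#include <chrono>

// Hypothetical helper: `truncated` is the duration_cast result and `original`
// is the input. If truncation rounded down, step up by one tick of ToDur.
template <typename ToDur, typename Rep, typename Period>
constexpr ToDur ceil_impl(const ToDur& truncated,
                          const std::chrono::duration<Rep, Period>& original)
{
    return (truncated < original) ? (truncated + ToDur{1}) : truncated;
}

// Hypothetical C++11-friendly ceil: one return statement per function, so it
// remains usable in constant expressions when compiled as C++11.
template <typename ToDur, typename Rep, typename Period>
constexpr ToDur ceil_cxx11(const std::chrono::duration<Rep, Period>& d)
{
    return ceil_impl(std::chrono::duration_cast<ToDur>(d), d);
}
```

Because `duration_cast` truncates toward zero, only inputs that were rounded down need the extra tick; negative inputs are already rounded up by the truncation.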
Jonathan Wakely	b98d3cc566	libstdc++: Add missing bugzilla PR numbers to ChangeLog We missed these out of the git commit messages.	2020-10-05 10:46:25 +01:00
Jakub Jelinek	3c022a4c73	options: Save and restore opts_set for Optimization and Target options fallout > This breaks ia64: > > In file included from ./tm.h:23, > from ../../gcc/gencheck.c:23: > ./options.h:7816:40: error: ISO C++ forbids zero-size array 'explicit_mask' [-Werror=pedantic] > 7816 | unsigned HOST_WIDE_INT explicit_mask[0]; > | ^ > ./options.h:7816:26: error: zero-size array member 'cl_target_option::explicit_mask' not at end of 'struct cl_target_option' [-Werror=pedantic] > 7816 | unsigned HOST_WIDE_INT explicit_mask[0]; > | ^~~~~~~~~~~~~ > ./options.h:7812:16: note: in the definition of 'struct cl_target_option' > 7812 | struct GTY(()) cl_target_option > | ^~~~~~~~~~~~~~~~ Oops, sorry. The following patch should fix that and should also fix streaming of the new explicit_mask_* members. 2020-10-05 Jakub Jelinek <jakub@redhat.com> * opth-gen.awk: Don't emit explicit_mask array if n_target_explicit is equal to n_target_explicit_mask. * optc-save-gen.awk: Compute has_target_explicit_mask and if false, don't emit code iterating over explicit_mask array elements. Stream also explicit_mask_* target members.	2020-10-05 09:34:42 +02:00
Jakub Jelinek	21f65995e0	store-merging: Fix up -Wnarrowing warning I've noticed a -Wnarrowing warning on gimple-ssa-store-merging.c, this change fixes that up. 2020-10-05 Jakub Jelinek <jakub@redhat.com> * gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Use ~0U instead of ~0 in unsigned int array initializer.	2020-10-05 09:09:41 +02:00
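A minimal sketch of the kind of initializer this commit touches. The array name `all_ones_mask` is hypothetical and the snippet is not taken from gimple-ssa-store-merging.c; it only illustrates why `~0U` avoids the diagnostic.

```cpp
#include <climits>

// `~0` has type int and (on two's-complement targets) the value -1. Converting
// it to unsigned int inside a braced initializer is a narrowing conversion,
// which is what -Wnarrowing reports:
// constexpr unsigned int bad[] = { ~0 };   // diagnosed as narrowing

// `~0U` is already an unsigned int with every bit set, so no conversion occurs.
constexpr unsigned int all_ones_mask[] = { ~0U, 0U };
```

Both spellings produce the same bit pattern; the difference is only in the type of the expression before it reaches the initializer.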
Tom de Vries	ab3f4b27ab	[omp, ftracer] Don't duplicate blocks in SIMT region When running the libgomp testsuite on x86_64-linux with nvptx accelerator on the test-case included in this patch, we run into: ... FAIL: libgomp.fortran/pr95654.f90 -O3 -fomit-frame-pointer -funroll-loops \ -fpeel-loops -ftracer -finline-functions execution test ... The test-case is a minimal version of this FAIL: ... FAIL: libgomp.fortran/pr66199-5.f90 -O3 -fomit-frame-pointer -funroll-loops \ -fpeel-loops -ftracer -finline-functions execution test ... but that one has stopped failing at commit `c2ebf4f10d` "openmp: Add support for non-rect simd and improve collapsed simd support". The problem is that ftracer duplicates a block containing GOMP_SIMT_VOTE_ANY. That is, before ftracer we have (dropping the GOMP_SIMT_ prefix): ... [CFG diagram, edges only: bb4(ENTER_ALLOC) -> bb8, bb5; bb8 -> bb5(VOTE_ANY); bb5 -> bb6(EXIT), bb7; bb6 -> bb7(XCHG_IDX)] ... The XCHG_IDX internal-fn does inter-SIMT-lane communication, which for nvptx maps onto shfl, an operator which has the requirement that the warp executing the operator is convergent. The warp diverges at bb4, and reconverges at bb5, and does not diverge by going to bb7, so the shfl is indeed executed by a convergent warp. After ftracer, we have: ... [CFG diagram, edges only: bb4(ENTER_ALLOC) -> bb5(VOTE_ANY), bb8(VOTE_ANY); bb5 -> bb6(EXIT), bb7; bb8 -> bb6, bb7; bb6 -> bb7(XCHG_IDX)] ... The warp diverges again at bb5, but does not reconverge again before bb6, so the shfl is executed by a divergent warp, which causes the FAIL. Fix this by making ftracer ignore blocks containing ENTER_ALLOC, VOTE_ANY and EXIT, effectively treating the SIMT region conservatively. An argument can be made that the test needs to be added in a more generic place, like gimple_can_duplicate_bb_p or some such, and that ftracer then needs to use the generic test. 
But that's a discussion with a much broader scope, so I'm leaving that for another patch. Bootstrapped and reg-tested on x86_64-linux. Build on x86_64-linux with nvptx accelerator, tested with libgomp. gcc/ChangeLog: PR fortran/95654 * tracer.c (ignore_bb_p): Ignore GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_VOTE_ANY and GOMP_SIMT_EXIT. libgomp/ChangeLog: 2020-10-05 Tom de Vries <tdevries@suse.de> PR fortran/95654 * testsuite/libgomp.fortran/pr95654.f90: New test.	2020-10-05 08:53:11 +02:00
GCC Administrator	4347d36f93	Daily bump.	2020-10-05 00:16:18 +00:00
Harald Anlauf	35d2c6b6e8	PR fortran/97272 - Wrong answer from MAXLOC with character arg The optional KIND argument to the MINLOC/MAXLOC intrinsic must not be passed to the library function, as the kind conversion of the result is treated explicitly elsewhere. gcc/fortran/ChangeLog: PR fortran/97272 * trans-intrinsic.c (strip_kind_from_actual): Helper function for removal of KIND argument. (gfc_conv_intrinsic_minmaxloc): Ignore KIND argument here, as it is treated elsewhere. gcc/testsuite/ChangeLog: PR fortran/97272 * gfortran.dg/pr97272.f90: New test.	2020-10-04 20:24:29 +02:00
GCC Administrator	11bd94806d	Daily bump.	2020-10-04 00:16:21 +00:00
Clément Chigot	5af2a2d30d	aix: apply aix_malloc more narrowly. In recent Technology Levels of AIX 7.2, new "#ifdef __cplusplus" have been added. Thus, the aix_malloc fix was applied in wrong locations. This patch increases the context to avoid this. fixincludes/ChangeLog: 2020-10-03 Clément Chigot <clement.chigot@atos.net> * inclhack.def (aix_malloc): Add more context to select. * fixincl.x: Regenerate. * tests/base/malloc.h: Update expected results.	2020-10-03 23:48:40 +00:00
Jakub Jelinek	ce531b1412	options: Fix up opts_set saving/restoring for underlying vars of Mask/InverseMask options Seems I've missed that set_option has special treatment for CLVC_BIT_CLEAR/CLVC_BIT_SET. Which means I'll need to change the generic handling, so that for global_options_set elements mentioned in CLVC_BIT_* options are treated differently, instead of using the accumulated bitmasks they'll need to use their specific bitmask variables during the option saving/restoring. Here is a patch that implements that. 2020-10-03 Jakub Jelinek <jakub@redhat.com> * opth-gen.awk: For variables referenced in Mask and InverseMask, don't use the explicit_mask bitmask array, but add separate explicit_mask_* members with the same types as the variables. * optc-save-gen.awk: Save, restore, compare and hash the separate explicit_mask_* members.	2020-10-03 21:22:03 +02:00
Jan Hubicka	a1f77106ec	Add gcc.dg/tree-ssa/modref-3.c testcase * gcc.dg/tree-ssa/modref-3.c: New test.	2020-10-03 17:20:54 +02:00
Jan Hubicka	c34db4b6f8	Track access ranges in ipa-modref This patch implements tracking of access ranges. This is only applied when the base pointer is an argument. Incrementally I will extend it to also track TBAA basetype so we can disambiguate ranges for accesses to the same basetype (which makes it quite a bit more effective). For this reason I track the access offset separately from the parameter offset (the second tracks combined adjustments to the parameter). This is, I think, the last feature I would like to add to the memory access summary this stage1. Further work will be needed to optimize the summary and merge adjacent ranges/make collapsing more intelligent (so we do not lose track that often), but I wanted to keep the basic patch simple. According to the cc1plus stats: Alias oracle query stats: refs_may_alias_p: 64108082 disambiguations, 74386675 queries ref_maybe_used_by_call_p: 142319 disambiguations, 65004781 queries call_may_clobber_ref_p: 23587 disambiguations, 29420 queries nonoverlapping_component_refs_p: 0 disambiguations, 38117 queries nonoverlapping_refs_since_match_p: 19489 disambiguations, 55748 must overlaps, 76044 queries aliasing_component_refs_p: 54763 disambiguations, 755876 queries TBAA oracle: 24184658 disambiguations 56823187 queries 16260329 are in alias set 0 10617146 queries asked about the same object 125 queries asked about the same alias set 0 access volatile 3960555 are dependent in the DAG 1800374 are aritificially in conflict with void * Modref stats: modref use: 10656 disambiguations, 47037 queries modref clobber: 1473322 disambiguations, 1961464 queries 5027242 tbaa queries (2.563005 per modref query) 649087 base compares (0.330920 per modref query) PTA query stats: pt_solution_includes: 977385 disambiguations, 13609749 queries pt_solutions_intersect: 1032703 disambiguations, 13187507 queries Which should still compare with https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554930.html there is about 2% more load disambiguations and 3.6% 
more store that is not great, but the TBAA part helps noticeably more and also this should help with -fno-strict-aliasing. I plan to work on improving param tracking too. Bootstrapped/regtested x86_64-linux with the other changes, OK? 2020-10-02 Jan Hubicka <hubicka@ucw.cz> * ipa-modref-tree.c (test_insert_search_collapse): Update handling of accesses. (test_merge): Likewise. * ipa-modref-tree.h (struct modref_access_node): Add offset, size, max_size, parm_offset and parm_offset_known. (modref_access_node::useful_p): Constify. (modref_access_node::range_info_useful_p): New predicate. (modref_access_node::operator==): New. (struct modref_parm_map): New structure. (modref_tree::merge): Update for tracking parameters. * ipa-modref.c (dump_access): Dump new fields. (get_access): Fill in new fields. (merge_call_side_effects): Update handling of parm map. (write_modref_records): Stream new fields. (read_modref_records): Stream new fields. (compute_parm_map): Update for new parm map. (ipa_merge_modref_summary_after_inlining): Update. (modref_propagate_in_scc): Update. * tree-ssa-alias.c (modref_may_conflict): Handle known ranges.	2020-10-03 17:20:16 +02:00
H.J. Lu	8510e3301b	doc: Replace roudnevenl with roundevenl PR other/97280 * doc/extend.texi: Replace roudnevenl with roundevenl	2020-10-03 07:20:48 -07:00
GCC Administrator	b0b9b8f02a	Daily bump.	2020-10-03 00:16:25 +00:00
David Edelsohn	9885183c08	rs6000: clean up headers in rs6000.c and rs6000-call.c When Andrew Macleod investigated the recent rs6000 bootstrap failure, he suggested a clean up of the headers in rs6000.c and rs6000-call.c. It now is recommended to include ssa.h instead of the individual headers. This also ensures that value-range.h is included and in the correct order so that the tree-ssa-propagate.h inclusion of value-query.h and its dependencies are satisfied. Bootstrapped on powerpc-ibm-aix7.2.0.0 and powerpc64le-linux. gcc/ChangeLog: 2020-10-02 David Edelsohn <dje.gcc@gmail.com> Andrew MacLeod <amacleod@redhat.com> * config/rs6000/rs6000.c: Include ssa.h. Reorder some headers. * config/rs6000/rs6000-call.c: Same.	2020-10-02 18:52:43 -04:00
Marek Polacek	47f09ec971	c++: Fix printing of C++20 template parameter object [PR97014] No one is interested in the mangled name of the C++20 template parameter object for a class NTTP. So instead of printing required for the satisfaction of ‘positive<T::ratio>’ [with T = X<::_ZTAXtl5ratioLin1ELi2EEE>] let's print required for the satisfaction of ‘positive<T::ratio>’ [with T = X<{-1, 2}>] I don't think adding a test is necessary for this. gcc/cp/ChangeLog: PR c++/97014 * cxx-pretty-print.c (pp_cxx_template_argument_list): If the argument is template_parm_object_p, print its DECL_INITIAL.	2020-10-02 18:48:39 -04:00
Jonathan Wakely	324118378e	libstdc++: Change test to work without 64-bit atomics This fixes a linker error for older ARM cores without 64-bit atomics. I think the { dg-add-options libatomic } is no longer needed, but it's harmless to keep it there. libstdc++-v3/ChangeLog: * testsuite/29_atomics/atomic_float/value_init.cc: Use float instead of double so that __atomic_load_8 isn't needed.	2020-10-02 22:18:51 +01:00
Jonathan Wakely	1ad08b64ce	libstdc++: Fix testcase by using terminate handler This test was supposed to verify that when __libc_single_threaded is available we successfully detect recursive static initialization even when linked to libpthread. But I forgot that when recursive init is detected, we terminate, and so the test fails. This adds a terminate handler that exits cleanly, so the test passes when recursive init is detected. libstdc++-v3/ChangeLog: * testsuite/18_support/96817.cc: Use terminate handler that calls _Exit(0).	2020-10-02 22:18:51 +01:00
Nathan Sidwell	679dbc9dce	c++: Kill DECL_ANTICIPATED Here's the patch to remove DECL_ANTICIPATED, and with it hiddenness is managed entirely in the symbol table. Sadly I couldn't get rid of the actual field without more investigation -- it's repurposed for OMP_PRIVATIZED_MEMBER. It looks like the VAR-related flags in lang_decl_base are not completely orthogonal, so perhaps some can be turned into an enumeration or something. But that's more than I want to do right now. DECL_FRIEND_P is still slightly suspect as it appears to mean more than just in-class definition. However, I'm leaving that for now. gcc/cp/ * cp-tree.h (lang_decl_base): anticipated_p is not used for anticipatedness. (DECL_ANTICIPATED): Delete. * decl.c (duplicate_decls): Delete DECL_ANTICIPATED_management, use was_hidden. (cxx_builtin_function): Drop DECL_ANTICIPATED setting. (xref_tag_1): Drop DECL_ANTICIPATED assert. * name-lookup.c (name_lookup::adl_class_only): Drop DECL_ANTICIPATED check. (name_lookup::search_adl): Always dedup. (anticipated_builtin_p): Reimplement. (do_pushdecl): Drop DECL_ANTICIPATED asserts & update. (lookup_elaborated_type_1): Drop DECL_ANTICIPATED update. (do_pushtag): Drop DECL_ANTICIPATED setting. * pt.c (push_template_decl): Likewise. (tsubst_friend_class): Likewise. libcc1/ * libcp1plugin.cc (libcp1plugin.cc): Drop DECL_ANTICIPATED test.	2020-10-02 12:21:08 -07:00
Nathan Sidwell	7ee1c0413e	c++: Hash table iteration for namespace-member spelling suggestions For 'no such binding' errors, we iterate over binding levels to find a close match. At the namespace level we were using DECL_ANTICIPATED to skip undeclared builtins. But (a) there are other unnameable things there and (b) decl-anticipated is about to go away. This changes the namespace scanning to iterate over the hash table, and look at non-hidden bindings. This does mean we look at fewer strings (hurrah), but the order we meet them is somewhat 'random'. Our distance measure is not very fine grained, and a couple of testcases change their suggestion. I notice for the c/c++ common one, we now match the output of the C compiler. For the other one we think 'int' and 'int64_t' have the same distance from 'int64', and now meet the former first. That's a little unfortunate. If it's too problematic I suppose we could sort the strings via an intermediate array before measuring distance. gcc/cp/ * name-lookup.c (consider_decl): New, broken out of ... (consider_binding_level): ... here. Iterate the hash table for namespace bindings. gcc/testsuite/ * c-c++-common/spellcheck-reserved.c: Adjust diagnostic. * g++.dg/spellcheck-typenames.C: Adjust diagnostic.	2020-10-02 11:22:42 -07:00
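The tie between 'int' and 'int64_t' can be reproduced with a plain Levenshtein distance. This is a sketch under an assumption: the compiler's own distance measure may differ in its details, and the names `edit_distance`, `str_len`, `min3` and `kMaxLen` are hypothetical. It needs C++14 or later because of the loops inside constexpr functions.

```cpp
#include <cstddef>

// Hypothetical cap on the length of the second string (size of the DP row).
constexpr std::size_t kMaxLen = 32;

constexpr std::size_t str_len(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

constexpr std::size_t min3(std::size_t a, std::size_t b, std::size_t c)
{
    return a < b ? (a < c ? a : c) : (b < c ? b : c);
}

// Plain Levenshtein distance (insert, delete, substitute all cost 1), computed
// with a single rolling row. Precondition: str_len(b) <= kMaxLen.
constexpr std::size_t edit_distance(const char* a, const char* b)
{
    const std::size_t la = str_len(a);
    const std::size_t lb = str_len(b);
    std::size_t row[kMaxLen + 1] = {};
    for (std::size_t j = 0; j <= lb; ++j)
        row[j] = j;                          // distance from empty prefix of a
    for (std::size_t i = 1; i <= la; ++i) {
        std::size_t diag = row[0];           // value of D[i-1][j-1]
        row[0] = i;
        for (std::size_t j = 1; j <= lb; ++j) {
            const std::size_t up = row[j];   // value of D[i-1][j]
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = min3(up + 1, row[j - 1] + 1, diag + cost);
            diag = up;
        }
    }
    return row[lb];
}
```

Under this measure, 'int64' is two deletions away from 'int' and two insertions away from 'int64_t', so whichever candidate is met first wins the tie.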
Nathan Sidwell	9340d1c97b	c++: cleanup ctor_omit_inherited_parms [PR97268] ctor_omit_inherited_parms was being somewhat abused. What I'd missed is that it checks for a base-dtor name, before proceeding with the check. But we ended up passing it that during cloning before we'd completed the cloning. It was also using DECL_ORIGIN to get to the in-charge ctor, but we sometimes zap DECL_ABSTRACT_ORIGIN, and it ends up processing the incoming function -- which happens to work. So this breaks out a predicate that expects to get the in-charge ctor, and will tell you whether its base ctor will need to omit the parms. We call that directly during cloning. Then the original fn is essentially just a wrapper, but uses DECL_CLONED_FUNCTION to get to the in-charge ctor. That uncovered abuse in add_method, which was happily passing TEMPLATE_DECLs to it. Let's not do that. add_method itself contained a loop mostly containing an 'if (nomatch) continue' idiom, except for a final 'if (match) {...}' check, which itself contained instances of the former idiom. I refactored that to use the former idiom throughout. In doing that I found a place where we'd issue an error, but then not actually reject the new member. gcc/cp/ * cp-tree.h (base_ctor_omit_inherited_parms): Declare. * class.c (add_method): Refactor main loop, only pass fns to ctor_omit_inherited_parms. (build_cdtor_clones): Rename bool parms. (clone_cdtor): Call base_ctor_omit_inherited_parms. * method.c (base_ctor_omit_inherited_parms): New, broken out of ... (ctor_omit_inherited_parms): ... here, call it with DECL_CLONED_FUNCTION. gcc/testsuite/ * g++.dg/inherit/pr97268.C: New.	2020-10-02 09:55:45 -07:00
Martin Jambor	3158482466	ipa-cp: Separate and increase the large-unit parameter A previous patch in the series has taught IPA-CP to identify the important cloning opportunities in 548.exchange2_r as worthwhile on their own, but the optimization is still prevented from taking place because of the overall unit-growth limit. This patch raises that limit so that it takes place and the benchmark runs 30% faster (on AMD Zen2 CPU at least). Before this patch, IPA-CP uses the following formulae to arrive at the overall_size limit: base = MAX(orig_size, param_large_unit_insns) unit_growth_limit = base + base * param_ipa_cp_unit_growth / 100 where param_ipa_cp_unit_growth has default value 10 and param_large_unit_insns has default value 10000. The problem with exchange2 (at least on zen2 but I have had a quick look on aarch64 too) is that the original estimated unit size is 10513 and so param_large_unit_insns does not apply and the default limit is therefore 11564 which is good enough only for one of the ideal 8 clonings, we need the limit to be at least 16291. I would like to raise param_ipa_cp_unit_growth a little bit more soon too, but most certainly not to 55. Therefore, the large_unit must be increased. In this patch, I decided to decouple the inlining and ipa-cp large-unit parameters. It also makes sense because IPA-CP uses it only at -O3 while inlining also at -O2 (IIUC). But if we agree we can try raising param_large_unit_insns to 13-14 thousand "instructions," perhaps it is not necessary. But then again, it may make sense to actually increase the IPA-CP limit further. I plan to experiment with IPA-CP tuning on a larger set of programs. Meanwhile, mainly to address the 548.exchange2_r regression, I'm suggesting this simple change. gcc/ChangeLog: 2020-09-07 Martin Jambor <mjambor@suse.cz> * params.opt (ipa-cp-large-unit-insns): New parameter. * ipa-cp.c (get_max_overall_size): Use the new parameter.	2020-10-02 18:41:35 +02:00
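The numbers in this message can be checked with a stand-alone restatement of the formula. The function below is a hypothetical sketch, not the code in ipa-cp.c; its parameter names simply mirror the message, and integer division is an assumption.

```cpp
// Hypothetical helper, written as a single return so it also builds as C++11.
constexpr long max_of(long a, long b)
{
    return a > b ? a : b;
}

// base              = MAX(orig_size, large_unit_insns)
// unit_growth_limit = base + base * unit_growth_percent / 100
// Integer division is assumed here; the message does not spell it out.
constexpr long unit_growth_limit(long orig_size, long large_unit_insns,
                                 long unit_growth_percent)
{
    return max_of(orig_size, large_unit_insns)
         + max_of(orig_size, large_unit_insns) * unit_growth_percent / 100;
}
```

With the defaults quoted above, `unit_growth_limit(10513, 10000, 10)` gives 11564. Keeping the base at 10513, a growth of 54 yields 16190 and a growth of 55 yields 16295, which is where the "55" in the message comes from. Under the same formula, a base of 14810 is the smallest that reaches 16291 at 10% growth.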
Martin Jambor	91153e0af9	ipa-cp: Add dumping of overall_size after cloning When experimenting with IPA-CP parameters, especially when looking into exchange2_r, it has been very useful to know what the value of overall_size is at different stages of the decision process. This patch therefore adds it to the generated dumps. gcc/ChangeLog: 2020-09-07 Martin Jambor <mjambor@suse.cz> * ipa-cp.c (estimate_local_effects): Add overall_size to dumped string. (decide_about_value): Add dumping new overall_size.	2020-10-02 18:41:35 +02:00
Martin Jambor	67ce9099bc	ipa: Multiple predicates for loop properties, with frequencies This patch enhances the ability of IPA to reason under what conditions loops in a function have known iteration counts or strides because it replaces single predicates which currently hold conjunction of predicates for all loops with vectors capable of holding multiple predicates, each with a cumulative frequency of loops with the property. This second property is then used by IPA-CP to much more aggressively boost its heuristic score for cloning opportunities which make iteration counts or strides of frequent loops compile time constant. gcc/ChangeLog: 2020-09-03 Martin Jambor <mjambor@suse.cz> * ipa-fnsummary.h (ipa_freqcounting_predicate): New type. (ipa_fn_summary): Change the type of loop_iterations and loop_strides to vectors of ipa_freqcounting_predicate. (ipa_fn_summary::ipa_fn_summary): Construct the new vectors. (ipa_call_estimates): New fields loops_with_known_iterations and loops_with_known_strides. * ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus with the expected frequencies of loops with known iteration count or stride. * ipa-fnsummary.c (add_freqcounting_predicate): New function. (ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of just two predicates. (remap_hint_predicate_after_duplication): Replace with function remap_freqcounting_preds_after_dup. (ipa_fn_summary_t::duplicate): Use it or duplicate new vectors. (ipa_dump_fn_summary): Dump the new vectors. (analyze_function_body): Compute the loop property vectors. (ipa_call_context::estimate_size_and_time): Calculate also loops_with_known_iterations and loops_with_known_strides. Adjusted dumping accordingly. (remap_hint_predicate): Replace with function remap_freqcounting_predicate. (ipa_merge_fn_summary_after_inlining): Use it. (inline_read_section): Stream loopcounting vectors instead of two simple predicates. (ipa_fn_summary_write): Likewise. 
* params.opt (ipa-max-loop-predicates): New parameter. * doc/invoke.texi (ipa-max-loop-predicates): Document new param. gcc/testsuite/ChangeLog: 2020-09-03 Martin Jambor <mjambor@suse.cz> * gcc.dg/ipa/ipcp-loophint-1.c: New test.	2020-10-02 18:41:35 +02:00
Martin Jambor	1e7fdc02cb	ipa: Bundle estimates of ipa_call_context::estimate_size_and_time A subsequent patch adds another two estimates that the code in ipa_call_context::estimate_size_and_time computes, and the fact that the function has a special output parameter for each thing it computes would make it have just too many. Therefore, this patch collapses all those output parameters into one output structure. gcc/ChangeLog: 2020-09-02 Martin Jambor <mjambor@suse.cz> * ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use ipa_call_estimates. (do_estimate_edge_size): Likewise. (do_estimate_edge_hints): Likewise. * ipa-fnsummary.h (struct ipa_call_estimates): New type. (ipa_call_context::estimate_size_and_time): Adjusted declaration. (estimate_ipcp_clone_size_and_time): Likewise. * ipa-cp.c (hint_time_bonus): Changed the type of the second argument to ipa_call_estimates. (perform_estimation_of_a_value): Adjusted to use ipa_call_estimates. (estimate_local_effects): Likewise. * ipa-fnsummary.c (ipa_call_context::estimate_size_and_time): Adjusted to return estimates in a single ipa_call_estimates parameter. (estimate_ipcp_clone_size_and_time): Likewise.	2020-10-02 18:41:34 +02:00
Martin Jambor	7d2cb2755a	ipa: Introduce ipa_cached_call_context Hi, as we discussed with Honza on the mailing list last week, making cached call context structure distinct from the normal one may make it clearer that the cached data need to be explicitly deallocated. This patch does that division. It is not mandatory for the overall main goals of the patch set and can be dropped if deemed superfluous. gcc/ChangeLog: 2020-09-02 Martin Jambor <mjambor@suse.cz> * ipa-fnsummary.h (ipa_cached_call_context): New forward declaration and class. (class ipa_call_context): Make friend ipa_cached_call_context. Moved methods duplicate_from and release to it too. * ipa-fnsummary.c (ipa_call_context::duplicate_from): Moved to class ipa_cached_call_context. (ipa_call_context::release): Likewise, removed the parameter. * ipa-inline-analysis.c (node_context_cache_entry): Change the type of ctx to ipa_cached_call_context. (do_estimate_edge_time): Remove parameter from the call to ipa_cached_call_context::release.	2020-10-02 18:41:34 +02:00
Martin Jambor	9d5af1db2d	ipa: Bundle vectors describing argument values Hi, this large patch is mostly mechanical change which aims to replace uses of separate vectors about known scalar values (usually called known_vals or known_csts), known aggregate values (known_aggs), known virtual call contexts (known_contexts) and known value ranges (known_value_ranges) with uses of either new type ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply contain these vectors inside them. The need for two distinct types comes from the fact that when the vectors are constructed from jump functions or lattices, we really should use auto_vecs with embedded storage allocated on stack. On the other hand, the bundle in ipa_call_context can be allocated on heap when in cache, one time for each call_graph node. ipa_call_context is constructible from ipa_auto_call_arg_values but then its vectors must not be resized, otherwise the vectors will stop pointing to the stack ones. Unfortunately, I don't think the structure embedded in ipa_call_context can be made constant because we need to manipulate and deallocate it when in cache. gcc/ChangeLog: 2020-09-01 Martin Jambor <mjambor@suse.cz> * ipa-prop.h (ipa_auto_call_arg_values): New type. (class ipa_call_arg_values): Likewise. (ipa_get_indirect_edge_target): Replaced vector arguments with ipa_call_arg_values in declaration. Added an overload for ipa_auto_call_arg_values. * ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals, m_known_contexts, m_known_aggs, duplicate_from, release and equal_to, new members m_avals, store_to_cache and equivalent_to_p. Adjusted constructor arguments. (estimate_ipcp_clone_size_and_time): Replaced vector arguments with ipa_auto_call_arg_values in declaration. (evaluate_properties_for_edge): Likewise. * ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on ipa_call_arg_values rather than on separate vectors. Added an overload for ipa_auto_call_arg_values. 
(devirtualization_time_bonus): Adjusted to work on ipa_auto_call_arg_values rather than on separate vectors. (gather_context_independent_values): Adjusted to work on ipa_auto_call_arg_values rather than on separate vectors. (perform_estimation_of_a_value): Likewise. (estimate_local_effects): Likewise. (modify_known_vectors_with_val): Adjusted both variants to work on ipa_auto_call_arg_values and rename them to copy_known_vectors_add_val. (decide_about_value): Adjusted to work on ipa_call_arg_values rather than on separate vectors. (decide_whether_version_node): Likewise. * ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise. (evaluate_properties_for_edge): Likewise. (ipa_fn_summary_t::duplicate): Likewise. (estimate_edge_devirt_benefit): Adjusted to work on ipa_call_arg_values rather than on separate vectors. (estimate_edge_size_and_time): Likewise. (estimate_calls_size_and_time_1): Likewise. (summarize_calls_size_and_time): Adjusted calls to estimate_edge_size_and_time. (estimate_calls_size_and_time): Adjusted to work on ipa_call_arg_values rather than on separate vectors. (ipa_call_context::ipa_call_context): Construct from a pointer to ipa_auto_call_arg_values instead of individual vectors. (ipa_call_context::duplicate_from): Adjusted to access vectors within m_avals. (ipa_call_context::release): Likewise. (ipa_call_context::equal_to): Likewise. (ipa_call_context::estimate_size_and_time): Adjusted to work on ipa_call_arg_values rather than on separate vectors. (estimate_ipcp_clone_size_and_time): Adjusted to work with ipa_auto_call_arg_values rather than on separate vectors. (ipa_merge_fn_summary_after_inlining): Likewise. Adjusted call to estimate_edge_size_and_time. (ipa_update_overall_fn_summary): Adjusted call to estimate_edge_size_and_time. * ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with ipa_auto_call_arg_values rather than with separate vectors. (do_estimate_edge_size): Likewise. (do_estimate_edge_hints): Likewise. 
* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values): New destructor.	2020-10-02 18:41:34 +02:00
Patrick Palka	080a23bce1	libstdc++: Add missing P0896 changes to <iterator> I noticed that the following changes from this paper were not yet implemented. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (reverse_iterator::iter_move): Define for C++20 as per P0896. (reverse_iterator::iter_swap): Likewise. (move_iterator::operator*): Apply P0896 changes for C++20. (move_iterator::operator[]): Likewise. * testsuite/24_iterators/reverse_iterator/cust.cc: New test.	2020-10-02 10:51:31 -04:00
Joe Ramsay	251950d899	arm: Remove coercion from scalar argument to vmin & vmax intrinsics This patch fixes an issue with vmin* and vmax* intrinsics which accept a scalar argument. Previously when the scalar was of different width to the vector elements this would generate __ARM_undef. This change allows the scalar argument to be implicitly converted to the correct width. Also tidied up the relevant unit tests, some of which would have passed even if only one of two or three intrinsic calls had compiled correctly. Bootstrapped and tested on arm-none-eabi, gcc and CMSIS_DSP testsuites are clean. OK for trunk? Thanks, Joe gcc/ChangeLog: 2020-08-10 Joe Ramsay <joe.ramsay@arm.com> * config/arm/arm_mve.h (__arm_vmaxnmavq): Remove coercion of scalar argument. (__arm_vmaxnmvq): Likewise. (__arm_vminnmavq): Likewise. (__arm_vminnmvq): Likewise. (__arm_vmaxnmavq_p): Likewise. (__arm_vmaxnmvq_p): Likewise (and delete duplicate definition). (__arm_vminnmavq_p): Likewise. (__arm_vminnmvq_p): Likewise. (__arm_vmaxavq): Likewise. (__arm_vmaxavq_p): Likewise. (__arm_vmaxvq): Likewise. (__arm_vmaxvq_p): Likewise. (__arm_vminavq): Likewise. (__arm_vminavq_p): Likewise. (__arm_vminvq): Likewise. (__arm_vminvq_p): Likewise. gcc/testsuite/ChangeLog: * gcc.target/arm/mve/intrinsics/vmaxavq_p_s16.c: Add test for mismatched width of scalar argument. * gcc.target/arm/mve/intrinsics/vmaxavq_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxavq_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxavq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxavq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxavq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmavq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmavq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmvq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmvq_f32.c: Likewise. 
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_p_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmaxvq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_p_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminavq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmavq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmavq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmavq_p_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmavq_p_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmvq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmvq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmvq_p_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminnmvq_p_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_p_u8.c: Likewise. 
* gcc.target/arm/mve/intrinsics/vminvq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vminvq_u8.c: Likewise.	2020-10-02 15:38:58 +01:00
Kyrylo Tkachov	c8c77ed747	AArch64: Add neoversev1_tunings struct This patch adds a Neoverse V1-specific tuning struct that currently is just a deduplication of the N1 struct it was using before and specifying the SVE width. This will allow us to tweak Neoverse V1 things in the future as needed. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ * config/aarch64/aarch64.c (neoversev1_tunings): Define. * config/aarch64/aarch64-cores.def (zeus): Use it. (neoverse-v1): Likewise.	2020-10-02 15:23:19 +01:00
Jan Hubicka	762cca0023	Perforate fnspec strings gcc/ChangeLog: 2020-10-02 Jan Hubicka <hubicka@ucw.cz> * attr-fnspec.h: Update documentation. (attr_fnspec::return_desc_size): Set to 2. (attr_fnspec::arg_desc_size): Set to 2. * builtin-attrs.def (STR1): Update fnspec. * internal-fn.def (UBSAN_NULL): Update fnspec. (UBSAN_VPTR): Update fnspec. (UBSAN_PTR): Update fnspec. (ASAN_CHECK): Update fnspec. (GOACC_DIM_SIZE): Remove fnspec. (GOACC_DIM_POS): Remove fnspec. * tree-ssa-alias.c (attr_fnspec::verify): Update verification. gcc/fortran/ChangeLog: 2020-10-02 Jan Hubicka <hubicka@ucw.cz> * trans-decl.c (gfc_build_library_function_decl_with_spec): Verify fnspec. (gfc_build_intrinsic_function_decls): Update fnspecs. (gfc_build_builtin_function_decls): Update fnspecs. * trans-io.c (gfc_build_io_library_fndecls): Update fnspecs. * trans-types.c (create_fn_spec): Update fnspecs.	2020-10-02 15:56:12 +02:00
Nathan Sidwell	1d3e12c469	c++: Simplify __FUNCTION__ creation I had reason to wander into cp_make_fname, and noticed it's the only caller of cp_fname_init. Folding it in makes the code simpler. gcc/cp/ * cp-tree.h (cp_fname_init): Delete declaration. * decl.c (cp_fname_init): Merge into only caller ... (cp_make_fname): ... here & refactor.	2020-10-02 05:01:17 -07:00
Jan Hubicka	05d39f0de9	Commonize handling of attr-fnspec * attr-fnspec.h: New file. * calls.c (decl_return_flags): Use attr_fnspec. * gimple.c (gimple_call_arg_flags): Use attr_fnspec. (gimple_call_return_flags): Use attr_fnspec. * tree-into-ssa.c (pass_build_ssa::execute): Use attr_fnspec. * tree-ssa-alias.c (attr_fnspec::verify): New member function.	2020-10-02 13:31:05 +02:00
Jan Hubicka	b8e773e992	Break out ao_ref_init_from_ptr_and_range from ao_ref_init_from_ptr_and_size * tree-ssa-alias.c (ao_ref_init_from_ptr_and_range): Break out from ... (ao_ref_init_from_ptr_and_size): ... here.	2020-10-02 13:14:57 +02:00
Jan Hubicka	8d1cede1bb	Add poly_int64 streaming support 2020-10-02 Jan Hubicka <hubicka@ucw.cz> * data-streamer-in.c (streamer_read_poly_int64): New function. * data-streamer-out.c (streamer_write_poly_int64): New function. * data-streamer.h (streamer_write_poly_int64): Declare. (streamer_read_poly_int64): Declare.	2020-10-02 13:01:01 +02:00
Richard Sandiford	0eb5e901f6	aarch64: Remove aarch64_sve_pred_dominates_p In r11-2922, Przemek fixed a post-RA instruction match failure caused by the SVE FP subtraction patterns. This patch applies the same fix to the other patterns. To recap, the issue is around the handling of predication. We want to do two things: - Optimise cases in which a predicate is known to be all-true. - Differentiate cases in which the predicate on an _x ACLE function has to be kept as-is from cases in which we can make more lanes active. The former is true by default, the latter is true for certain combinations of flags in the -ffast-math group. This is handled by a boolean flag in the unspecs to say whether the predicate is “strict” or “relaxed”. When combining multiple strict operations, the predicates used in the operations generally need to match. When combining multiple relaxed operations, we can ignore the predicates on nested operations and just use the predicate on the “outermost” operation. Originally I'd tried to reduce combinatorial explosion by using aarch64_sve_pred_dominates_p. This required matching predicates for strict operations but allowed more combinations for relaxed operations. The problem (as I should have remembered) is that C conditions on insn patterns can't reliably enforce matching operands. If the same register is used in two different input operands, the RA is allowed to use different hard registers for those input operands (and sometimes it has to). So operands that match before RA might not match afterwards. The only sure way to force a match is via match_dup. This patch splits the cases into two. I cry bitter tears at having to do this, but I think it's the only backportable fix. There might be some way of using define_subst to generate the cond_* patterns from the pred_* patterns, with some alternatives strategically disabled in each case, but that's future work and might not be an improvement.
Since so many patterns now do this, I moved the comments from the subtraction pattern to a new banner comment at the head of the file. gcc/ * config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p): Delete. * config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): Likewise. * config/aarch64/aarch64-sve.md: Add banner comment describing how merging predicated FP operations are represented. (cond_<SVE_COND_FP_UNARY:optab><mode>_2): Split into... (cond_<SVE_COND_FP_UNARY:optab><mode>_2_relaxed): ...this and... (cond_<SVE_COND_FP_UNARY:optab><mode>_2_strict): ...this. (cond_<SVE_COND_FP_UNARY:optab><mode>_any): Split into... (cond_<SVE_COND_FP_UNARY:optab><mode>_any_relaxed): ...this and... (cond_<SVE_COND_FP_UNARY:optab><mode>_any_strict): ...this. (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): Split into... (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_strict): ...this. (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Split into... (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_strict): ...this. (cond_<SVE_COND_FP_BINARY:optab><mode>_2): Split into... (cond_<SVE_COND_FP_BINARY:optab><mode>_2_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY:optab><mode>_2_strict): ...this. (cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const): Split into... (cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_strict): ...this. (cond_<SVE_COND_FP_BINARY:optab><mode>_3): Split into... (cond_<SVE_COND_FP_BINARY:optab><mode>_3_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY:optab><mode>_3_strict): ...this. (cond_<SVE_COND_FP_BINARY:optab><mode>_any): Split into... (cond_<SVE_COND_FP_BINARY:optab><mode>_any_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY:optab><mode>_any_strict): ...this. (cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const): Split into... 
(cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_relaxed): ...this and... (cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_strict): ...this. (cond_add<mode>_2_const): Split into... (cond_add<mode>_2_const_relaxed): ...this and... (cond_add<mode>_2_const_strict): ...this. (cond_add<mode>_any_const): Split into... (cond_add<mode>_any_const_relaxed): ...this and... (cond_add<mode>_any_const_strict): ...this. (cond_<SVE_COND_FCADD:optab><mode>_2): Split into... (cond_<SVE_COND_FCADD:optab><mode>_2_relaxed): ...this and... (cond_<SVE_COND_FCADD:optab><mode>_2_strict): ...this. (cond_<SVE_COND_FCADD:optab><mode>_any): Split into... (cond_<SVE_COND_FCADD:optab><mode>_any_relaxed): ...this and... (cond_<SVE_COND_FCADD:optab><mode>_any_strict): ...this. (cond_sub<mode>_3_const): Split into... (cond_sub<mode>_3_const_relaxed): ...this and... (cond_sub<mode>_3_const_strict): ...this. (aarch64_pred_abd<mode>): Split into... (aarch64_pred_abd<mode>_relaxed): ...this and... (aarch64_pred_abd<mode>_strict): ...this. (aarch64_cond_abd<mode>_2): Split into... (aarch64_cond_abd<mode>_2_relaxed): ...this and... (aarch64_cond_abd<mode>_2_strict): ...this. (aarch64_cond_abd<mode>_3): Split into... (aarch64_cond_abd<mode>_3_relaxed): ...this and... (aarch64_cond_abd<mode>_3_strict): ...this. (aarch64_cond_abd<mode>_any): Split into... (aarch64_cond_abd<mode>_any_relaxed): ...this and... (aarch64_cond_abd<mode>_any_strict): ...this. (cond_<SVE_COND_FP_TERNARY:optab><mode>_2): Split into... (cond_<SVE_COND_FP_TERNARY:optab><mode>_2_relaxed): ...this and... (cond_<SVE_COND_FP_TERNARY:optab><mode>_2_strict): ...this. (cond_<SVE_COND_FP_TERNARY:optab><mode>_4): Split into... (cond_<SVE_COND_FP_TERNARY:optab><mode>_4_relaxed): ...this and... (cond_<SVE_COND_FP_TERNARY:optab><mode>_4_strict): ...this. (cond_<SVE_COND_FP_TERNARY:optab><mode>_any): Split into... (cond_<SVE_COND_FP_TERNARY:optab><mode>_any_relaxed): ...this and... 
(cond_<SVE_COND_FP_TERNARY:optab><mode>_any_strict): ...this. (cond_<SVE_COND_FCMLA:optab><mode>_4): Split into... (cond_<SVE_COND_FCMLA:optab><mode>_4_relaxed): ...this and... (cond_<SVE_COND_FCMLA:optab><mode>_4_strict): ...this. (cond_<SVE_COND_FCMLA:optab><mode>_any): Split into... (cond_<SVE_COND_FCMLA:optab><mode>_any_relaxed): ...this and... (cond_<SVE_COND_FCMLA:optab><mode>_any_strict): ...this. (aarch64_pred_fac<cmp_op><mode>): Split into... (aarch64_pred_fac<cmp_op><mode>_relaxed): ...this and... (aarch64_pred_fac<cmp_op><mode>_strict): ...this. (cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>): Split into... (cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_relaxed): ...this and... (cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_strict): ...this. (cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>): Split into... (cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_relaxed): ...this and... (cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_strict): ...this. * config/aarch64/aarch64-sve2.md (cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>): Split into... (cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_relaxed): ...this and... (cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_strict): ...this. (cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any): Split into... (cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_relaxed): ...this and... (cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_strict): ...this. (cond_<SVE2_COND_INT_UNARY_FP:optab><mode>): Split into... (cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_relaxed): ...this and... (*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_strict): ...this.	2020-10-02 11:53:06 +01:00
Richard Sandiford	bb78e5876a	arm: Make more use of the new mode macros As Christophe pointed out, my r11-3522 patch didn't in fact fix all of the armv8_2-fp16-arith-2.c failures introduced by allowing FP16 vectorisation without -funsafe-math-optimizations. I must have only tested the final patch on my usual arm-linux-gnueabihf bootstrap, which it turns out treats the test as unsupported. The focus of the original patch was to use mode macros for patterns that are shared between Advanced SIMD, iwMMXt and MVE. This patch uses the mode macros for general neon.md patterns too. gcc/ * config/arm/neon.md (sub<VDQ:mode>3_neon): Use the new mode macros for the insn condition. (sub<VH:mode>3, mul<VDQW:mode>3_neon): Likewise. (mul<VDQW:mode>3add<VDQW:mode>_neon): Likewise. (mul<VH:mode>3add<VH:mode>_neon): Likewise. (mul<VDQW:mode>3neg<VDQW:mode>add<VDQW:mode>_neon): Likewise. (fma<VCVTF:mode>4, fma<VH:mode>4, fmsub<VCVTF:mode>4): Likewise. (quad_halves_<code>v4sf, reduc_plus_scal_<VD:mode>): Likewise. (reduc_plus_scal_<VQ:mode>, reduc_smin_scal_<VD:mode>): Likewise. (reduc_smin_scal_<VQ:mode>, reduc_smax_scal_<VD:mode>): Likewise. (reduc_smax_scal_<VQ:mode>, mul<VH:mode>3): Likewise. (neon_vabd<VF:mode>_2, neon_vabd<VF:mode>_3): Likewise. (fma<VH:mode>4_intrinsic): Delete. (neon_vadd<VCVTF:mode>): Use the new mode macros to decide which form of instruction to generate. (neon_vmla<VDQW:mode>, neon_vmls<VDQW:mode>): Likewise. (neon_vsub<VCVTF:mode>): Likewise. (neon_vfma<VH:mode>): Generate the main fma<mode>4 form instead of using fma<mode>4_intrinsic. gcc/testsuite/ gcc.target/arm/armv8_2-fp16-arith-2.c (float16_t): Use _Float16_t rather than __fp16. (float16x4_t, float16x4_t): Likewise. (fp16_abs): Use __builtin_fabsf16.	2020-10-02 11:53:05 +01:00
Alex Coplan	01c288035a	aarch64: ilp32 testsuite fixes This fixes test failures on ilp32 introduced in r11-3032-gd4febc75e8dfab23bd3132d5747eded918f85107. The assembler checks in extend-syntax.c simply needed adjusting for 32-bit pointers. It appears the subsp.c test has never passed on ILP32 due to a missed optimisation there. Since this isn't a code quality regression, disable that check on ILP32. gcc/testsuite/ChangeLog: * gcc.target/aarch64/extend-syntax.c: Fix assembler checks for ilp32, disable check-function-bodies on ilp32. * gcc.target/aarch64/subsp.c: Only check second scan-assembler on lp64 since the code on ilp32 is missing the optimization needed for this test to pass.	2020-10-02 11:17:00 +01:00
Martin Liska	f8dcbea5d2	GCOV: do not mangle .gcno files. gcc/ChangeLog: PR gcov-profile/97193 * coverage.c (coverage_init): GCDA note files should not be mangled and should end in output directory.	2020-10-02 12:10:03 +02:00
Tobias Burnus	2fe5a545e0	libgomp: Regenerate configure files with automake 1.15.1 libgomp/ChangeLog: * Makefile.in: Regenerate with automake 1.15.1. * aclocal.m4: Likewise. * configure: Likewise. * testsuite/Makefile.in: Likewise.	2020-10-02 12:08:47 +02:00
Jason Merrill	4f4ced2882	c++: Set CALL_FROM_NEW_OR_DELETE_P on more calls. We were failing to set the flag on a delete call in a new expression, in a deleting destructor, and in a coroutine. Fixed by setting it in the function that builds the call. 2020-10-02 Jason Merrill <jason@redhat.com> gcc/cp/ChangeLog: * call.c (build_operator_new_call): Set CALL_FROM_NEW_OR_DELETE_P. (build_op_delete_call): Likewise. * init.c (build_new_1, build_vec_delete_1, build_delete): Not here. (build_delete): gcc/ChangeLog: * gimple.h (gimple_call_operator_delete_p): Rename from gimple_call_replaceable_operator_delete_p. * gimple.c (gimple_call_operator_delete_p): Likewise. * tree.h (DECL_IS_REPLACEABLE_OPERATOR_DELETE_P): Remove. * tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Adjust. (propagate_necessity): Likewise. (eliminate_unnecessary_stmts): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. gcc/testsuite/ChangeLog: * g++.dg/pr94314.C: new/delete no longer omitted.	2020-10-02 11:22:20 +02:00
Richard Biener	0b945f959f	make use of CALL_FROM_NEW_OR_DELETE_P This fixes points-to analysis and DCE to only consider new/delete operator calls from new or delete expressions and not direct calls. 2020-10-01 Richard Biener <rguenther@suse.de> * gimple.h (GF_CALL_FROM_NEW_OR_DELETE): New call flag. (gimple_call_set_from_new_or_delete): New. (gimple_call_from_new_or_delete): Likewise. * gimple.c (gimple_build_call_from_tree): Set GF_CALL_FROM_NEW_OR_DELETE appropriately. * ipa-icf-gimple.c (func_checker::compare_gimple_call): Compare gimple_call_from_new_or_delete. * tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Make sure to only consider new/delete calls from new or delete expressions. (propagate_necessity): Likewise. (eliminate_unnecessary_stmts): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. * g++.dg/tree-ssa/pta-delete-1.C: New testcase.	2020-10-02 11:22:20 +02:00
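The distinction this commit relies on, between allocation through new/delete expressions and direct calls to the allocation functions, can be illustrated with a small sketch. The function names `via_expressions` and `via_direct_calls` are hypothetical and do not come from the GCC testsuite. Both functions compute the same value; the difference is only in how the allocation is spelled, which is what determines whether the flag described above would be set on the call.

```cpp
#include <new>

// Allocation through a new-expression and a delete-expression.
// These are the calls the commit message says DCE and points-to
// analysis should still consider.
int via_expressions() {
    int *p = new int(42);
    int v = *p;
    delete p;
    return v;
}

// Direct calls to the global allocation and deallocation functions.
// These are ordinary function calls as far as the source is concerned,
// and per the commit message they are no longer treated like the
// expression forms.
int via_direct_calls() {
    void *raw = ::operator new(sizeof(int));
    int *p = ::new (raw) int(42);  // placement new: constructs, does not allocate
    int v = *p;
    ::operator delete(raw);
    return v;
}
```

Calling either function returns 42. Whether a given compiler version actually removes the allocation in the first function depends on optimisation level and is not something this sketch asserts.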
Jason Merrill	b6158faacb	c++: Move CALL_FROM_NEW_OR_DELETE_P to tree.h As discussed with richi, we should be able to use TREE_PROTECTED for this flag, since CALL_FROM_THUNK_P will never be set on a call to an operator new or delete. 2020-10-01 Jason Merrill <jason@redhat.com> gcc/cp/ChangeLog: * lambda.c (call_from_lambda_thunk_p): New. * cp-gimplify.c (cp_genericize_r): Use it. * pt.c (tsubst_copy_and_build): Use it. * typeck.c (check_return_expr): Use it. * cp-tree.h: Declare it. (CALL_FROM_NEW_OR_DELETE_P): Move to gcc/tree.h. gcc/ChangeLog: * tree.h (CALL_FROM_NEW_OR_DELETE_P): Move from cp-tree.h. * tree-core.h: Document new usage of protected_flag.	2020-10-02 11:21:28 +02:00
Aldy Hernandez	6a0423c52e	Implement irange::fits_p. This should have been included in the irange_allocator patch, as a method to see if the current object can hold a passed range without truncation. gcc/ChangeLog: * value-range.h (irange::fits_p): New.	2020-10-02 10:36:17 +02:00
GCC Administrator	6c2675fa2b	Daily bump.	2020-10-02 00:16:27 +00:00
Ian Lance Taylor	3e52eaab8c	compiler: set varargs correctly for type of method expression Fixes golang/go#41737 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/258977	2020-10-01 16:10:17 -07:00
Alan Modra	4c69e61f43	[RS6000] ICE in decompose, at rtl.h:2282 during RTL pass: fwprop1 gcc.dg/pr82596.c: In function 'test_cststring': gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282 -m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after applying rtx_cost patches, which exposed a backend bug. legitimize_address when presented with the following address (plus (reg) (const_int 0x7fffffff)) attempts to rewrite it as a high/low sum. The low part is 0xffff, or -1, making the high part 0x80000000. But this is no longer canonical for SImode. * config/rs6000/rs6000.c (rs6000_legitimize_address): Use gen_int_mode for high part of address constant.	2020-10-02 08:36:25 +09:30
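The arithmetic behind this fix can be worked through in a few lines. This is a minimal sketch, not the rs6000 backend code: `sign_extend` and `split_high_low` are hypothetical helpers, and the offset is taken to be 0x7fffffff, the value for which the low part 0xffff and the high part 0x80000000 quoted in the message work out. It assumes that a canonical 32-bit constant is stored sign-extended, which is the property the message refers to for SImode.

```cpp
#include <cstdint>

// Hypothetical result type: an offset split so that high + low == offset.
struct HighLow {
    std::int64_t high;
    std::int64_t low;
};

// Sign-extend the low `bits` bits of `value` (0 < bits < 64).
constexpr std::int64_t sign_extend(std::int64_t value, unsigned bits) {
    const std::int64_t sign = std::int64_t{1} << (bits - 1);
    const std::int64_t mask = (std::int64_t{1} << bits) - 1;
    return ((value & mask) ^ sign) - sign;
}

// Split an offset into a sign-extended 16-bit low part and the remainder.
constexpr HighLow split_high_low(std::int64_t offset) {
    const std::int64_t low = sign_extend(offset, 16);
    return {offset - low, low};
}

// For 0x7fffffff the low 16 bits are 0xffff, which sign-extends to -1.
static_assert(split_high_low(0x7fffffff).low == -1, "low part is -1");
// The high part is therefore 0x7fffffff + 1 = 0x80000000, which does not
// fit in a signed 32-bit value as a positive number.
static_assert(split_high_low(0x7fffffff).high == 0x80000000LL, "raw high part");
// Sign-extending it from 32 bits gives the representable form.
static_assert(sign_extend(0x80000000LL, 32) == -2147483648LL, "canonical form");
```

The last assertion shows the step the raw subtraction skips: the high part has to be reduced to the 32-bit mode before it is used as a constant, which is the role the ChangeLog entry assigns to `gen_int_mode`.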
Alan Modra	d26cc5885a	[RS6000] rs6000_linux64_override_options fix Commit `c6be439b37` wrongly left a block of code inside an "else" block, which changed the default for power10 TARGET_NO_FP_IN_TOC accidentally. We don't want FP constants in the TOC when -mcmodel=medium can address them just as efficiently outside the TOC. * config/rs6000/rs6000.c (rs6000_linux64_override_options): Formatting. Correct setting of TARGET_NO_FP_IN_TOC and TARGET_NO_SUM_IN_TOC.	2020-10-02 08:13:44 +09:30

179956 Commits