mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-12-13 22:03:43 +08:00

Author	SHA1	Message	Date
Chung-Lin Tang	6c0399378e	OpenMP 5.0: Remove array section base-pointer mapping semantics and other front-end adjustments This patch implements three pieces of functionality: (1) Adjust array section mapping to have standards conforming behavior, mapping array sections should NOT also map the base-pointer: struct S { int ptr; ... }; struct S s; Instead of generating this during gimplify: map(to:_1 [len: 400]) map(attach:s.ptr [bias: 0]) Now, adjust to: (i.e. do not map the base-pointer together. The attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: (A small Fortran front-end patch to trans-openmp.c:gfc_trans_omp_array_section is also included, which removes generation of a GOMP_MAP_ALWAYS_POINTER for array types, which appears incorrect and causes a regression in libgomp.fortranlibgomp.fortran/struct-elem-map-1.f90) (2) Related to the first item above, are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. These changes are enabled for both OpenACC and OpenMP. gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. (cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree offsetp' parameter, accomodate case where 'offset' return of get_inner_reference is non-NULL. (is_or_contains_p): Further robustify conditions. (omp_target_reorder_clauses): In alloc/to/from sorting phase, also move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting phase where we make sure pointers with an attach/detach map are ordered correctly. (gimplify_scan_omp_clauses): Add modifications to avoid creating GOMP_MAP_STRUCT and associated alloc map for attach/detach maps. gcc/testsuite/ChangeLog: c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase. * c-c++-common/gomp/target-enter-data-1.c: New testcase. * c-c++-common/gomp/target-implicit-map-2.c: New testcase. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Make sure attached pointer is not overwritten during cross-host/device copying. (gomp_update): Likewise. (gomp_exit_data): Likewise. * testsuite/libgomp.c++/target-11.C: Adjust testcase. * testsuite/libgomp.c++/target-12.C: Likewise. * testsuite/libgomp.c++/target-15.C: Likewise. * testsuite/libgomp.c++/target-16.C: Likewise. * testsuite/libgomp.c++/target-17.C: Likewise. * testsuite/libgomp.c++/target-21.C: Likewise. * testsuite/libgomp.c++/target-23.C: Likewise. * testsuite/libgomp.c/target-23.c: Likewise. * testsuite/libgomp.c/target-29.c: Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.	2021-12-09 00:01:10 +08:00
Chung-Lin Tang	0ab29cf0bb	openmp: Improve OpenMP target support for C++ (PR92120) This patch implements several C++ specific mapping capabilities introduced for OpenMP 5.0, including implicit mapping of this[:1] for non-static member functions, zero-length array section mapping of pointer-typed members, lambda captured variable access in target regions, and use of lambda objects inside target regions. Several adjustments to the C/C++ front-ends to allow more member-access syntax as valid is also included. PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (finish_omp_target_clauses): Likewise. * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. (cp_parser_omp_target): Factor out code, adjust into calls to new function finish_omp_target. * pt.c (tsubst_expr): Add call to finish_omp_target_clauses for OMP_TARGET case. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. Remove case of rejecting 'this' when not in declare simd. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. Handle 'A->member' case in map clauses. Remove case of rejecting 'this' when not in declare simd. (struct omp_target_walk_data): New struct for walking over target-directive tree body. (finish_omp_target_clauses_r): New function for tree walk. (finish_omp_target_clauses): New function. (finish_omp_target): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in call to c_parser_omp_variable_list to 'true'. * c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in array base handling. (c_finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> . Adjust struct map handling to handle cases of A and A->B expressions. Under !DECL_P case of GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to struct_deref_set for map(ptr_to_struct) cases. Add MEM_REF case when handling component_ref_p case. Add unshare_expr and gimplification when created GOMP_MAP_STRUCT is not a DECL. Add code to add firstprivate pointer for pointer-to-struct case. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/gomp/target-3.c: New testcase. * g++.dg/gomp/target-3.C: New testcase. * g++.dg/gomp/target-lambda-1.C: New testcase. * g++.dg/gomp/target-lambda-2.C: New testcase. * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. * g++.dg/gomp/target-this-5.C: New testcase. * g++.dg/gomp/this-2.C: Adjust testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. (gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-23.C: New testcase. * testsuite/libgomp.c++/target-lambda-1.C: New testcase. * testsuite/libgomp.c++/target-lambda-2.C: New testcase. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase. * testsuite/libgomp.c++/target-this-5.C: New testcase.	2021-12-08 22:29:06 +08:00
GCC Administrator	70e4cb66c1	Daily bump.	2021-12-05 00:16:28 +00:00
Tobias Burnus	689407ef91	Fortran/OpenMP: Support most of 5.1 atomic extensions Implements moste of OpenMP 5.1 atomic extensions, except that 'compare' is parsed but rejected during resolution. (As the trans-openmp.c handling is missing.) gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle weak/compare/fail clause. * gfortran.h (gfc_omp_clauses): Add weak, compare, fail. * openmp.c (enum omp_mask1, gfc_match_omp_clauses, OMP_ATOMIC_CLAUSES): Update for new clauses. (gfc_match_omp_atomic): Update for 5.1 atomic changes. (is_conversion): Support widening in one go. (is_scalar_intrinsic_expr): New. (resolve_omp_atomic): Update for 5.1 atomic changes. * parse.c (parse_omp_oacc_atomic): Update for compare. * resolve.c (gfc_resolve_blocks): Update asserts. * trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/atomic-2.f90: Move now supported code to ... * gfortran.dg/gomp/atomic.f90: here. * gfortran.dg/gomp/atomic-10.f90: New test. * gfortran.dg/gomp/atomic-12.f90: New test. * gfortran.dg/gomp/atomic-15.f90: New test. * gfortran.dg/gomp/atomic-16.f90: New test. * gfortran.dg/gomp/atomic-17.f90: New test. * gfortran.dg/gomp/atomic-18.f90: New test. * gfortran.dg/gomp/atomic-19.f90: New test. * gfortran.dg/gomp/atomic-20.f90: New test. * gfortran.dg/gomp/atomic-22.f90: New test. * gfortran.dg/gomp/atomic-24.f90: New test. * gfortran.dg/gomp/atomic-25.f90: New test. * gfortran.dg/gomp/atomic-26.f90: New test. libgomp/ChangeLog * libgomp.texi (OpenMP 5.1): Update status.	2021-12-04 19:43:46 +01:00
Tobias Burnus	b09af56214	libgomp.texi: Update OMP_PLACES libgomp/ChangeLog: * libgomp.texi (OMP_PLACES): Extend description for OMP 5.1 changes.	2021-12-04 13:28:03 +01:00
GCC Administrator	ea6ef320b0	Daily bump.	2021-12-03 00:17:04 +00:00
Chung-Lin Tang	1ac7a8c9e4	fortran: OpenMP/OpenACC array mapping alignment fix (PR90030) Fix issue with the Fortran front-end when mapping arrays: when creating the data MEM_REF for the map clause, there was a convention of casting the referencing pointer to 'c_char ' by fold_convert (build_pointer_type (char_type_node), ptr). This causes the alignment passed to the libgomp runtime for array data hardwared to '1', and causes alignment errors on the offload target. This patch fixes this by removing the char_type_node pointer converts, and adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)). PR fortran/90030 gcc/fortran/ChangeLog: trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer to char_type_node, add gcc_assert of POINTER_TYPE_P. (gfc_trans_omp_array_section): Likewise. (gfc_trans_omp_clauses): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/finalize-1.f: Adjust scan test. * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. * gfortran.dg/gomp/map-3.f90: Likewise. * gfortran.dg/gomp/pr78260-2.f90: Likewise. * gfortran.dg/gomp/pr78260-3.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. * testsuite/libgomp.fortran/pr90030.f90: New test.	2021-12-02 18:27:16 +08:00
GCC Administrator	c177e80609	Daily bump.	2021-12-01 00:17:04 +00:00
Kwok Cheung Yeung	f1a58ab0db	[OpenACC] Allow gang reductions inside serial constructs ... fixing a regression introduced in the preceding commit `2b7dac2c0d` "Make OpenACC orphan gang reductions errors". gcc/fortran/ * openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New. (resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of oacc_is_parallel. libgomp/ * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove temporary skip. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>	2021-11-30 12:58:53 +01:00
Cesar Philippidis	2b7dac2c0d	Make OpenACC orphan gang reductions errors This patch promotes all OpenACC gang reductions on orphan loops as errors. Accord to the spec, orphan loops are those which are not lexically nested inside an OpenACC parallel or kernels regions. I.e., acc loops inside acc routines. At first I thought this could be a warning because the gang reduction finalizer uses an atomic update. However, because there is no synchronization between gangs, there is way to guarantee that reduction will have completed once a single gang entity returns from the acc routine call. gcc/c/ * c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC gang reductions. gcc/cp/ * semantics.c (finish_omp_clauses): Emit an error on orphan OpenACC gang reductions. gcc/fortran/ * openmp.c (oacc_is_parallel, oacc_is_kernels): New 'static' functions. (resolve_oacc_loop_blocks): Emit an error on orphan OpenACC gang reductions. gcc/ * omp-general.h (enum oacc_loop_flags): Add OLF_REDUCTION enum. * omp-low.c (lower_oacc_head_mark): Use it to mark OpenACC reductions. * omp-offload.c (oacc_loop_auto_partitions): Don't assign gang level parallelism to orphan reductions. gcc/testsuite/ * c-c++-common/goacc/nested-reductions-1-routine.c: Adjust. * c-c++-common/goacc/nested-reductions-2-routine.c: Likewise. * gcc.dg/goacc/loop-processing-1.c: Likewise. * gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise. * c-c++-common/goacc/orphan-reductions-1.c: New test. * c-c++-common/goacc/orphan-reductions-2.c: New test. * gfortran.dg/goacc/orphan-reductions-1.f90: New test. * gfortran.dg/goacc/orphan-reductions-2.f90: New test. libgomp/ * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Temporarily skip. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>	2021-11-30 12:58:45 +01:00
GCC Administrator	87cd82c81d	Daily bump.	2021-11-30 00:16:44 +00:00
Richard Biener	16507dea75	Remove unreachable returns This removes unreachable return statements as diagnosed by the -Wunreachable-code patch. Some cases are more obviously an improvement than others - in fact some may get you the idea to replace them with gcc_unreachable () instead, leading to cases of the 'Remove unreachable gcc_unreachable () at the end of functions' patch. 2021-11-25 Richard Biener <rguenther@suse.de> * vec.c (qsort_chk): Do not return the void return value from the noreturn qsort_chk_error. * ccmp.c (expand_ccmp_expr_1): Remove unreachable return. * df-scan.c (df_ref_equal_p): Likewise. * dwarf2out.c (is_base_type): Likewise. (add_const_value_attribute): Likewise. * fixed-value.c (fixed_arithmetic): Likewise. * gimple-fold.c (gimple_fold_builtin_fputs): Likewise. * gimple-ssa-strength-reduction.c (stmt_cost): Likewise. * graphite-isl-ast-to-gimple.c (gcc_expression_from_isl_expr_op): Likewise. (gcc_expression_from_isl_expression): Likewise. * ipa-fnsummary.c (will_be_nonconstant_expr_predicate): Likewise. * lto-streamer-in.c (lto_input_mode_table): Likewise. gcc/c-family/ * c-opts.c (c_common_post_options): Remove unreachable return. * c-pragma.c (handle_pragma_target): Likewise. (handle_pragma_optimize): Likewise. gcc/c/ * c-typeck.c (c_tree_equal): Remove unreachable return. * c-parser.c (get_matching_symbol): Likewise. libgomp/ * oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): Remove unreachable return.	2021-11-29 11:17:22 +01:00
GCC Administrator	d9ca4b45bd	Daily bump.	2021-11-25 00:16:29 +00:00
Jakub Jelinek	5bca26742c	openmp: Fix up handling of kind(host) and kind(nohost) in ACCEL_COMPILERs [PR103384] As the testcase shows, we weren't handling kind(host) and kind(nohost) properly in the ACCEL_COMPILERs, the code written in there is valid for the host compiler only, where if we are maybe offloaded, we defer resolution after IPA, otherwise return 0 for kind(nohost) and accept it for kind(host). Note, omp_maybe_offloaded is false after IPA. If ACCEL_COMPILER is defined, it is the other way around, but also we know we are after IPA. 2021-11-24 Jakub Jelinek <jakub@redhat.com> PR middle-end/103384 gcc/ * omp-general.c (omp_context_selector_matches): For ACCEL_COMPILER, return 0 for kind(host) and continue for kind(nohost). libgomp/ * testsuite/libgomp.c/declare-variant-2.c: New test.	2021-11-24 10:30:32 +01:00
GCC Administrator	483092d3d9	Daily bump.	2021-11-19 00:16:34 +00:00
David Edelsohn	1b2b930152	Fix typo. libgomp/ChangeLog: * alloc.c (gomp_aligned_alloc): Fix typo.	2021-11-18 10:26:55 -05:00
Jakub Jelinek	17da2c7425	libgomp: Ensure that either gomp_team is properly aligned [PR102838] struct gomp_team has struct gomp_work_share array inside of it. If that latter structure has 64-byte aligned member in the middle, the whole struct gomp_team needs to be 64-byte aligned, but we weren't allocating it using gomp_aligned_alloc. This patch fixes that, except that on gcn team_malloc is special, so I've instead decided at least for now to avoid using aligned member and use the padding instead on gcn. 2021-11-18 Jakub Jelinek <jakub@redhat.com> PR libgomp/102838 * libgomp.h (GOMP_USE_ALIGNED_WORK_SHARES): Define if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined and __AMDGCN__ is not. (struct gomp_work_share): Use GOMP_USE_ALIGNED_WORK_SHARES instead of GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC. * work.c (alloc_work_share, gomp_work_share_start): Likewise. * team.c (gomp_new_team): If GOMP_USE_ALIGNED_WORK_SHARES, use gomp_aligned_alloc instead of team_malloc.	2021-11-18 09:10:40 +01:00
Jakub Jelinek	7a2aa63fad	libgomp: Fix up aligned_alloc arguments [PR102838] C says that aligned_alloc size must be an integral multiple of alignment. While glibc doesn't care about it, apparently Solaris does. So, this patch decreases the priority of aligned_alloc among the other variants because it needs more work and can waste more memory and rounds up the size to multiple of alignment. 2021-11-18 Jakub Jelinek <jakub@redhat.com> PR libgomp/102838 * alloc.c (gomp_aligned_alloc): Prefer _aligned_alloc over memalign over posix_memalign over aligned_alloc over fallback with malloc instead of aligned_alloc over _aligned_alloc over posix_memalign over memalign over fallback with malloc. For aligned_alloc, round up size up to multiple of al.	2021-11-18 09:07:31 +01:00
GCC Administrator	6b1695f4a0	Daily bump.	2021-11-17 00:16:29 +00:00
Jakub Jelinek	9ceaf0fee3	libgomp: Mark thread_limit clause to target construct as implemented After the Fortran changes we can mark it as implemented... 2021-11-16 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (OpenMP 5.1): Mark thread_limit clause to target construct as implemented.	2021-11-16 10:21:56 +01:00
GCC Administrator	e2b57363fc	Daily bump.	2021-11-16 00:16:31 +00:00
Tobias Burnus	82ec4cb3c4	Fortran: openmp: Add support for thread_limit clause on target gcc/fortran/ChangeLog: * openmp.c (OMP_TARGET_CLAUSES): Add thread_limit. * trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to teams. libgomp/ChangeLog: * testsuite/libgomp.fortran/thread-limit-1.f90: New test.	2021-11-15 15:44:11 +01:00
Jakub Jelinek	aea7238683	openmp: Add support for thread_limit clause on target OpenMP 5.1 says that thread_limit clause can also appear on target, and similarly to teams should affect the thread-limit-var ICV. On combined target teams, the clause goes to both. We actually passed thread_limit internally on target already before, but only used it for gcn/ptx offloading to hint how many threads should be created and for ptx didn't set thread_limit_var in that case. Similarly for host fallback. Also, I found that we weren't copying the args array that contains encoded thread_limit and num_teams clause for target (etc.) for async target. 2021-11-15 Jakub Jelinek <jakub@redhat.com> gcc/ * gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT to OMP_TARGET_CLAUSES if it isn't there already. gcc/c-family/ * c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>: Duplicate to both OMP_TARGET and OMP_TEAMS. gcc/c/ * c-parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. gcc/cp/ * parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. libgomp/ * task.c (gomp_create_target_task): Copy args array as well. * target.c (gomp_target_fallback): Add args argument. Set gomp_icv (true)->thread_limit_var if thread_limit is present. (GOMP_target): Adjust gomp_target_fallback caller. (GOMP_target_ext): Likewise. (gomp_target_task_fn): Likewise. * config/nvptx/team.c (gomp_nvptx_main): Set gomp_global_icv.thread_limit_var. * testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.	2021-11-15 13:20:53 +01:00
Jakub Jelinek	9fa72756d9	libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound Here is a PTX implementation of what I was talking about, that for num_teams_upper 0 or whenever num_teams_lower <= num_blocks, the current implementation is fine but if the user explicitly asks for more teams than we can provide in hardware, we need to stop assuming that omp_get_team_num () is equal to the hw team id, but instead need to use some team specific memory (it is .shared for PTX), or if none is provided, array indexed by the hw team id and run some teams serially within the same hw thread. 2021-11-15 Jakub Jelinek <jakub@redhat.com> * config/nvptx/team.c (__gomp_team_num): Define as __attribute__((shared)) var. (gomp_nvptx_main): Initialize __gomp_team_num to 0. * config/nvptx/target.c (__gomp_team_num): Declare as extern __attribute__((shared)) var. (GOMP_teams4): Use __gomp_team_num as the team number instead of %ctaid.x. If first, initialize it to %ctaid.x. If num_teams_lower is bigger than num_blocks, use num_teams_lower teams and arrange for bumping of __gomp_team_num if !first and returning false once we run out of teams. * config/nvptx/teams.c (__gomp_team_num): Declare as extern __attribute__((shared)) var. (omp_get_team_num): Return __gomp_team_num value instead of %ctaid.x.	2021-11-15 09:20:52 +01:00
Jakub Jelinek	d294459720	libgomp: Add a testcase for omp_get_num_teams inside of target inside of host teams This is https://github.com/OpenMP/spec/issues/3183 There is an agreement that we should return 1 team inside of target, even if that target is inside of host teams. We were doing that when offloading and not during host fallback, r12-5151 should fix that even for host fallback. 2021-11-15 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/teams-5.c: New test.	2021-11-15 08:58:39 +01:00
GCC Administrator	af2852b9dc	Daily bump.	2021-11-13 00:16:39 +00:00
Jakub Jelinek	f49c7a4fb2	libgomp: Unbreak gcn offload build My recent libgomp change apparently broke libgomp build for gcn offloading. The problem is that gcn, unlike nvptx, doesn't override teams.c source file and the patch I've committed assumed all the non-LIBGOMP_USE_PTHREADS targets do not use it. My understanding is that gcn included omp_get_num_teams and omp_get_team_num definitions in both icv-device.o and teams.o, with the definitions only in the former working correctly. This patch brings gcn into sync with how nvptx does it, that teams.c is overridden, provides a dummy GOMP_teams_reg and omp_get_{num_teams,team_num} definitions and icv-device.c doesn't provide those. 2021-11-12 Jakub Jelinek <jakub@redhat.com> PR target/103201 * config/gcn/icv-device.c (omp_get_num_teams, omp_get_team_num): Move to ... * config/gcn/teams.c: ... here. New file.	2021-11-12 16:11:02 +01:00
Chung-Lin Tang	b7e2048063	openmp: Relax handling of implicit map vs. existing device mappings This patch implements relaxing the requirements when a map with the implicit attribute encounters an overlapping existing map. As the OpenMP 5.0 spec describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22): "If a single contiguous part of the original storage of a list item with an implicit data-mapping attribute has corresponding storage in the device data environment prior to a task encountering the construct that is associated with the map clause, only that part of the original storage will have corresponding storage in the device data environment as a result of the map clause." 2021-11-12 Chung-Lin Tang <cltang@codesourcery.com> include/ChangeLog: * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define special bit macro. (GOMP_MAP_IMPLICIT): New special map kind bits value. (GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of special map kind bits. (GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds. gcc/ChangeLog: * tree.h (OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P): New access macro for 'implicit' bit, using 'base.deprecated_flag' field of tree_node. * tree-pretty-print.c (dump_omp_clause): Add support for printing implicit attribute in tree dumping. * gimplify.c (gimplify_adjust_omp_clauses_1): Set OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P to 1 if map clause is implicitly created. (gimplify_adjust_omp_clauses): Adjust place of adding implicitly created clauses, from simple append, to starting of list, after non-map clauses. * omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind values passed to libgomp for implicit maps. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-implicit-map-1.c: New test. * c-c++-common/goacc/combined-reduction.c: Adjust scan test pattern. * c-c++-common/goacc/firstprivate-mappings-1.c: Likewise. * c-c++-common/goacc/mdc-1.c: Likewise. * g++.dg/goacc/firstprivate-mappings-1.C: Likewise. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add implicit map handling to allow a "superset" existing map as valid case. (get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value. (get_implicit): New function to extract implicit status. (gomp_map_fields_existing): Adjust arguments in calls to gomp_map_vars_existing, and add uses of get_implicit. (gomp_map_vars_internal): Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.	2021-11-12 20:29:48 +08:00
Jakub Jelinek	7d6da11fce	openmp: Honor OpenMP 5.1 num_teams lower bound The following patch implements what I've been talking about earlier, honor that for explicit num_teams clause we create at least the lower-bound (if not specified, upper-bound) teams in the league. For host fallback, it still means we only have one thread doing all the teams, sequentially one after another. For PTX and GCN, I think the new teams-2.c test and maybe teams-4.c too will or might fail. For these offloads, I think it is ok to remove symbols no longer used from libgomp.a. If num_teams_lower is bigger than the provided num_blocks or num_workgroups, we should arrange for gomp_num_teams_var to be num_teams_lower - 1, stop using the %ctaid.x or __builtin_gcn_dim_pos (0) for omp_get_team_num () and instead use for it some .shared var that GOMP_teams4 initializes to %ctaid.x or __builtin_gcn_dim_pos (0) when first and for !first increment that by num_blocks or num_workgroups each time and only return false when we are above num_teams_lower. Any help with actually implementing this for the 2 architectures highly appreciated. 2021-11-12 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-builtins.def (BUILT_IN_GOMP_TEAMS): Remove. (BUILT_IN_GOMP_TEAMS4): New. * builtin-types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. * omp-low.c (lower_omp_teams): Use GOMP_teams4 instead of GOMP_teams, pass to it also num_teams lower-bound expression or a dup of upper-bound if it is missing and a flag whether it is the first call or not. gcc/fortran/ * types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. libgomp/ * libgomp_g.h (GOMP_teams4): Declare. * libgomp.map (GOMP_5.1): Export GOMP_teams4. * target.c (GOMP_teams4): New function. * config/nvptx/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * config/gcn/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * testsuite/libgomp.c/teams-4.c (main): Expect exactly 2 teams instead of <= 2. * testsuite/libgomp.c-c++-common/teams-2.c: New test.	2021-11-12 12:41:22 +01:00
GCC Administrator	b39265d4fe	Daily bump.	2021-11-12 00:16:32 +00:00
Tobias Burnus	407eaad25f	Fortran/openmp: Add support for 2 argument num_teams clause Fortran part to commit r12-5146-g48d7327f2aaf65 gcc/fortran/ChangeLog: * gfortran.h (struct gfc_omp_clauses): Rename num_teams to num_teams_upper, add num_teams_upper. * dump-parse-tree.c (show_omp_clauses): Update to handle lower-bound num_teams clause. * frontend-passes.c (gfc_code_walker): Likewise * openmp.c (gfc_free_omp_clauses, gfc_match_omp_clauses, resolve_omp_clauses): Likewise. * trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses, gfc_trans_omp_target): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/teams-1.f90: New test.	2021-11-11 17:27:00 +01:00
Jakub Jelinek	fa4fcb111a	libgomp: Use TLS storage for omp_get_num_teams()/omp_get_team_num() values When thinking about GOMP_teams3, I've realized that using global variables for the values returned by omp_get_num_teams()/omp_get_team_num() calls is incorrect even with our right now dumb way of implementing host teams. The problems are two, one is if host teams is used from multiple pthread_create created threads - the spec says that host teams can't be nested inside of explicit parallel or other teams constructs, but with pthread_create the standard says obviously nothing about it. Another more important thing is host fallback, right now we don't do anything for omp_get_num_teams() or omp_get_team_num() which was fine before host teams was introduced and the 5.1 requirement that num_teams clause specifies minimum of teams, but with the global vars it means inside of target teams num_teams (2) we happily return omp_get_num_teams() == 4 if the target teams is inside of host teams with num_teams(4). With target fallback being invoked from parallel regions global vars simply can't work right on the host. So, this patch moves them to struct gomp_thread and propagates those for parallel to child threads. For host fallback, the implicit zeroing of thr results in us returning omp_get_num_teams () == 1 and omp_get_team_num () == 0 which is fine for target teams without num_teams clause, for target teams with num_teams clause something to work on and for target without teams nested in it I've asked on omp-lang what should be done. 2021-11-11 Jakub Jelinek <jakub@redhat.com> libgomp.h (struct gomp_thread): Add num_teams and team_num members. * team.c (struct gomp_thread_start_data): Likewise. (gomp_thread_start): Initialize thr->num_teams and thr->team_num. (gomp_team_start): Initialize start_data->num_teams and start_data->team_num. Update nthr->num_teams and nthr->team_num. * teams.c (gomp_num_teams, gomp_team_num): Remove. (GOMP_teams_reg): Set and restore thr->num_teams and thr->team_num instead of gomp_num_teams and gomp_team_num. (omp_get_num_teams): Use thr->num_teams + 1 instead of gomp_num_teams. (omp_get_team_num): Use thr->team_num instead of gomp_team_num. * testsuite/libgomp.c/teams-4.c: New test.	2021-11-11 13:57:31 +01:00
Jakub Jelinek	48d7327f2a	openmp: Add support for 2 argument num_teams clause In OpenMP 5.1, num_teams clause can accept either one expression as before, but it in that case changed meaning, rather than create <= expression teams it is now create == expression teams. Or it accepts two expressions separated by :, with the meaning that the first is low bound and second upper bound on how many teams should be created. The other ways to set number of teams are upper bounds with lower bound of 1. The following patch does parsing of this for C/C++. For host teams, we actually don't need to do anything further right now, we always create (pretend to create) exactly the requested number of teams, so we can just evaluate and throw away the lower bound for now. For teams nested in target, we don't guarantee that though and further work will be needed. In particular, omplower now turns the teams part of: struct S { S (); S (const S &); ~S (); int s; }; void bar (S &, S &); int baz (); _Pragma ("omp declare target to (baz)"); void foo (void) { S a, b; #pragma omp target private (a) map (b) { #pragma omp teams firstprivate (b) num_teams (baz ()) { bar (a, b); } } } into: retval.0 = baz (); retval.1 = retval.0; { unsigned int retval.3; struct S * D.2549; struct S b; retval.3 = (unsigned int) retval.1; D.2549 = .omp_data_i->b; S::S (&b, D.2549); #pragma omp teams num_teams(retval.1) firstprivate(b) shared(a) __builtin_GOMP_teams (retval.3, 0); { bar (&a, &b); } S::~S (&b); #pragma omp return(nowait) } IMHO we want a new API, say GOMP_teams3 which will take 3 arguments instead of 2 (the lower and upper bounds from num_teams and thread_limit) and will return a bool whether it should do the teams body or not. And, we should add right before outermost {} above while (__builtin_GOMP_teams3 ((unsigned) retval.1, (unsigned) retval.1, 0)) and remove the __builtin_GOMP_teams call. The current function performs exit equivalent (at least on NVPTX) which seems bad because that means the destructors of e.g. private variables on target aren't invoked, and at the current placement neither destructors of the already constructed privatized variables in teams. I'll do this next on the compiler side, but I'm afraid I'll need help with the nvptx and amdgcn implementations. E.g. for nvptx, we won't be able to use %ctaid.x . I think ideal would be to use a .shared integer variable for the omp_get_team_num value, but I don't have any experience with that, are .shared variables zero initialized by default, or do they have random value at start? PTX docs say they aren't initializable. 2021-11-11 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_NUM_TEAMS_EXPR): Rename to ... (OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR): ... this. (OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR): Define. * tree.c (omp_clause_num_ops): Increase num ops for OMP_CLAUSE_NUM_TEAMS to 2. * tree-pretty-print.c (dump_omp_clause): Print optional lower bound for OMP_CLAUSE_NUM_TEAMS. * gimplify.c (gimplify_scan_omp_clauses): Gimplify OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR if non-NULL. (optimize_target_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR. Handle OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR. * omp-low.c (lower_omp_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR. * omp-expand.c (expand_teams_call, get_target_arguments): Likewise. gcc/c/ * c-parser.c (c_parser_omp_clause_num_teams): Parse optional lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR. Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR. (c_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before combined target teams even lower-bound expression. gcc/cp/ * parser.c (cp_parser_omp_clause_num_teams): Parse optional lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR. Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR. (cp_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before combined target teams even lower-bound expression. * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR of OMP_CLAUSE_NUM_TEAMS clause. * pt.c (tsubst_omp_clauses): Likewise. (tsubst_expr): For OMP_CLAUSE_NUM_TEAMS evaluate before combined target teams even lower-bound expression. gcc/fortran/ * trans-openmp.c (gfc_trans_omp_clauses): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR. gcc/testsuite/ * c-c++-common/gomp/clauses-1.c (bar): Supply lower-bound expression to half of the num_teams clauses. * c-c++-common/gomp/num-teams-1.c: New test. * c-c++-common/gomp/num-teams-2.c: New test. * g++.dg/gomp/attrs-1.C (bar): Supply lower-bound expression to half of the num_teams clauses. * g++.dg/gomp/attrs-2.C (bar): Likewise. * g++.dg/gomp/num-teams-1.C: New test. * g++.dg/gomp/num-teams-2.C: New test. libgomp/ * testsuite/libgomp.c-c++-common/teams-1.c: New test.	2021-11-11 09:42:47 +01:00
GCC Administrator	c9b1334eec	Daily bump.	2021-11-10 00:16:28 +00:00
Thomas Schwinge	00c9ce13a6	Restore 'GOMP_OPENACC_DIM' environment variable parsing ... that got broken by recent commit `c057ed9c52` "openmp: Fix up strtoul and strtoull uses in libgomp", resulting in spurious FAILs for tests specifying 'dg-set-target-env-var "GOMP_OPENACC_DIM" "[...]"'. libgomp/ * env.c (parse_gomp_openacc_dim): Restore parsing.	2021-11-09 16:51:57 +01:00
GCC Administrator	0ef944629a	Daily bump.	2021-10-31 00:16:24 +00:00
Tobias Burnus	948d461954	OpenMP: Add strictly nested API call check [PR102972] The teams construct only permits omp_get_num_teams and omp_get_team_num as API call in strictly nested regions - check for it. Additionally, for Fortran, using DECL_NAME does not show the mangled name, hence, DECL_ASSEMBLER_NAME had to be used to. Finally, 'target device(ancestor:1)' wrongly rejected non-API calls as well. PR middle-end/102972 gcc/ChangeLog: * omp-low.c (omp_runtime_api_call): Use DECL_ASSEMBLER_NAME to get internal Fortran name; new permit_num_teams arg to permit omp_get_num_teams and omp_get_team_num. (scan_omp_1_stmt): Update call to it, add missing call for reverse offload, and check for strictly nested API calls in teams. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-device-ancestor-3.c: Add non-API routine test. * gfortran.dg/gomp/order-6.f90: Add missing bind(C). * c-c++-common/gomp/teams-3.c: New test. * gfortran.dg/gomp/teams-3.f90: New test. * gfortran.dg/gomp/teams-4.f90: New test. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/icv-3.c: Nest API calls inside parallel construct. * testsuite/libgomp.c-c++-common/icv-4.c: Likewise. * testsuite/libgomp.c/target-3.c: Likewise. * testsuite/libgomp.c/target-5.c: Likewise. * testsuite/libgomp.c/target-6.c: Likewise. * testsuite/libgomp.c/target-teams-1.c: Likewise. * testsuite/libgomp.c/teams-1.c: Likewise. * testsuite/libgomp.c/thread-limit-2.c: Likewise. * testsuite/libgomp.c/thread-limit-3.c: Likewise. * testsuite/libgomp.c/thread-limit-4.c: Likewise. * testsuite/libgomp.c/thread-limit-5.c: Likewise. * testsuite/libgomp.fortran/icv-3.f90: Likewise. * testsuite/libgomp.fortran/icv-4.f90: Likewise. * testsuite/libgomp.fortran/teams1.f90: Likewise.	2021-10-30 23:45:32 +02:00
GCC Administrator	4c61300f2b	Daily bump.	2021-10-30 00:16:25 +00:00
Aldy Hernandez	4b3a325f07	Remove VRP threader passes in exchange for better threading pre-VRP. This patch upgrades the pre-VRP threading passes to fully resolving backward threaders, and removes the post-VRP threading passes altogether. With it, we reduce the number of threaders in our pipeline from 9 to 7. This will leave DOM as the only forward threader client. When the ranger can handle floats, we should be able to upgrade the pre-DOM threaders to fully resolving threaders and kill the embedded DOM threader. The numbers are as follows: prev: # threads in backward + vrp-threaders = 92624 now: # threads in backward threaders = 94275 Gain: +1.78% prev: # total threads: 189495 now: # total threads: 193714 Gain: +2.22% The numbers are not as great as my initial proposal, but I've recently pushed all the work that got us to this point ;-). And... the compilation improves by 1.32%! There's a regression on uninit-pred-7_a.c that I've yet to look at. I want to make sure it's not a missing thread. If it is, I'll create a PR and own it. Also, the tree-ssa/phi_on_compare-.c tests have all regressed. This seems to be some special case the forward threader handles that the backward threader does not (edge_forwards_cmp_to_conditional_jump). I haven't dug deep to see if this is solveable within our infrastructure, but a cursory look shows that even though the VRP threader threads this, the .optimized dump ends with more conditional jumps than without the optimization. I'd like to punt on this for now, because DOM actually catches this through its lone use of the forward threader (I've adjusted the tests). However, we will need to address this sooner or later, if indeed it's still improving the final assembly. gcc/ChangeLog: passes.def: Replace the pass_thread_jumps before VRP* with pass_thread_jumps_full. Remove all pass_vrp_threader instances. * tree-ssa-threadbackward.c (pass_data_thread_jumps_full): Remove hyphen from "thread-full" name. libgomp/ChangeLog: * testsuite/libgomp.graphite/force-parallel-4.c: Adjust for threading changes. * testsuite/libgomp.graphite/force-parallel-8.c: Same. gcc/testsuite/ChangeLog: * gcc.dg/loop-unswitch-2.c: Adjust for threading changes. * gcc.dg/old-style-asm-1.c: Same. * gcc.dg/tree-ssa/phi_on_compare-1.c: Same. * gcc.dg/tree-ssa/phi_on_compare-2.c: Same. * gcc.dg/tree-ssa/phi_on_compare-3.c: Same. * gcc.dg/tree-ssa/phi_on_compare-4.c: Same. * gcc.dg/tree-ssa/pr20701.c: Same. * gcc.dg/tree-ssa/pr21001.c: Same. * gcc.dg/tree-ssa/pr21294.c: Same. * gcc.dg/tree-ssa/pr21417.c: Same. * gcc.dg/tree-ssa/pr21559.c: Same. * gcc.dg/tree-ssa/pr21563.c: Same. * gcc.dg/tree-ssa/pr49039.c: Same. * gcc.dg/tree-ssa/pr59597.c: Same. * gcc.dg/tree-ssa/pr61839_1.c: Same. * gcc.dg/tree-ssa/pr61839_3.c: Same. * gcc.dg/tree-ssa/pr66752-3.c: Same. * gcc.dg/tree-ssa/pr68198.c: Same. * gcc.dg/tree-ssa/pr77445-2.c: Same. * gcc.dg/tree-ssa/pr77445.c: Same. * gcc.dg/tree-ssa/ranger-threader-1.c: Same. * gcc.dg/tree-ssa/ranger-threader-2.c: Same. * gcc.dg/tree-ssa/ranger-threader-4.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-1.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-16.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. * gcc.dg/tree-ssa/ssa-thread-14.c: Same. * gcc.dg/tree-ssa/ssa-thread-backedge.c: Same. * gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Same. * gcc.dg/tree-ssa/vrp02.c: Same. * gcc.dg/tree-ssa/vrp03.c: Same. * gcc.dg/tree-ssa/vrp05.c: Same. * gcc.dg/tree-ssa/vrp06.c: Same. * gcc.dg/tree-ssa/vrp07.c: Same. * gcc.dg/tree-ssa/vrp08.c: Same. * gcc.dg/tree-ssa/vrp09.c: Same. * gcc.dg/tree-ssa/vrp33.c: Same. * gcc.dg/uninit-pred-9_b.c: Same. * gcc.dg/uninit-pred-7_a.c: xfail.	2021-10-29 17:57:27 +02:00
GCC Administrator	04a2cf3fd6	Daily bump.	2021-10-28 00:16:39 +00:00
Jakub Jelinek	eef8114906	openmp: Document that non-rect loops are not supported in Fortran yet I've found we claim to support non-rectangular loops, but don't actually support those in Fortran, as can be seen on: integer i, j !$omp parallel do collapse(2) do i = 0, 10 do j = 0, i end do end do end To support this, the Fortran FE needs to allow the valid forms of non-rectangular loops and disallow others, so mainly it needs its updated version of c-omp.c c_omp_check_loop_iv etc., plus for non-rectangular lb or ub expressions emit a TREE_VEC instead of normal expression as the C/C++ FE do, plus testsuite coverage. 2021-10-27 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (OpenMP 5.0): Mention that Non-rectangular loop nests aren't implemented for Fortran yet.	2021-10-27 09:24:46 +02:00
Jakub Jelinek	2084b5f42a	openmp: Allow non-rectangular loops with pointer iterators This patch handles pointer iterators for non-rectangular loops. They are more limited than integral iterators of non-rectangular loops, in particular only var-outer, var-outer + a2, a2 + var-outer or var-outer - a2 can appear in lb or ub where a2 is some integral loop invariant expression, so no e.g. multiplication etc. 2021-10-27 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular iterators with pointer types. (expand_omp_for_init_vars, extract_omp_for_update_vars): Likewise. gcc/c-family/ * c-omp.c (c_omp_check_loop_iv_r): Don't clear 3rd bit for POINTER_PLUS_EXPR. (c_omp_check_nonrect_loop_iv): Handle POINTER_PLUS_EXPR. (c_omp_check_loop_iv): Set kind even if the iterator is non-integral. gcc/testsuite/ * c-c++-common/gomp/loop-8.c: New test. * c-c++-common/gomp/loop-9.c: New test. libgomp/ * testsuite/libgomp.c/loop-26.c: New test. * testsuite/libgomp.c/loop-27.c: New test.	2021-10-27 09:22:07 +02:00
GCC Administrator	b621508d6f	Daily bump.	2021-10-26 00:16:26 +00:00
Tobias Burnus	72dc270be7	libgomp.oacc-c-c++-common/loop-gwv-2.c: Use __builtin_alloca Some systems do not have <alloca.h> but provide alloca differently, e.g. via stdlib.h. Do it like other testcases do and use __builtin_alloca. libgomp/ChangeLog: PR testsuite/102910 * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Use __builtin_alloca instead of #include <alloca.h> + alloca.	2021-10-25 20:48:38 +02:00
GCC Administrator	ae5c540662	Daily bump.	2021-10-22 00:16:31 +00:00
Chung-Lin Tang	2e4659199e	openmp: Fortran strictly-structured blocks support This implements strictly-structured blocks support for Fortran, as specified in OpenMP 5.2. This now allows using a Fortran BLOCK construct as the body of most OpenMP constructs, with a "!$omp end ..." ending directive optional for that form. gcc/fortran/ChangeLog: * decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case together with COMP_BLOCK. * parse.c (parse_omp_structured_block): Change return type to 'gfc_statement', add handling for strictly-structured block case, adjust recursive calls to parse_omp_structured_block. (parse_executable): Adjust calls to parse_omp_structured_block. * parse.h (enum gfc_compile_state): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK. * trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case handling. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/cancel-1.f90: Adjust testcase. * gfortran.dg/gomp/nesting-3.f90: Adjust testcase. * gfortran.dg/gomp/strictly-structured-block-1.f90: New test. * gfortran.dg/gomp/strictly-structured-block-2.f90: New test. * gfortran.dg/gomp/strictly-structured-block-3.f90: New test. libgomp/ChangeLog: * libgomp.texi (Support of strictly structured blocks in Fortran): Adjust to 'Y'. * testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase.	2021-10-21 14:57:25 +08:00
GCC Administrator	674dda6be0	Daily bump.	2021-10-21 00:16:29 +00:00
Chung-Lin Tang	d98626bf45	openmp: in_reduction support for Fortran This patch implements support for the in_reduction clause for Fortran. It also includes more completion of the taskgroup construct inside the Fortran front-end, thus allowing task_reduction to work for task and target constructs. gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case. (gfc_match_omp_clauses): Add 'openmp_target' default false parameter, adjust call to gfc_match_omp_clause_reduction. (match_omp): Adjust call to gfc_match_omp_clauses * trans-openmp.c (gfc_trans_omp_taskgroup): Add call to gfc_match_omp_clause, create and return block. gcc/ChangeLog: * omp-low.c (omp_copy_decl_2): For !ctx, use record_vars to add new copy as local variable. (scan_sharing_clauses): Place copy of OMP_CLAUSE_IN_REDUCTION decl in ctx->outer instead of ctx. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/reduction4.f90: Adjust omp target in_reduction' scan pattern. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-in-reduction-1.f90: New test. * testsuite/libgomp.fortran/target-in-reduction-2.f90: New test.	2021-10-20 23:25:02 +08:00
Jakub Jelinek	c7abdf46fb	openmp: Fix up struct gomp_work_share handling [PR102838] If GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is not defined, the intent was to treat the split of the structure between first cacheline (64 bytes) as mostly write-once, use afterwards and second cacheline as rw just as an optimization. But as has been reported, with vectorization enabled at -O2 it can now result in aligned vector 16-byte or larger stores. When not having posix_memalign/aligned_alloc/memalign or other similar API, alloc.c emulates it but it needs to allocate extra memory for the dynamic realignment. So, for the GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC not defined case, this patch stops using aligned (64) attribute in the middle of the structure and instead inserts padding that puts the second half of the structure at offset 64 bytes. And when GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, usually it was allocated as aligned, but for the orphaned case it could still be allocated just with gomp_malloc without guaranteed proper alignment. 2021-10-20 Jakub Jelinek <jakub@redhat.com> PR libgomp/102838 * libgomp.h (struct gomp_work_share_1st_cacheline): New type. (struct gomp_work_share): Only use aligned(64) attribute if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, otherwise just add padding before lock to ensure lock is at offset 64 bytes into the structure. (gomp_workshare_struct_check1, gomp_workshare_struct_check2): New poor man's static assertions. * work.c (gomp_work_share_start): Use gomp_aligned_alloc instead of gomp_malloc if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC.	2021-10-20 09:34:51 +02:00
Aldy Hernandez	d8edfadfc7	Disallow loop rotation and loop header crossing in jump threaders. There is a lot of fall-out from this patch, as there were many threading tests that assumed the restrictions introduced by this patch were valid. Some tests have merely shifted the threading to after loop optimizations, but others ended up with no threading opportunities at all. Surprisingly some tests ended up with more total threads. It was a crapshoot all around. On a postive note, there are 6 tests that no longer XFAIL, and one guality test which now passes. I felt a bit queasy about such a fundamental change wrt threading, so I ran it through my callgrind test harness (.ii files from a bootstrap). There was no change in overall compilation, DOM, or the VRP threaders. However, there was a slight increase of 1.63% in the backward threader. I'm pretty sure we could reduce this if we incorporated the restrictions into their profitability code. This way we could stop the search when we ran into one of these restrictions. Not sure it's worth it at this point. Tested on x86-64 Linux. Co-authored-by: Richard Biener <rguenther@suse.de> gcc/ChangeLog: * tree-ssa-threadupdate.c (cancel_thread): Dump threading reason on the same line as the threading cancellation. (jt_path_registry::cancel_invalid_paths): Avoid rotating loops. Avoid threading through loop headers where the path remains in the loop. libgomp/ChangeLog: * testsuite/libgomp.graphite/force-parallel-5.c: Remove xfail. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-87.c: Remove xfail. * gcc.dg/analyzer/pr94851-2.c: Remove xfail. * gcc.dg/graphite/pr69728.c: Remove xfail. * gcc.dg/graphite/scop-dsyr2k.c: Remove xfail. * gcc.dg/graphite/scop-dsyrk.c: Remove xfail. * gcc.dg/shrink-wrap-loop.c: Remove xfail. * gcc.dg/loop-8.c: Adjust for new threading restrictions. * gcc.dg/tree-ssa/ifc-20040816-1.c: Same. * gcc.dg/tree-ssa/pr21559.c: Same. * gcc.dg/tree-ssa/pr59597.c: Same. * gcc.dg/tree-ssa/pr71437.c: Same. * gcc.dg/tree-ssa/pr77445-2.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. * gcc.dg/vect/bb-slp-16.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Remove. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Remove. * gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Remove. * gcc.dg/tree-ssa/ssa-thread-invalid.c: New test.	2021-10-20 07:07:35 +02:00
GCC Administrator	ce4d1f632f	Daily bump.	2021-10-19 00:16:23 +00:00
Jakub Jelinek	3adcf7e104	openmp: Fix handling of numa_domains(1) If numa-domains is used with num-places count, sometimes the function could create more places than requested and crash. This depended on the content of /sys/devices/system/node/online file, e.g. if the file contains 0-1,16-17 and all NUMA nodes contain at least one CPU in the cpuset of the program, then numa_domains(2) or numa_domains(4) (or 5+) work fine while numa_domains(1) or numa_domains(3) misbehave. I.e. the function was able to stop after reaching limit on the , separators (or trivially at the end), but not within in the ranges. 2021-10-18 Jakub Jelinek <jakub@redhat.com> * config/linux/affinity.c (gomp_affinity_init_numa_domains): Add && gomp_places_list_len < count after nfirst <= nlast loop condition.	2021-10-18 15:00:46 +02:00
Tobias Burnus	64f9623765	Fortran: Fix Bind(C) Array-Descriptor Conversion gfortran uses internally a different array descriptor ("gfc") as Fortran 2018 alias TS291113 defines for C interoperability via ISO_Fortran_binding.h ("CFI"). Hence, when calling a C function from Fortran, it has to be converted in the callee - and if a BIND(C) procedure is written in Fortran, the CFI argument has to be converted to gfc in order work with the rest of the FE code and the library calls. Before this patch, part was handled in the FE generated code and other parts in libgfortran. With this patch, all code is generated and CFI is defined as proper type - visible in the debugger and to the middle end - avoiding both alias issues and missed optimization issues. This patch also fixes issues like: intent(out) deallocation in the bind(C) callee, using the CFI descriptor also for allocatable and pointer scalars and for len=* character strings. For 'select rank', it also optimizes the code + avoid accessing uninitialized memory if the dummy argument is allocatable/a pointer. It additionally rejects passing a descriptorless type() to an assumed-rank dummy argument. [F2018:C711] PR fortran/102086 PR fortran/92189 PR fortran/92621 PR fortran/101308 PR fortran/101309 PR fortran/101635 PR fortran/92482 gcc/fortran/ChangeLog: decl.c (gfc_verify_c_interop_param): Remove 'sorry' for scalar allocatable/pointer and len=. expr.c (is_CFI_desc): Return true for for those. * gfortran.h (CFI_type_kind_shift, CFI_type_mask, CFI_type_from_type_kind, CFI_VERSION, CFI_MAX_RANK, CFI_attribute_pointer, CFI_attribute_allocatable, CFI_attribute_other, CFI_type_Integer, CFI_type_Logical, CFI_type_Real, CFI_type_Complex, CFI_type_Character, CFI_type_ucs4_char, CFI_type_struct, CFI_type_cptr, CFI_type_cfunptr, CFI_type_other): New #define. * trans-array.c (CFI_FIELD_BASE_ADDR, CFI_FIELD_ELEM_LEN, CFI_FIELD_VERSION, CFI_FIELD_RANK, CFI_FIELD_ATTRIBUTE, CFI_FIELD_TYPE, CFI_FIELD_DIM, CFI_DIM_FIELD_LOWER_BOUND, CFI_DIM_FIELD_EXTENT, CFI_DIM_FIELD_SM, gfc_get_cfi_descriptor_field, gfc_get_cfi_desc_base_addr, gfc_get_cfi_desc_elem_len, gfc_get_cfi_desc_version, gfc_get_cfi_desc_rank, gfc_get_cfi_desc_type, gfc_get_cfi_desc_attribute, gfc_get_cfi_dim_item, gfc_get_cfi_dim_lbound, gfc_get_cfi_dim_extent, gfc_get_cfi_dim_sm): New define/functions to access the CFI array descriptor. (gfc_conv_descriptor_type): New function for the GFC descriptor. (gfc_get_array_span): Handle expr of CFI descriptors and assumed-type descriptors. (gfc_trans_array_bounds): Remove 'static'. (gfc_conv_expr_descriptor): For assumed type, use the dtype of the actual argument. (structure_alloc_comps): Remove ' ' inside tabs. * trans-array.h (gfc_trans_array_bounds, gfc_conv_descriptor_type, gfc_get_cfi_desc_base_addr, gfc_get_cfi_desc_elem_len, gfc_get_cfi_desc_version, gfc_get_cfi_desc_rank, gfc_get_cfi_desc_type, gfc_get_cfi_desc_attribute, gfc_get_cfi_dim_lbound, gfc_get_cfi_dim_extent, gfc_get_cfi_dim_sm): New prototypes. * trans-decl.c (gfor_fndecl_cfi_to_gfc, gfor_fndecl_gfc_to_cfi): Remove global vars. (gfc_build_builtin_function_decls): Remove their initialization. (gfc_get_symbol_decl, create_function_arglist, gfc_trans_deferred_vars): Update for CFI. (convert_CFI_desc): Remove and replace by ... (gfc_conv_cfi_to_gfc): ... this function (gfc_generate_function_code): Call it; create local GFC var for CFI. * trans-expr.c (gfc_maybe_dereference_var): Handle CFI. (gfc_conv_subref_array_arg): Handle the if-noncontigous-only copy in when the result should be a descriptor. (gfc_conv_gfc_desc_to_cfi_desc): Completely rewritten. (gfc_conv_procedure_call): CFI fixes. * trans-openmp.c (gfc_omp_is_optional_argument, gfc_omp_check_optional_argument): Handle optional CFI. * trans-stmt.c (gfc_trans_select_rank_cases): Cleanup, avoid invalid code for allocatable/pointer dummies, which cannot be assumed size. * trans-types.c (gfc_cfi_descriptor_base): New global var. (gfc_get_dtype_rank_type): Skip rank init for rank < 0. (gfc_sym_type): Handle CFI dummies. (gfc_get_function_type): Update call. (gfc_get_cfi_dim_type, gfc_get_cfi_type): New. * trans-types.h (gfc_sym_type): Update prototype. (gfc_get_cfi_type): New prototype. * trans.c (gfc_trans_runtime_check): Make conditions more consistent to avoid '<logical> AND_THEN <long int>' in conditions. * trans.h (gfor_fndecl_cfi_to_gfc, gfor_fndecl_gfc_to_cfi): Remove global-var declaration. libgfortran/ChangeLog: * ISO_Fortran_binding.h (CFI_type_cfunptr): Make unique type again. * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc, gfc_desc_to_cfi_desc): Add comment that those are no longer called by new code. libgomp/ChangeLog: * testsuite/libgomp.fortran/optional-bind-c.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/ISO_Fortran_binding_4.f90: Extend testcase. * gfortran.dg/PR100914.f90: Remove xfail. * gfortran.dg/PR100915.c: Expect CFI_type_cfunptr. * gfortran.dg/PR100915.f90: Handle CFI_type_cfunptr != CFI_type_cptr. * gfortran.dg/PR93963.f90: Extend select-rank tests. * gfortran.dg/bind-c-intent-out.f90: Change to dg-do run, update scan-dump. * gfortran.dg/bind_c_array_params_2.f90: Update/extend scan-dump. * gfortran.dg/bind_c_char_10.f90: Update scan-dump. * gfortran.dg/bind_c_char_8.f90: Remove dg-error "sorry". * gfortran.dg/c-interop/allocatable-dummy.f90: Remove xfail. * gfortran.dg/c-interop/c1255-1.f90: Likewise. * gfortran.dg/c-interop/c407c-1.f90: Update dg-error. * gfortran.dg/c-interop/cf-descriptor-5.f90: Remove xfail. * gfortran.dg/c-interop/cf-out-descriptor-3.f90: Likewise. * gfortran.dg/c-interop/cf-out-descriptor-4.f90: Likewise. * gfortran.dg/c-interop/cf-out-descriptor-5.f90: Likewise. * gfortran.dg/c-interop/contiguous-2.f90: Likewise. * gfortran.dg/c-interop/contiguous-3.f90: Likewise. * gfortran.dg/c-interop/deferred-character-1.f90: Likewise. * gfortran.dg/c-interop/deferred-character-2.f90: Likewise. * gfortran.dg/c-interop/fc-descriptor-3.f90: Likewise. * gfortran.dg/c-interop/fc-descriptor-5.f90: Likewise. * gfortran.dg/c-interop/fc-descriptor-6.f90: Likewise. * gfortran.dg/c-interop/fc-out-descriptor-3.f90: Likewise. * gfortran.dg/c-interop/fc-out-descriptor-4.f90: Likewise. * gfortran.dg/c-interop/fc-out-descriptor-5.f90: Likewise. * gfortran.dg/c-interop/fc-out-descriptor-6.f90: Likewise. * gfortran.dg/c-interop/ff-descriptor-5.f90: Likewise. * gfortran.dg/c-interop/ff-descriptor-6.f90: Likewise. * gfortran.dg/c-interop/fc-descriptor-7.f90: Remove xfail + extend. * gfortran.dg/c-interop/fc-descriptor-7-c.c: Update for changes. * gfortran.dg/c-interop/shape.f90: Add implicit none. * gfortran.dg/c-interop/typecodes-array-char-c.c: Add kind=4 char. * gfortran.dg/c-interop/typecodes-array-char.f90: Likewise. * gfortran.dg/c-interop/typecodes-array-float128.f90: Remove xfail. * gfortran.dg/c-interop/typecodes-scalar-basic.f90: Likewise. * gfortran.dg/c-interop/typecodes-scalar-float128.f90: Likewise. * gfortran.dg/c-interop/typecodes-scalar-int128.f90: Likewise. * gfortran.dg/c-interop/typecodes-scalar-longdouble.f90: Likewise. * gfortran.dg/iso_c_binding_char_1.f90: Remove dg-error "sorry". * gfortran.dg/pr93792.f90: Turn XFAIL into PASS. * gfortran.dg/ISO_Fortran_binding_19.f90: New test. * gfortran.dg/assumed_type_12.f90: New test. * gfortran.dg/assumed_type_13.c: New test. * gfortran.dg/assumed_type_13.f90: New test. * gfortran.dg/bind-c-char-descr.f90: New test. * gfortran.dg/bind-c-contiguous-1.c: New test. * gfortran.dg/bind-c-contiguous-1.f90: New test. * gfortran.dg/bind-c-contiguous-2.f90: New test. * gfortran.dg/bind-c-contiguous-3.c: New test. * gfortran.dg/bind-c-contiguous-3.f90: New test. * gfortran.dg/bind-c-contiguous-4.c: New test. * gfortran.dg/bind-c-contiguous-4.f90: New test. * gfortran.dg/bind-c-contiguous-5.c: New test. * gfortran.dg/bind-c-contiguous-5.f90: New test.	2021-10-18 10:29:30 +02:00
GCC Administrator	93d183a5ff	Daily bump.	2021-10-16 00:16:27 +00:00
Jakub Jelinek	a10794eafb	openmp: Improve testsuite/libgomp.c/affinity-1.c testcase I've noticed that while I have added hopefully sufficient test coverage for the case where one uses simple number or !number as p-interval, I haven't added any coverage for number:len:stride or number:len. This patch adds that. 2021-10-15 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/affinity-1.c (struct places): Change name field type from char [50] to const char *. (places_array): Add a testcase for simplified syntax place followed by length or length and stride.	2021-10-15 17:19:54 +02:00
Jakub Jelinek	4a0fed0c0c	openmp: Handle OpenMP 5.1 simplified OMP_PLACES syntax In addition to adding ll_caches and numa_domain abstract names to OMP_PLACES syntax, OpenMP 5.1 also added one syntax simplification: https://github.com/OpenMP/spec/issues/2080 https://github.com/OpenMP/spec/pull/2081 in particular that in the grammar place non-terminal is now not only { res-list } but also res (i.e. a non-negative integer), which stands as a shortcut for { res } So, one can specify OMP_PLACES=0,4,8,12 with the meaning OMP_PLACES={0},{4},{8},{12} or OMP_PLACES=0:4 instead of OMP_PLACES={0}:4 or OMP_PLACES={0},{1},{2},{3} etc. This patch implements that. 2021-10-15 Jakub Jelinek <jakub@redhat.com> * env.c (parse_one_place): Handle non-negative-number the same as { non-negative-number }. Reject even !number:1 and !number:1:stride or !place:1 or !place:1:stride instead of just length other than 1. * libgomp.texi (OpenMP 5.1): Document OMP_PLACES syntax extensions and OMP_NUM_TEAMS/OMP_TEAMS_THREAD_LIMIT and omp_{set_num,get_max}_teams/omp_{s,g}et_teams_thread_limit features as implemented. * testsuite/libgomp.c/affinity-1.c: Add a test for the 5.1 place simplified syntax.	2021-10-15 16:35:57 +02:00
Jakub Jelinek	c057ed9c52	openmp: Fix up strtoul and strtoull uses in libgomp Yesterday when working on numa_domains, I've noticed because of a bug in my patch a hang on a large NUMA machine. I've fixed the bug, but also discovered that the hang was a result of making wrong assumptions about strtoul/strtoull. All the uses were for portability setting errno = 0 before the calls and treating non-zero errno after the call as invalid input, but for the case where there are no valid digits at all strtoul may set errno to EINVAL, but doesn't have to and with glibc doesn't do that. So, this patch goes through all the strtoul calls and next to errno != 0 checks adds also endptr == startptr check. Haven't done it in places where we immediately reject strtoul returning 0 the same as we reject errno != 0, because strtoul must return 0 in the case where it sets endptr to the start pointer. In some spots the code was using errno = 0; x = strtoul (p, &p, 10); if (errno) { /invalid/ } and those spots had to be changed to errno = 0; x = strtoul (p, &end, 10); if (errno \|\| end == p) { /invalid/ } p = end; 2021-10-15 Jakub Jelinek <jakub@redhat.com> * env.c (parse_schedule): For strtoul or strtoull calls which don't clearly reject return value 0 as invalid handle the case where end pointer is the same as first argument as invalid. (parse_unsigned_long_1): Likewise. (parse_one_place): Likewise. (parse_places_var): Likewise. (parse_stacksize): Likewise. (parse_spincount): Likewise. (parse_affinity): Likewise. (parse_gomp_openacc_dim): Likewise. Avoid strict aliasing violation. Make code valid C89. * config/linux/affinity.c (gomp_affinity_find_last_cache_level): For strtoul calls which don't clearly reject return value 0 as invalid handle the case where end pointer is the same as first argument as invalid. (gomp_affinity_init_level_1): Likewise. (gomp_affinity_init_numa_domains): Likewise. * config/rtems/proc.c (parse_thread_pools): Likewise.	2021-10-15 16:28:34 +02:00
Jakub Jelinek	4764049dd6	openmp: Fix up handling of OMP_PLACES=threads(1) When writing the places-.c tests, I've noticed that we mishandle threads abstract name with specified num-places if num-places isn't a multiple of number of hw threads in a core. It then happily ignores the maximum count and overwrites for the remaining hw threads in a core further places that haven't been allocated. 2021-10-15 Jakub Jelinek <jakub@redhat.com> config/linux/affinity.c (gomp_affinity_init_level_1): For level 1 after creating count places clean up and return immediately. * testsuite/libgomp.c/places-6.c: New test. * testsuite/libgomp.c/places-7.c: New test. * testsuite/libgomp.c/places-8.c: New test. * testsuite/libgomp.c/places-9.c: New test. * testsuite/libgomp.c/places-10.c: New test.	2021-10-15 16:25:25 +02:00
Jakub Jelinek	e7ce32c783	openmp: Add support for OMP_PLACES=numa_domains This adds support for numa_domains abstract name in OMP_PLACES, also new in OpenMP 5.1. Way to test this is OMP_PLACES=numa_domains OMP_DISPLAY_ENV=true LD_PRELOAD=.libs/libgomp.so.1 /bin/true and see what it prints on OMP_PLACES line. For non-NUMA machines it should print a single place that covers all CPUs, for NUMA machine one place for each NUMA node with corresponding CPUs. 2021-10-15 Jakub Jelinek <jakub@redhat.com> * env.c (parse_places_var): Handle numa_domains as level 5. * config/linux/affinity.c (gomp_affinity_init_numa_domains): New function. (gomp_affinity_init_level): Use it instead of gomp_affinity_init_level_1 for level == 5. * testsuite/libgomp.c/places-5.c: New test.	2021-10-15 12:16:50 +02:00
Jakub Jelinek	5809be05a2	openmp: Add support for OMP_PLACES=ll_caches This patch implements support for ll_caches abstract name in OMP_PLACES, which stands for places where logical cpus in each place share the last level cache. This seems to work fine for me on x86 and kernel sources show that it is in common code, but on some machines on CompileFarm the files I'm using, i.e. /sys/devices/system/cpu/cpuN/cache/indexN/level /sys/devices/system/cpu/cpuN/cache/indexN/shared_cpu_list don't exist, is that because they have too old kernel and newer kernels are fine or should I implement some fallback methods (which)? E.g. on gcc112.fsffrance.org I see just shared_cpu_map and not shared_cpu_list (with shared_cpu_map being harder to parse) and on another box I didn't even see the cache subdirectories. Way to test this is OMP_PLACES=ll_caches OMP_DISPLAY_ENV=true LD_PRELOAD=.libs/libgomp.so.1 /bin/true and see what it prints on OMP_PLACES line. 2021-10-15 Jakub Jelinek <jakub@redhat.com> * env.c (parse_places_var): Handle ll_caches as level 4. * config/linux/affinity.c (gomp_affinity_find_last_cache_level): New function. (gomp_affinity_init_level_1): Handle level 4 as logical cpus sharing last level cache. (gomp_affinity_init_level): Likewise. * testsuite/libgomp.c/places-1.c: New test. * testsuite/libgomp.c/places-2.c: New test. * testsuite/libgomp.c/places-3.c: New test. * testsuite/libgomp.c/places-4.c: New test.	2021-10-15 12:06:51 +02:00
GCC Administrator	5d5885c99c	Daily bump.	2021-10-15 00:17:02 +00:00
Kwok Cheung Yeung	2c4666fb06	openmp: Mark declare variant directive in documentation as supported in Fortran 2021-10-14 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * libgomp.texi (OpenMP 5.0): Update entry for declare variant directive.	2021-10-14 09:35:33 -07:00
Kwok Cheung Yeung	724ee5a009	openmp, fortran: Add support for OpenMP declare variant directive in Fortran 2021-10-14 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/c-family/ * c-omp.c (c_omp_check_context_selector): Rename to omp_check_context_selector and move to omp-general.c. (c_omp_mark_declare_variant): Rename to omp_mark_declare_variant and move to omp-general.c. gcc/c/ * c-parser.c (c_finish_omp_declare_variant): Change call from c_omp_check_context_selector to omp_check_context_selector. Change call from c_omp_mark_declare_variant to omp_mark_declare_variant. gcc/cp/ * decl.c (omp_declare_variant_finalize_one): Change call from c_omp_mark_declare_variant to omp_mark_declare_variant. * parser.c (cp_finish_omp_declare_variant): Change call from c_omp_check_context_selector to omp_check_context_selector. gcc/fortran/ * gfortran.h (enum gfc_statement): Add ST_OMP_DECLARE_VARIANT. (enum gfc_omp_trait_property_kind): New. (struct gfc_omp_trait_property): New. (gfc_get_omp_trait_property): New macro. (struct gfc_omp_selector): New. (gfc_get_omp_selector): New macro. (struct gfc_omp_set_selector): New. (gfc_get_omp_set_selector): New macro. (struct gfc_omp_declare_variant): New. (gfc_get_omp_declare_variant): New macro. (struct gfc_namespace): Add omp_declare_variant field. (gfc_free_omp_declare_variant_list): New prototype. * match.h (gfc_match_omp_declare_variant): New prototype. * openmp.c (gfc_free_omp_trait_property_list): New. (gfc_free_omp_selector_list): New. (gfc_free_omp_set_selector_list): New. (gfc_free_omp_declare_variant_list): New. (gfc_match_omp_clauses): Add extra optional argument. Handle end of clauses for context selectors. (omp_construct_selectors, omp_device_selectors, omp_implementation_selectors, omp_user_selectors): New. (gfc_match_omp_context_selector): New. (gfc_match_omp_context_selector_specification): New. (gfc_match_omp_declare_variant): New. * parse.c: Include tree-core.h and omp-general.h. (decode_omp_directive): Handle 'declare variant'. (case_omp_decl): Include ST_OMP_DECLARE_VARIANT. (gfc_ascii_statement): Handle ST_OMP_DECLARE_VARIANT. (gfc_parse_file): Initialize omp_requires_mask. * symbol.c (gfc_free_namespace): Call gfc_free_omp_declare_variant_list. * trans-decl.c (gfc_get_extern_function_decl): Call gfc_trans_omp_declare_variant. (gfc_create_function_decl): Call gfc_trans_omp_declare_variant. * trans-openmp.c (gfc_trans_omp_declare_variant): New. * trans-stmt.h (gfc_trans_omp_declare_variant): New prototype. gcc/ * omp-general.c (omp_check_context_selector): Move from c-omp.c. (omp_mark_declare_variant): Move from c-omp.c. (omp_context_name_list_prop): Update for Fortran strings. * omp-general.h (omp_check_context_selector): New prototype. (omp_mark_declare_variant): New prototype. gcc/testsuite/ * gfortran.dg/gomp/declare-variant-1.f90: New test. * gfortran.dg/gomp/declare-variant-10.f90: New test. * gfortran.dg/gomp/declare-variant-11.f90: New test. * gfortran.dg/gomp/declare-variant-12.f90: New test. * gfortran.dg/gomp/declare-variant-13.f90: New test. * gfortran.dg/gomp/declare-variant-14.f90: New test. * gfortran.dg/gomp/declare-variant-15.f90: New test. * gfortran.dg/gomp/declare-variant-16.f90: New test. * gfortran.dg/gomp/declare-variant-17.f90: New test. * gfortran.dg/gomp/declare-variant-18.f90: New test. * gfortran.dg/gomp/declare-variant-19.f90: New test. * gfortran.dg/gomp/declare-variant-2.f90: New test. * gfortran.dg/gomp/declare-variant-2a.f90: New test. * gfortran.dg/gomp/declare-variant-3.f90: New test. * gfortran.dg/gomp/declare-variant-4.f90: New test. * gfortran.dg/gomp/declare-variant-5.f90: New test. * gfortran.dg/gomp/declare-variant-6.f90: New test. * gfortran.dg/gomp/declare-variant-7.f90: New test. * gfortran.dg/gomp/declare-variant-8.f90: New test. * gfortran.dg/gomp/declare-variant-9.f90: New test. libgomp/ * testsuite/libgomp.fortran/declare-variant-1.f90: New test.	2021-10-14 09:16:36 -07:00
GCC Administrator	52055987fb	Daily bump.	2021-10-13 00:16:22 +00:00
Julian Brown	ccfcf08e66	libgomp: Release device lock on cbuf error path This patch releases the device lock on a sanity-checking error path in transfer combining (cbuf) handling in libgomp:target.c. This shouldn't happen when handling well-formed mapping clauses, but erroneous clauses can currently cause a hang if the condition triggers. 2021-12-10 Julian Brown <julian@codesourcery.com> libgomp/ * target.c (gomp_copy_host2dev): Release device lock on cbuf error path.	2021-10-12 06:50:26 -07:00
Tobias Burnus	f5a538e164	Fortran version of libgomp.c-c++-common/icv-{3,4}.c This adds the Fortran testsuite coverage of omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit libgomp/ * testsuite/libgomp.fortran/icv-3.f90: New. * testsuite/libgomp.fortran/icv-4.f90: New.	2021-10-12 10:54:18 +02:00
Jakub Jelinek	4096bf82a0	openmp: Add documentation for omp_{get_max, set_num}_threads and omp_{s, g}et_teams_thread_limit This patch adds documentation for these new OpenMP 5.1 APIs as well as two new environment variables - OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT. 2021-10-12 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (omp_get_max_teams, omp_get_teams_thread_limit, omp_set_num_teams, omp_set_teams_thread_limit, OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT): Document.	2021-10-12 09:35:43 +02:00
Jakub Jelinek	de7fa7063e	openmp: Fix up warnings on libgomp.info build When building libgomp documentation, I see makeinfo --split-size=5000000 -I ../../../libgomp/../gcc/doc/include -I ../../../libgomp -o libgomp.info ../../../libgomp/libgomp.texi ../../../libgomp/libgomp.texi:503: warning: node next `omp_get_default_device' in menu `omp_get_device_num' and in sectioning `omp_get_dynamic' differ ../../../libgomp/libgomp.texi:528: warning: node prev `omp_get_dynamic' in menu `omp_get_device_num' and in sectioning `omp_get_default_device' differ ../../../libgomp/libgomp.texi:560: warning: node next `omp_get_initial_device' in menu `omp_get_level' and in sectioning `omp_get_device_num' differ ../../../libgomp/libgomp.texi:587: warning: node next `omp_get_device_num' in menu `omp_get_dynamic' and in sectioning `omp_get_level' differ ../../../libgomp/libgomp.texi:587: warning: node prev `omp_get_device_num' in menu `omp_get_default_device' and in sectioning `omp_get_initial_device' differ ../../../libgomp/libgomp.texi:615: warning: node prev `omp_get_level' in menu `omp_get_initial_device' and in sectioning `omp_get_device_num' differ warnings. This patch fixes those. 2021-10-12 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (omp_get_device_num): Move @node before omp_get_dynamic to avoid makeinfo warnings.	2021-10-12 09:34:38 +02:00
Jakub Jelinek	88f5ad524a	openmp: Add testsuite coverage for omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit This adds (C/C++ only) testsuite coverage for these new OpenMP 5.1 APIs. 2021-10-12 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c-c++-common/icv-3.c: New test. * testsuite/libgomp.c-c++-common/icv-4.c: New test.	2021-10-12 09:32:28 +02:00
Jakub Jelinek	342aedf0e5	libgomp: alloc* test fixes [PR102628, PR102668] As reported, the alloc-9.c test and alloc-{1,2,3}.F* and alloc-11.f90 tests fail on powerpc64-linux with -m32. The reason why it fails just there is that malloc doesn't guarantee there 128-bit alignment (historically glibc guaranteed 2 * sizeof (void ) alignment from malloc). There are two separate issues. One is a thinko on my side. In this part of alloc-9.c test (copied to alloc-11.f90), we have 2 allocators, a with pool size 1024B and alignment 16B and default fallback and a2 with pool size 512B and alignment 32B and a as fallback allocator. We start at no allocations in both at line 194 and do: p = (int ) omp_alloc (sizeof (int), a2); // This succeeds in a2 and needs 4+overhead bytes (which includes the 32B alignment) p = (int ) omp_realloc (p, 420, a, a2); // This allocates 420 bytes+overhead in a, with 16B alignment and deallocates the above q = (int ) omp_alloc (sizeof (int), a); // This allocates 4+overhead bytes in a, with 16B alignment q = (int ) omp_realloc (q, 420, a2, a); // This allocates 420+overhead in a2 with 32B alignment q = (int ) omp_realloc (q, 768, a2, a2); // This attempts to reallocate, but as there are elevated alignment // requirements doesn't try to just realloc (even if it wanted to try that // a2 is almost full, with 512-420-overhead bytes left in it), so it // tries to alloc in a2, but there is no space left in the pool, falls // back to a, which already has 420+overhead bytes allocated in it and // 1024-420-overhead bytes left and so fails too and fails to default // non-pool allocator that allocates it, but doesn't guarantee alignment // higher than malloc guarantees. // But, the test expected 16B alignment. So, I've slightly lowered the allocation sizes in that part of the test 420->320 and 768 -> 568, so that the last test still fails to allocate in a2 (568 > 512-320-overhead) but succeeds in a as fallback, which was the intent of the test. Another thing is that alloc-1.F90 seems to be transcription of libgomp.c-c++-common/alloc-1.c into Fortran, but alloc-1.c had: q = (int ) omp_alloc (768, a2); if ((((uintptr_t) q) % 16) != 0) abort (); q[0] = 7; q[767 / sizeof (int)] = 8; r = (int ) omp_alloc (512, a2); if ((((uintptr_t) r) % __alignof (int)) != 0) abort (); there but Fortran has: cq = omp_alloc (768_c_size_t, a2) if (mod (transfer (cq, intptr), 16_c_intptr_t) /= 0) stop 12 call c_f_pointer (cq, q, [768 / c_sizeof (i)]) q(1) = 7 q(768 / c_sizeof (i)) = 8 cr = omp_alloc (512_c_size_t, a2) if (mod (transfer (cr, intptr), 16_c_intptr_t) /= 0) stop 13 I'm changing the latter to 4_c_intptr_t because other spots in the testcase do that, Fortran sadly doesn't have c_alignof, but strictly speaking it isn't correct, __alignof (int) could be on some architectures smaller than 4. So probably alloc-1.F90 etc. should also have ! { dg-additional-sources alloc-7.c } ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.' is valid for Fortran but not for C" } and use get__alignof_int. 2021-10-12 Jakub Jelinek <jakub@redhat.com> PR libgomp/102628 PR libgomp/102668 testsuite/libgomp.c-c++-common/alloc-9.c (main): Decrease allocation sizes from 420 to 320 and from 768 to 568. * testsuite/libgomp.fortran/alloc-11.f90: Likewise. * testsuite/libgomp.fortran/alloc-1.F90: Change expected alignment for cr from 16 to 4.	2021-10-12 09:30:41 +02:00
Jakub Jelinek	fab2f61dc1	vectorizer: Fix up -fsimd-cost-model= handling > * testsuite/libgomp.c++/scan-10.C: Add option -fvect-cost-model=cheap. I don't think this is the right thing to do. This just means that at some point between 2013 when -fsimd-cost-model has been introduced and now -fsimd-cost-model= option at least partially stopped working properly. As documented, -fsimd-cost-model= overrides the -fvect-cost-model= setting for OpenMP simd loops (loop->force_vectorize is true) if specified differently from default. In tree-vectorizer.h we have: static inline bool unlimited_cost_model (loop_p loop) { if (loop != NULL && loop->force_vectorize && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT) return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED; return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED); } and use it in various places, but we also just use flag_vect_cost_model in lots of places (and in one spot use flag_simd_cost_model, not sure if we are sure it is a force_vectorize loop or what). So, IMHO we should change the above inline function to loop_cost_model and let it return the cost model and then just reimplement unlimited_cost_model as return loop_cost_model (loop) == VECT_COST_MODEL_UNLIMITED; and then adjust the direct uses of the flag and revert these changes. 2021-10-12 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-vectorizer.h (loop_cost_model): New function. (unlimited_cost_model): Use it. * tree-vect-loop.c (vect_analyze_loop_costing): Use loop_cost_model call instead of flag_vect_cost_model. * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise. (vect_prune_runtime_alias_test_list): Likewise. Also use it instead of flag_simd_cost_model. gcc/testsuite/ * gcc.dg/gomp/simd-2.c: Remove option -fvect-cost-model=cheap. * gcc.dg/gomp/simd-3.c: Likewise. libgomp/ * testsuite/libgomp.c/scan-11.c: Remove option -fvect-cost-model=cheap. * testsuite/libgomp.c/scan-12.c: Likewise. * testsuite/libgomp.c/scan-13.c: Likewise. * testsuite/libgomp.c/scan-14.c: Likewise. * testsuite/libgomp.c/scan-15.c: Likewise. * testsuite/libgomp.c/scan-16.c: Likewise. * testsuite/libgomp.c/scan-17.c: Likewise. * testsuite/libgomp.c/scan-18.c: Likewise. * testsuite/libgomp.c/scan-19.c: Likewise. * testsuite/libgomp.c/scan-20.c: Likewise. * testsuite/libgomp.c/scan-21.c: Likewise. * testsuite/libgomp.c/scan-22.c: Likewise. * testsuite/libgomp.c++/scan-9.C: Likewise. * testsuite/libgomp.c++/scan-10.C: Likewise. * testsuite/libgomp.c++/scan-11.C: Likewise. * testsuite/libgomp.c++/scan-12.C: Likewise. * testsuite/libgomp.c++/scan-13.C: Likewise. * testsuite/libgomp.c++/scan-14.C: Likewise. * testsuite/libgomp.c++/scan-15.C: Likewise. * testsuite/libgomp.c++/scan-16.C: Likewise.	2021-10-12 09:28:10 +02:00
liuhongt	d61ce6ab04	Adjust testcase for O2 vectorization enabling This issue was observed in rs6000 specific PR102658 as well. I've looked into it a bit, it's caused by the "conditional store replacement" which is originally disabled without vectorization as below code. /* If either vectorization or if-conversion is disabled then do not sink any stores. / if (param_max_stores_to_sink == 0 \|\| (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize) \|\| !flag_tree_loop_if_convert) return false; The new change makes the innermost loop look like for (int c1 = 0; c1 <= 1499; c1 += 1) { if (c1 <= 500) { S_10(c0, c1); } else { S_9(c0, c1); } S_11(c0, c1); } and can not be splitted as: for (int c1 = 0; c1 <= 500; c1 += 1) S_10(c0, c1); for (int c1 = 501; c1 <= 1499; c1 += 1) S_9(c0, c1); So instead of disabling vectorization, could we just disable this cs replacement with parameter "--param max-stores-to-sink=0"? I tested this proposal on ppc64le, it should work as well. 2021-10-11 Kewen Lin <linkw@linux.ibm.com> libgomp/ChangeLog: testsuite/libgomp.graphite/force-parallel-8.c: Add --param max-stores-to-sink=0.	2021-10-12 15:24:12 +08:00
GCC Administrator	732d763847	Daily bump.	2021-10-12 00:17:02 +00:00
Marcel Vollweiler	f70977936a	libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential. The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP 5.1. This was already implemented by Jakub (C/C++, commit `ea82325afe`) and Tobias (Fortran, commit `fff15bad1a`). This patch adds two tests to check if omp_atv_serialized is available (one test for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as deprecated in C/C++ and Fortran for OpenMP 5.1. libgomp/ChangeLog: * allocator.c (omp_init_allocator): Replace omp_atv_sequential with omp_atv_serialized. * omp.h.in: Add deprecated flag for omp_atv_sequential. * omp_lib.f90.in: Add deprecated flag for omp_atv_sequential. * testsuite/libgomp.c-c++-common/alloc-10.c: New test. * testsuite/libgomp.fortran/alloc-12.f90: New test.	2021-10-11 04:34:51 -07:00
Jakub Jelinek	07dd3bcda1	openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit OpenMP 5.1 adds env vars and functions to set and query new ICVs used as fallback if thread_limit or num_teams clauses aren't specified on teams construct. The following patch implements those, though further work will be needed: 1) OpenMP 5.1 also changed the num_teams clause, so that it can specify both lower and upper limit for how many teams should be created and changed the meaning when only one expression is provided, instead of num_teams(expr) in 5.0 meaning num_teams(1:expr) in 5.1, it now means num_teams(expr:expr), i.e. while previously we could create 1 to expr teams, in 5.1 we have some low limit by default equal to the single expression provided and may not create fewer teams. For host teams (which we don't currently implement efficiently for NUMA hosts) we trivially satisfy it now by always honoring what the user asked for, but for the offloading teams I think we'll need to rethink the APIs; currently teams construct is just a call that returns and possibly lowers the number of teams; and whenever possible we try to evaluate num_teams/thread_limit already on the target construct and the GOMP_teams call just sets the number of teams to the minimum of provided and requested teams; for some cases e.g. where target is not combined with teams and num_teams expression calls some functions etc., we need to call those functions in the target region and so it is late to figure number of teams, but also hw could just limit what it is willing to create; in that case I'm afraid we need to run the target body multiple times and arrange for omp_get_team_num () returning the right values 2) we need to finally implement the NUMA handling for GOMP_teams_reg 3) I now realize I haven't added some testcase coverage, will do that incrementally 4) libgomp.texi needs updates for these new APIs, but also others like the allocator 2021-10-11 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-low.c (omp_runtime_api_call): Handle omp_get_max_teams, omp_[sg]et_teams_thread_limit and omp_set_num_teams. libgomp/ * omp.h.in (omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare. * omp_lib.f90.in (omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare. * omp_lib.h.in (omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare. * libgomp.h (gomp_nteams_var, gomp_teams_thread_limit_var): Declare. * libgomp.map (OMP_5.1): Export omp_get_max_teams{,_}, omp_get_teams_thread_limit{,_}, omp_set_num_teams{,_,_8_} and omp_set_teams_thread_limit{,_,_8_}. * icv.c (omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit): New functions. * env.c (gomp_nteams_var, gomp_teams_thread_limit_var): Define. (omp_display_env): Print OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT. (initialize_env): Handle OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT env vars. * teams.c (GOMP_teams_reg): If thread_limit is not specified, use gomp_teams_thread_limit_var as fallback if not zero. If num_teams is not specified, use gomp_nteams_var. * fortran.c (omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit): Add ialias_redirect. (omp_set_num_teams_, omp_set_num_teams_8_, omp_get_max_teams_, omp_set_teams_thread_limit_, omp_set_teams_thread_limit_8_, omp_get_teams_thread_limit_): New functions.	2021-10-11 12:20:22 +02:00
GCC Administrator	c9db17b880	Daily bump.	2021-10-10 00:16:19 +00:00
liuhongt	b4e81f6dd4	Adjust more testcases for O2 vectorization enabling. libgomp/ChangeLog: * testsuite/libgomp.c++/scan-10.C: Add option -fvect-cost-model=cheap. * testsuite/libgomp.c++/scan-11.C: Ditto. * testsuite/libgomp.c++/scan-12.C: Ditto. * testsuite/libgomp.c++/scan-13.C: Ditto. * testsuite/libgomp.c++/scan-14.C: Ditto. * testsuite/libgomp.c++/scan-15.C: Ditto. * testsuite/libgomp.c++/scan-16.C: Ditto. * testsuite/libgomp.c++/scan-9.C: Ditto. * testsuite/libgomp.c-c++-common/lastprivate-conditional-7.c: Ditto. * testsuite/libgomp.c-c++-common/lastprivate-conditional-8.c: Ditto. * testsuite/libgomp.c/scan-11.c: Ditto. * testsuite/libgomp.c/scan-12.c: Ditto. * testsuite/libgomp.c/scan-13.c: Ditto. * testsuite/libgomp.c/scan-14.c: Ditto. * testsuite/libgomp.c/scan-15.c: Ditto. * testsuite/libgomp.c/scan-16.c: Ditto. * testsuite/libgomp.c/scan-17.c: Ditto. * testsuite/libgomp.c/scan-18.c: Ditto. * testsuite/libgomp.c/scan-19.c: Ditto. * testsuite/libgomp.c/scan-20.c: Ditto. * testsuite/libgomp.c/scan-21.c: Ditto. * testsuite/libgomp.c/scan-22.c: Ditto. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr94403.C: Add -fno-tree-vectorize * gcc.dg/optimize-bswapsi-5.c: Ditto. * gcc.dg/optimize-bswapsi-6.c: Ditto. * gcc.dg/Warray-bounds-51.c: Add additional option -mtune=generic for target x86/i?86 * gcc.dg/Wstringop-overflow-14.c: Ditto.	2021-10-09 16:28:11 +08:00
Jakub Jelinek	875124eb08	openmp: Add support for OpenMP 5.1 structured-block-sequences Related to this is the addition of structured-block-sequence in OpenMP 5.1, which doesn't change anything for Fortran, but for C/C++ allows multiple statements instead of just one possibly compound around the separating directives (section and scan). I've also made some updates to the OpenMP 5.1 support list in libgomp.texi. 2021-10-09 Jakub Jelinek <jakub@redhat.com> gcc/c/ * c-parser.c (c_parser_omp_structured_block_sequence): New function. (c_parser_omp_scan_loop_body): Use it. (c_parser_omp_sections_scope): Likewise. gcc/cp/ * parser.c (cp_parser_omp_structured_block): Remove disallow_omp_attrs argument. (cp_parser_omp_structured_block_sequence): New function. (cp_parser_omp_scan_loop_body): Use it. (cp_parser_omp_sections_scope): Likewise. gcc/testsuite/ * c-c++-common/gomp/sections1.c (foo): Don't expect errors on multiple statements in between section directive(s). Add testcases for invalid no statements in between section directive(s). * gcc.dg/gomp/sections-2.c (foo): Don't expect errors on multiple statements in between section directive(s). * g++.dg/gomp/sections-2.C (foo): Likewise. * g++.dg/gomp/attrs-6.C (foo): Add testcases for multiple statements in between section directive(s). (bar): Add testcases for multiple statements in between scan directive. * g++.dg/gomp/attrs-7.C (bar): Adjust expected error recovery. libgomp/ * libgomp.texi (OpenMP 5.1): Mention implemented support for structured block sequences in C/C++. Mention support for unconstrained/reproducible modifiers on order clause. Mention partial (C/C++ only) support of extentensions to atomics construct. Mention partial (C/C++ on clause only) support of align/allocator modifiers on allocate clause.	2021-10-09 10:14:36 +02:00
GCC Administrator	e3e07b8955	Daily bump.	2021-10-03 00:16:17 +00:00
Tobias Burnus	703d8a4d39	Add libgomp.fortran/order-reproducible-.f90 libgomp/ChangeLog: testsuite/libgomp.fortran/order-reproducible-1.f90: New test based on libgomp.c-c++-common/order-reproducible-1.c. * testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise. * testsuite/libgomp.fortran/my-usleep.c: New test.	2021-10-02 11:29:35 +02:00
GCC Administrator	9d116bcc55	Daily bump.	2021-10-02 00:16:31 +00:00
Tobias Burnus	2a93d18da3	Add/update libgomp.fortran/alloc-.f90 libgomp/ChangeLog: testsuite/libgomp.fortran/alloc-10.f90: Fix alignment check. * testsuite/libgomp.fortran/alloc-7.f90: Fix array access. * testsuite/libgomp.fortran/alloc-8.f90: Likewise. * testsuite/libgomp.fortran/alloc-11.f90: New test for omp_realloc, based on libgomp.c-c++-common/alloc-9.c.	2021-10-01 20:03:25 +02:00
Jakub Jelinek	e705b8533a	openmp: Differentiate between order(concurrent) and order(reproducible:concurrent) While OpenMP 5.1 implies order(concurrent) is the same thing as order(reproducible:concurrent), this is going to change in OpenMP 5.2, where essentially order(concurrent) means nothing is stated on whether it is reproducible or unconstrained (and is determined by other means, e.g. for/do with schedule static or runtime with static being selected is implicitly reproducible, distribute with dist_schedule static is implicitly reproducible, loop is implicitly reproducible) and when the modifier is specified explicitly, it overrides the implicit behavior either way. And, when order(reproducible:concurrent) is used with e.g. schedule(dynamic) or some other schedule that is by definition not reproducible, it is implementation's duty to ensure it is reproducible, either by remembering how it scheduled some loop and then replaying the same schedule when seeing loops with the same directive/schedule/number of iterations, or by overriding the schedule to some reproducible one. This patch doesn't implement the 5.2 wording just yet, but in the FEs differentiates between the 3 states - no explicit modifier, explicit reproducible or explicit unconstrainted, so that the middle-end can easily switch any time. Instead it follows the 5.1 wording where both order(concurrent) (implicit or explicit) or order(reproducible:concurrent) imply reproducibility. And, it implements the easier method, when for/do should be reproducible, it just chooses static schedule. order(concurrent) implies no OpenMP APIs in the loop body nor threadprivate vars, so the exact scheduling isn't (easily at least) observable. 2021-10-01 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_ORDER_REPRODUCIBLE): Define. * tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_ORDER>: Print reproducible: for OMP_CLAUSE_ORDER_REPRODUCIBLE. * omp-general.c (omp_extract_for_data): If OMP_CLAUSE_ORDER is seen without OMP_CLAUSE_ORDER_UNCONSTRAINED, overwrite sched_kind to OMP_CLAUSE_SCHEDULE_STATIC. gcc/c-family/ * c-omp.c (c_omp_split_clauses): Also copy OMP_CLAUSE_ORDER_REPRODUCIBLE. gcc/c/ * c-parser.c (c_parser_omp_clause_order): Set OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier. gcc/cp/ * parser.c (cp_parser_omp_clause_order): Set OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier. gcc/fortran/ * gfortran.h (gfc_omp_clauses): Add order_reproducible bitfield. * dump-parse-tree.c (show_omp_clauses): Print REPRODUCIBLE: for it. * openmp.c (gfc_match_omp_clauses): Set order_reproducible for explicit reproducible: modifier. * trans-openmp.c (gfc_trans_omp_clauses): Set OMP_CLAUSE_ORDER_REPRODUCIBLE for order_reproducible. (gfc_split_omp_clauses): Also copy order_reproducible. gcc/testsuite/ * gfortran.dg/gomp/order-5.f90: Adjust scan-tree-dump-times regexps. libgomp/ * testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test. * testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.	2021-10-01 10:45:48 +02:00
Jakub Jelinek	3749c3aff6	openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 \| grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 \| grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.	2021-10-01 10:42:07 +02:00
Jakub Jelinek	998e434f8f	openmp: Add alloc_align attribute to omp_aligned_alloc and testcase for omp_realloc This patch adds alloc_align attribute to omp_aligned_{,c}alloc so that if the first argument is constant, GCC can assume requested alignment. Additionally, it adds testsuite coverage for omp_realloc which I haven't managed to write in the patch from yesterday. 2021-10-01 Jakub Jelinek <jakub@redhat.com> omp.h.in (omp_aligned_alloc, omp_aligned_calloc): Add __alloc_align__ (1) attribute. * testsuite/libgomp.c-c++-common/alloc-9.c: New test.	2021-10-01 10:32:10 +02:00
GCC Administrator	2467998373	Daily bump.	2021-10-01 00:16:27 +00:00
Tobias Burnus	ef37ddf477	libgomp.fortran/alloc-.f90: Add missing dg-prune-output libgomp/ testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output for -fintrinsic-modules-path= warning of the C compiler. * testsuite/libgomp.fortran/alloc-9.f90: Likewise. * testsuite/libgomp.fortran/alloc-10.f90: Likewise.	2021-09-30 14:44:06 +02:00
Tobias Burnus	70de20db23	openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran gcc/ChangeLog: * omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and omp_{c,re}alloc, fix omp_alloc/omp_free. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Set implementation status to Y for omp_aligned_{,c}alloc and omp_{c,re}alloc routines. * omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * testsuite/libgomp.fortran/alloc-10.f90: New test. * testsuite/libgomp.fortran/alloc-6.f90: New test. * testsuite/libgomp.fortran/alloc-7.c: New test. * testsuite/libgomp.fortran/alloc-7.f90: New test. * testsuite/libgomp.fortran/alloc-8.f90: New test. * testsuite/libgomp.fortran/alloc-9.f90: New test.	2021-09-30 14:26:46 +02:00
Jakub Jelinek	b38a4bd102	openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc This patch adds new OpenMP 5.1 allocator entrypoints and in addition to that fixes an omp_alloc bug which is hard to test for - if the first allocator fails but has a larger alignment trait and has a fallback allocator, either the default behavior or a user fallback, then the extra alignment will be used even in the fallback allocation, rather than just starting with whatever alignment has been requested (in GOMP_alloc or the minimum one in omp_alloc). Jonathan's comment on IRC this morning made me realize that I should add alloc_align attributes to 2 of the prototypes and I still need to add testsuite coverage for omp_realloc, will do that in a follow-up. 2021-09-30 Jakub Jelinek <jakub@redhat.com> * omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc, omp_realloc): New prototypes. (omp_alloc): Move after omp_free prototype, add __malloc__ (omp_free) attribute. * allocator.c: Include string.h. (omp_aligned_alloc): No longer static, add ialias. Add new_alignment variable and use it instead of alignment so that when retrying the old alignment is used again. Don't retry if new alignment is the same as old alignment, unless allocator had pool size. (omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call. (omp_aligned_calloc, omp_calloc, omp_realloc): New functions. * libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc, omp_aligned_calloc and omp_realloc. * testsuite/libgomp.c-c++-common/alloc-4.c (main): Add omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests. * testsuite/libgomp.c-c++-common/alloc-5.c: New test. * testsuite/libgomp.c-c++-common/alloc-6.c: New test. * testsuite/libgomp.c-c++-common/alloc-7.c: New test. * testsuite/libgomp.c-c++-common/alloc-8.c: New test.	2021-09-30 09:30:18 +02:00
GCC Administrator	fd1334791e	Daily bump.	2021-09-29 00:16:26 +00:00
Tobias Burnus	1f0a57bd54	libgomp: Only check for 2sizeof(void) int type with Fortran [PR96661] The depend type is a struct with two pointer members for C/C++ - but for Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus, libgomp's configure checks that an integer type/kind with size 2sizeof(void) is available. However, this integer type/kind is not needed when building without Fortran support. Thus, only check this when Fortran is enabled. libgomp/ PR libgomp/96661 * configure.ac: Only check for int-type = 2size_t support when building with Fortran support. configure: Regenerate.	2021-09-28 15:15:47 +02:00
Thomas Schwinge	a43ae03a05	Further test case adjustment re "Fortran: Fix assumed-size to assumed-rank passing" Fix-up for recent commit `00f6de9c69` "Fortran: Fix assumed-size to assumed-rank passing [PR94070]", and commit `da1f6391b7` "libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note". Due to use of '#if !ACC_MEM_SHARED' conditionals in 'libgomp.oacc-fortran/if-1.f90', 'target { ! openacc_host_selected }' needs some special care (ignoring the pre-existing mismatch of 'ACC_MEM_SHARED' vs. 'openacc_host_selected'). As seen with GCN offloading, we need to revert to another bit of the original code in 'libgomp.oacc-fortran/privatized-ref-2.f90'. libgomp/ * testsuite/libgomp.oacc-fortran/if-1.f90: Adjust. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.	2021-09-28 14:18:21 +02:00
GCC Administrator	cf966403d9	Daily bump.	2021-09-28 00:16:21 +00:00
Aldy Hernandez	0288527f47	Replace VRP threader with a hybrid forward threader. This patch implements the new hybrid forward threader and replaces the embedded VRP threader with it. With all the pieces that have gone in, the implementation of the hybrid threader is straightforward: convert the current state into SSA imports that the solver will understand, and let the path solver precompute ranges and relations for the path. After this setup is done, we can use the range_query API to solve gimple statements in the threader. The forward threader is now engine agnostic so there are no changes to the threader per se. I have put the hybrid bits in tree-ssa-threadedge., instead of VRP, because they will also be used in the evrp removal of the DOM/threader, which is my next task. Most of the patch, is actually test changes. I have gone through every single one and verified that we're correct. Most were trivial dump file name changes, but others required going through the IL an certifying that the different IL was expected. For example, in pr59597.c, we have one less thread because the ASSERT_EXPR was getting in the way, and making it seem like things were not crossing loops. The hybrid threader sees the correct representation of the IL, and avoids threading this one case. The final numbers are a 12.16% improvement in jump threads immediately after VRP, and a 0.82% improvement in overall jump threads. The performance drop is 0.6% (plus the 1.43% hit from moving the embedded threader into its own pass). As I've said, I'd prefer to keep the threader in its own pass, but if this is an issue, we can address this with a shared ranger when VRP is replaced with an evrp instance (upcoming). Note, that these numbers are slightly different than what I originally posted. A few correctness tweaks, plus restricting loop threads, made the difference. That being said, I was aiming for par. A 12% gain is just gravy ;-). When we merge the threaders, we should see even better numbers-- and we'll have the benefit of an entire release stress testing the solver. As I mentioned in my introductory note, paths ending in MEM_REF conditional are missing. In reality, this didn't make a difference, as it was so rare. However, as a follow-up, I will distill a test and add a suitable PR to keep us honest. There is a one-line change to libgomp/team.c silencing a new used uninitialized warning. As my previous work with the threaders has shown, warnings flare up after each improvement to jump threading. I expect this to be no different. I've promised Jakub to investigate fully, so I will analyze and add the appropriate PR for the warning experts. Oh yeah, the new pass dump is called vrp-threader[12] to match each VRP[12] pass. However, there's no reason for it to either be named vrp-threader, or for it to live in tree-vrp.c. Tested on x86-64 Linux. OK? p.s. "Did I say 5 weeks? My bad, I meant 5 months." gcc/ChangeLog: passes.def (pass_vrp_threader): New. * tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader. * tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New. (hybrid_jt_simplifier::hybrid_jt_simplifier): New. (hybrid_jt_simplifier::simplify): New. (hybrid_jt_simplifier::compute_ranges_from_state): New. * tree-ssa-threadedge.h (class hybrid_jt_state): New. (class hybrid_jt_simplifier): New. * tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump threader. (class hybrid_threader): New. (hybrid_threader::hybrid_threader): New. (hybrid_threader::~hybrid_threader): New. (hybrid_threader::before_dom_children): New. (hybrid_threader::after_dom_children): New. (execute_vrp_threader): New. (class pass_vrp_threader): New. (make_pass_vrp_threader): New. libgomp/ChangeLog: * team.c: Initialize start_data. * testsuite/libgomp.graphite/force-parallel-4.c: Adjust. * testsuite/libgomp.graphite/force-parallel-8.c: Adjust. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr55107.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust. * gcc.dg/tree-ssa/pr21559.c: Adjust. * gcc.dg/tree-ssa/pr59597.c: Adjust. * gcc.dg/tree-ssa/pr61839_1.c: Adjust. * gcc.dg/tree-ssa/pr61839_3.c: Adjust. * gcc.dg/tree-ssa/pr71437.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust. * gcc.dg/tree-ssa/ssa-thread-14.c: Adjust. * gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust. * gcc.dg/tree-ssa/vrp106.c: Adjust. * gcc.dg/tree-ssa/vrp55.c: Adjust.	2021-09-27 17:39:51 +02:00
Tobias Burnus	da1f6391b7	libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note In my last commit, r12-3897-g00f6de9c69119594f7dad3bd525937c94c8200d0, which inlined array-size code, I had to update the expected output. However, in doing so, I accidentally (copy'n'paste) changed dg-note into dg-message. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Change dg-message back to dg-note.	2021-09-27 14:33:39 +02:00
Tobias Burnus	00f6de9c69	Fortran: Fix assumed-size to assumed-rank passing [PR94070] This code inlines the size0 and size1 libgfortran calls, the former is still used by libgfortan itself (and by old code). Besides permitting more optimizations, it also permits to handle assumed-rank dummies better: If the dummy argument is a nonpointer/nonallocatable, an assumed-size actual arg is repesented by having ubound == -1 for the last dimension. However, for allocatable/pointers, this value can also exist. Hence, the dummy arg attr has to be honored. For that reason, when calling an assumed-rank procedure with nonpointer, nonallocatable dummy arguments, the bounds have to be updated to avoid the case ubound == -1 for the last dimension. PR fortran/94070 gcc/fortran/ChangeLog: * trans-array.c (gfc_tree_array_size): New function to find size inline (whole array or one dimension). (array_parameter_size): Use it, take stmt_block as arg. (gfc_conv_array_parameter): Update call. * trans-array.h (gfc_tree_array_size): Add prototype. * trans-decl.c (gfor_fndecl_size0, gfor_fndecl_size1): Remove these global vars. (gfc_build_intrinsic_function_decls): Remove their initialization. * trans-expr.c (gfc_conv_procedure_call): Update bounds of pointer/allocatable actual args to nonallocatable/nonpointer dummies to be one based. * trans-intrinsic.c (gfc_conv_intrinsic_shape): Fix case for assumed rank with allocatable/pointer dummy. (gfc_conv_intrinsic_size): Update to use inline function. * trans.h (gfor_fndecl_size0, gfor_fndecl_size1): Remove var decl. libgfortran/ChangeLog: * intrinsics/size.c (size0, size1): Comment that now not used by newer compiler code. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Update expected dg-note output. gcc/testsuite/ChangeLog: * gfortran.dg/c-interop/cf-out-descriptor-6.f90: Remove xfail. * gfortran.dg/c-interop/size.f90: Remove xfail. * gfortran.dg/intrinsic_size_3.f90: Update scan-tree-dump-times. * gfortran.dg/transpose_optimization_2.f90: Likewise. * gfortran.dg/size_optional_dim_1.f90: Add scan-tree-dump-not. * gfortran.dg/assumed_rank_22.f90: New test. * gfortran.dg/assumed_rank_22_aux.c: New test.	2021-09-27 14:04:54 +02:00
GCC Administrator	e4777439fc	Daily bump.	2021-09-23 00:16:29 +00:00
Tobias Burnus	83aac69883	Fortran: Improve -Wmissing-include-dirs warnings [PR55534] It turned out that enabling the -Wmissing-include-dirs for libcpp did output too many warnings – at least as run with -B and similar options during the GCC build and warning for internal include dirs like finclude, unlikely of relevance to for a real-world user. This patch now only warns for -I and -J by default but permits to get the full warnings including libcpp ones with -Wmissing-include-dirs. It additionally documents this in the manual. With that change, the -Wno-missing-include-dirs could be removed from libgfortran's configure and libgomp's testsuite always cflags. This reverts those bits of the previous commit r12-3722-g417ea5c02cef7f000e66d1af22b066c2c1cda047 Additionally, it turned out that all call to load_file called exit explicitly - except for the main file via gfc_init -> gfc_new_file. The latter also output a file not existing fatal error, such that two errors where printed. Now exit is called in line with the other users of load_file. Finally, when compileing with "nonexisting/file.f90", first a warning that "nonexisting" does not exist as include path was printed before the file not found error was printed. Now the directory in which the physical file is located is added silently, relying on the file-not-found diagnostic for those. PR fortran/55534 gcc/ChangeLog: * doc/invoke.texi (-Wno-missing-include-dirs.): Document Fortran behavior. gcc/fortran/ChangeLog: * cpp.c (gfc_cpp_register_include_paths, gfc_cpp_post_options): Add new bool verbose_missing_dir_warn argument. * cpp.h (gfc_cpp_post_options): Update prototype. * f95-lang.c (gfc_init): Remove duplicated file-not found diag. * gfortran.h (gfc_check_include_dirs): Takes bool verbose_missing_dir_warn arg. (gfc_new_file): Returns now void. * options.c (gfc_post_options): Update to warn for -I and -J, only, by default but for all when user requested. * scanner.c (gfc_do_check_include_dir): (gfc_do_check_include_dirs, gfc_check_include_dirs): Take bool verbose warn arg and update to avoid printing the same message twice or never. (load_file): Fix indent. (gfc_new_file): Return void and exit when load_file failed as all other load_file users do. libgfortran/ChangeLog: * configure.ac (AM_FCFLAGS): Revert r12-3722 by removing -Wno-missing-include-dirs. * configure: Regenerate. libgomp/ChangeLog: * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Revert r12-3722 by removing -Wno-missing-include-dirs. * testsuite/libgomp.oacc-fortran/fortran.exp (ALWAYS_CFLAGS): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/include_14.f90: Add -J testcase and update dg-output. * gfortran.dg/include_15.f90: Likewise. * gfortran.dg/include_16.f90: Likewise. * gfortran.dg/include_17.f90: Likewise. * gfortran.dg/include_18.f90: Likewise. * gfortran.dg/include_19.f90: Likewise.	2021-09-22 20:58:35 +02:00
Jakub Jelinek	059b819e3c	openmp: Add support for allocator and align modifiers on allocate clauses As the allocate-2.c testcase shows, this change isn't 100% backwards compatible, one could have allocate and/or align functions that return an OpenMP allocator handle and previously it would call those functions and now would use those names as keywords for the modifiers. But it allows specify extra alignment requirements for the allocations. 2021-09-22 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_ALLOCATE_ALIGN): Define. * tree.c (omp_clause_num_ops): Change number of OMP_CLAUSE_ALLOCATE arguments from 2 to 3. * tree-pretty-print.c (dump_omp_clause): Print allocator() around allocate clause allocator and print align if present. * omp-low.c (scan_sharing_clauses): Force allocate_map entry even for omp_default_mem_alloc if align modifier is present. If align modifier is present, use TREE_LIST to encode both allocator and align. (lower_private_allocate, lower_rec_input_clauses, create_task_copyfn): Handle align modifier on allocator clause if present. gcc/c-family/ * c-omp.c (c_omp_split_clauses): Copy over OMP_CLAUSE_ALLOCATE_ALIGN. gcc/c/ * c-parser.c (c_parser_omp_clause_allocate): Parse allocate clause modifiers. gcc/cp/ * parser.c (cp_parser_omp_clause_allocate): Parse allocate clause modifiers. * semantics.c (finish_omp_clauses) <OMP_CLAUSE_ALLOCATE>: Perform semantic analysis of OMP_CLAUSE_ALLOCATE_ALIGN. * pt.c (tsubst_omp_clauses) <case OMP_CLAUSE_ALLOCATE>: Handle also OMP_CLAUSE_ALLOCATE_ALIGN. gcc/testsuite/ * c-c++-common/gomp/allocate-6.c: New test. * c-c++-common/gomp/allocate-7.c: New test. * g++.dg/gomp/allocate-4.C: New test. libgomp/ * testsuite/libgomp.c-c++-common/allocate-2.c: New test. * testsuite/libgomp.c-c++-common/allocate-3.c: New test.	2021-09-22 09:29:13 +02:00
GCC Administrator	2c41dd82e2	Daily bump.	2021-09-22 00:16:28 +00:00
Tobias Burnus	417ea5c02c	Fortran: Fix -Wno-missing-include-dirs handling [PR55534] gcc/fortran/ChangeLog: PR fortran/55534 * cpp.c: Define GCC_C_COMMON_C for #include "options.h" to make cpp_reason_option_codes available. (gfc_cpp_register_include_paths): Make static, set pfile's warn_missing_include_dirs and move before caller. (gfc_cpp_init_cb): New, cb code moved from ... (gfc_cpp_init_0): ... here. (gfc_cpp_post_options): Call gfc_cpp_init_cb. (cb_cpp_diagnostic_cpp_option): New. As implemented in c-family to match CppReason flags to -W... names. (cb_cpp_diagnostic): Use it to replace single special case. * cpp.h (gfc_cpp_register_include_paths): Remove as now static. * gfortran.h (gfc_check_include_dirs): New prototype. (gfc_add_include_path): Add new bool arg. * options.c (gfc_init_options): Don't set -Wmissing-include-dirs. (gfc_post_options): Set it here after commandline processing. Call gfc_add_include_path with defer_warn=false. (gfc_handle_option): Call it with defer_warn=true. * scanner.c (gfc_do_check_include_dir, gfc_do_check_include_dirs, gfc_check_include_dirs): New. Diagnostic moved from ... (add_path_to_list): ... here, which came before cmdline processing. Take additional bool defer_warn argument. (gfc_add_include_path): Take additional defer_warn arg. * scanner.h (struct gfc_directorylist): Reorder for alignment issues, add new 'bool warn'. libgfortran/ChangeLog: PR fortran/55534 * configure.ac (AM_FCFLAGS): Add -Wno-missing-include-dirs. * configure: Regenerate. libgomp/ChangeLog: PR fortran/55534 * testsuite/libgomp.fortran/fortran.exp: Add -Wno-missing-include-dirs to ALWAYS_CFLAGS. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/include_6.f90: Change dg-error to dg-warning and update pattern. * gfortran.dg/include_14.f90: New test. * gfortran.dg/include_15.f90: New test. * gfortran.dg/include_16.f90: New test. * gfortran.dg/include_17.f90: New test. * gfortran.dg/include_18.f90: New test. * gfortran.dg/include_19.f90: New test. * gfortran.dg/include_20.f90: New test. * gfortran.dg/include_21.f90: New test.	2021-09-21 08:28:30 +02:00
GCC Administrator	cf74e7b57b	Daily bump.	2021-09-19 00:16:29 +00:00
Jakub Jelinek	e5597f2ad5	openmp: Allow private or firstprivate arguments to default clause even for C/C++ OpenMP 5.1 allows default(private) or default(firstprivate) even in C/C++, but it behaves the same way as in Fortran only for variables not declared at namespace or file scope. For the namespace/file scope variables it instead behaves as default(none). 2021-09-18 Jakub Jelinek <jakub@redhat.com> gcc/ * gimplify.c (omp_default_clause): For C/C++ default({,first}private), if file/namespace scope variable doesn't have predetermined sharing, treat it as if there was default(none). gcc/c/ * c-parser.c (c_parser_omp_clause_default): Handle private and firstprivate arguments, adjust diagnostics on unknown argument. gcc/cp/ * parser.c (cp_parser_omp_clause_default): Handle private and firstprivate arguments, adjust diagnostics on unknown argument. * cp-gimplify.c (cxx_omp_finish_clause): Handle OMP_CLAUSE_PRIVATE. gcc/testsuite/ * c-c++-common/gomp/default-2.c: New test. * c-c++-common/gomp/default-3.c: New test. * g++.dg/gomp/default-1.C: New test. libgomp/ * testsuite/libgomp.c++/default-1.C: New test. * testsuite/libgomp.c-c++-common/default-1.c: New test. * libgomp.texi (OpenMP 5.1): Mark "private and firstprivate argument to default clause in C and C++" as implemented.	2021-09-18 09:47:25 +02:00
GCC Administrator	0a4cb43932	Daily bump.	2021-09-18 00:16:36 +00:00
Julian Brown	2a3f9f6532	openacc: Shared memory layout optimisation This patch implements an algorithm to lay out local data-share (LDS) space. It currently works for AMD GCN. At the moment, LDS is used for three things: 1. Gang-private variables 2. Reduction temporaries (accumulators) 3. Broadcasting for worker partitioning After the patch is applied, (2) and (3) are placed at preallocated locations in LDS, and (1) continues to be handled by the backend (as it is at present prior to this patch being applied). LDS now looks like this: +--------------+ (gang-private size + 1024, = 1536) \| free space \| \| ... \| \| - - - - - - -\| \| worker bcast \| +--------------+ \| reductions \| +--------------+ <<< -mgang-private-size=<number> (def. 512) \| gang-private \| \| vars \| +--------------+ (32) \| low LDS vars \| +--------------+ LDS base So, gang-private space is fixed at a constant amount at compile time (which can be increased with a command-line switch if necessary for some given code). The layout algorithm takes out a slice of the remainder of usable space for reduction vars, and uses the rest for worker partitioning. The partitioning algorithm works as follows. 1. An "adjacency" set is built up for each basic block that might do a broadcast. This is calculated by starting at each such block, and doing a recursive DFS walk over successors to find the next block (or blocks) that also does a broadcast (dfs_broadcast_reachable_1). 2. The adjacency set is inverted to get adjacent predecessor blocks also. 3. Blocks that will perform a broadcast are sorted by size of that broadcast: the biggest blocks are handled first. 4. A splay tree structure is used to calculate the spans of LDS memory that are already allocated by the blocks adjacent to this one (merge_ranges{,_1}. 5. The current block's broadcast space is allocated from the first free span not allocated in the splay tree structure calculated above (first_fit_range). This seems to work quite nicely and efficiently with the splay tree structure. 6. Continue with the next-biggest broadcast block until we're done. In this way, "adjacent" broadcasts will not use the same piece of LDS memory. PR96334 "openacc: Unshare reduction temporaries for GCN" got merged in: The GCN backend uses tree nodes like MEM((__lds TYPE ) <constant>) for reduction temporaries. Unlike e.g. var decls and SSA names, these nodes cannot be shared during gimplification, but are so in some circumstances. This is detected when appropriate --enable-checking options are used. This patch unshares such nodes when they are reused more than once. gcc/ config/gcn/gcn-protos.h (gcn_goacc_create_worker_broadcast_record): Update prototype. * config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Use preallocated block of LDS memory. Do not cache/share decls for reduction temporaries between invocations. (gcn_goacc_reduction_teardown): Unshare VAR on second use. (gcn_goacc_create_worker_broadcast_record): Add OFFSET parameter and return temporary LDS space at that offset. Return pointer in "sender" case. * config/gcn/gcn.c (acc_lds_size, gang_private_hwm, lds_allocs): New global vars. (ACC_LDS_SIZE): Define as acc_lds_size. (gcn_init_machine_status): Don't initialise lds_allocated, lds_allocs, reduc_decls fields of machine function struct. (gcn_option_override): Handle default size for gang-private variables and -mgang-private-size option. (gcn_expand_prologue): Use LDS_SIZE instead of LDS_SIZE-1 when initialising M0_REG. (gcn_shared_mem_layout): New function. (gcn_print_lds_decl): Update comment. Use global lds_allocs map and gang_private_hwm variable. (TARGET_GOACC_SHARED_MEM_LAYOUT): Define target hook. * config/gcn/gcn.h (machine_function): Remove lds_allocated, lds_allocs, reduc_decls. Add reduction_base, reduction_limit. * config/gcn/gcn.opt (gang_private_size_opt): New global. (mgang-private-size=): New option. * doc/tm.texi.in (TARGET_GOACC_SHARED_MEM_LAYOUT): Place documentation hook. * doc/tm.texi: Regenerate. * omp-oacc-neuter-broadcast.cc (targhooks.h, diagnostic-core.h): Add includes. (build_sender_ref): Handle sender_decl being pointer. (worker_single_copy): Add PLACEMENT and ISOLATE_BROADCASTS parameters. Pass placement argument to create_worker_broadcast_record hook invocations. Handle sender_decl being pointer and isolate_broadcasts inserting extra barriers. (blk_offset_map_t): Add typedef. (neuter_worker_single): Add BLK_OFFSET_MAP parameter. Pass preallocated range to worker_single_copy call. (dfs_broadcast_reachable_1): New function. (idx_decl_pair_t, used_range_vec_t): New typedefs. (sort_size_descending): New function. (addr_range): New class. (splay_tree_compare_addr_range, splay_tree_free_key) (first_fit_range, merge_ranges_1, merge_ranges): New functions. (execute_omp_oacc_neuter_broadcast): Rename to... (oacc_do_neutering): ... this. Add BOUNDS_LO, BOUNDS_HI parameters. Arrange layout of shared memory for broadcast operations. (execute_omp_oacc_neuter_broadcast): New function. (pass_omp_oacc_neuter_broadcast::gate): Remove num_workers==1 handling from here. Enable pass for all OpenACC routines in order to call shared memory-layout hook. * target.def (create_worker_broadcast_record): Add OFFSET parameter. (shared_mem_layout): New hook. libgomp/ * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Update.	2021-09-17 21:04:30 +02:00
Julian Brown	8251f90e87	Add 'libgomp.oacc-c-c++-common/broadcast-many.c' libgomp/ * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: New test.	2021-09-17 21:04:29 +02:00
Jakub Jelinek	4a7842bb99	libgomp: Spelling error fix in OpenMP 5.1 conformance section Fix spelling of OpenMP directive declare variant. 2021-09-17 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (OpenMP 5.1): Spelling fix, declare variante -> declare variant.	2021-09-17 12:34:27 +02:00
Jakub Jelinek	3a2bcffac6	openmp: Add support for OpenMP 5.1 atomics for C++ Besides the C++ FE changes, I've noticed that the C FE didn't reject #pragma omp atomic capture compare { v = x; x = y; } and other forms of atomic swap, this patch fixes that too. And the c-family/ routine needed quite a few changes so that the new code in it works fine with both FEs. 2021-09-17 Jakub Jelinek <jakub@redhat.com> gcc/c-family/ * c-omp.c (c_finish_omp_atomic): Avoid creating TARGET_EXPR if test is true, use create_tmp_var_raw instead of create_tmp_var and add a zero initializer to TARGET_EXPRs that had NULL initializer. When omitting operands after v = x, use type of v rather than type of x. Fix type of vtmp TARGET_EXPR. gcc/c/ * c-parser.c (c_parser_omp_atomic): Reject atomic swap if capture is true. gcc/cp/ * cp-tree.h (finish_omp_atomic): Add r and weak arguments. * parser.c (cp_parser_omp_atomic): Update function comment for OpenMP 5.1 atomics, parse OpenMP 5.1 atomics and fail, compare and weak clauses. * semantics.c (finish_omp_atomic): Add r and weak arguments, handle them, handle COND_EXPRs. * pt.c (tsubst_expr): Adjust for COND_EXPR forms that finish_omp_atomic can now produce. gcc/testsuite/ * c-c++-common/gomp/atomic-18.c: Expect same diagnostics in C++ as in C. * c-c++-common/gomp/atomic-25.c: Drop c effective target. * c-c++-common/gomp/atomic-26.c: Likewise. * c-c++-common/gomp/atomic-27.c: Likewise. * c-c++-common/gomp/atomic-28.c: Likewise. * c-c++-common/gomp/atomic-29.c: Likewise. * c-c++-common/gomp/atomic-30.c: Likewise. Adjust expected diagnostics for C++ when it differs from C. (foo): Change return type from double to void. * g++.dg/gomp/atomic-5.C: Adjust expected diagnostics wording. * g++.dg/gomp/atomic-20.C: New test. libgomp/ * testsuite/libgomp.c-c++-common/atomic-19.c: Drop c effective target. Use /* / comments instead of //. testsuite/libgomp.c-c++-common/atomic-20.c: Likewise. * testsuite/libgomp.c-c++-common/atomic-21.c: Likewise. * testsuite/libgomp.c++/atomic-16.C: New test. * testsuite/libgomp.c++/atomic-17.C: New test.	2021-09-17 11:28:31 +02:00
GCC Administrator	a26206ec7b	Daily bump.	2021-09-11 00:16:27 +00:00
Jakub Jelinek	8122fbff77	openmp: Implement OpenMP 5.1 atomics, so far for C only This patch implements OpenMP 5.1 atomics (with clarifications from upcoming 5.2). The most important changes are that it is now possible to write (for C/C++, for Fortran it was possible before already) min/max atomics and more importantly compare and exchange in various forms. Also, acq_rel is now allowed on read/write and acq_rel/acquire are allowed on update, and there are new compare, weak and fail clauses. 2021-09-10 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-core.h (enum omp_memory_order): Add OMP_MEMORY_ORDER_MASK, OMP_FAIL_MEMORY_ORDER_UNSPECIFIED, OMP_FAIL_MEMORY_ORDER_RELAXED, OMP_FAIL_MEMORY_ORDER_ACQUIRE, OMP_FAIL_MEMORY_ORDER_RELEASE, OMP_FAIL_MEMORY_ORDER_ACQ_REL, OMP_FAIL_MEMORY_ORDER_SEQ_CST and OMP_FAIL_MEMORY_ORDER_MASK enumerators. (OMP_FAIL_MEMORY_ORDER_SHIFT): Define. * gimple-pretty-print.c (dump_gimple_omp_atomic_load, dump_gimple_omp_atomic_store): Print [weak] for weak atomic load/store. * gimple.h (enum gf_mask): Change GF_OMP_ATOMIC_MEMORY_ORDER to 6-bit mask, adjust GF_OMP_ATOMIC_NEED_VALUE value and add GF_OMP_ATOMIC_WEAK. (gimple_omp_atomic_weak_p, gimple_omp_atomic_set_weak): New inline functions. * tree.h (OMP_ATOMIC_WEAK): Define. * tree-pretty-print.c (dump_omp_atomic_memory_order): Adjust for fail memory order being encoded in the same enum and also print fail clause if present. (dump_generic_node): Print weak clause if OMP_ATOMIC_WEAK. * gimplify.c (goa_stabilize_expr): Add target_expr and rhs arguments, handle pre_p == NULL case as a test mode that only returns value but doesn't change gimplify nor change anything otherwise, adjust recursive calls, add MODIFY_EXPR, ADDR_EXPR, COND_EXPR, TARGET_EXPR and CALL_EXPR handling, adjust COMPOUND_EXPR handling for __builtin_clear_padding calls, for !rhs gimplify as lvalue rather than rvalue. (gimplify_omp_atomic): Adjust goa_stabilize_expr caller. Handle COND_EXPR rhs. Set weak flag on gimple load/store for OMP_ATOMIC_WEAK. * omp-expand.c (omp_memory_order_to_fail_memmodel): New function. (omp_memory_order_to_memmodel): Adjust for fail clause encoded in the same enum. (expand_omp_atomic_cas): New function. (expand_omp_atomic_pipeline): Use omp_memory_order_to_fail_memmodel function. (expand_omp_atomic): Attempt to optimize atomic compare and exchange using expand_omp_atomic_cas. gcc/c-family/ * c-common.h (c_finish_omp_atomic): Add r and weak arguments. * c-omp.c: Include gimple-fold.h. (c_finish_omp_atomic): Add r and weak arguments. Add support for OpenMP 5.1 atomics. gcc/c/ * c-parser.c (c_parser_conditional_expression): If omp_atomic_lhs and cond.value is >, < or == with omp_atomic_lhs as one of the operands, don't call build_conditional_expr, instead build a COND_EXPR directly. (c_parser_binary_expression): Avoid calling parser_build_binary_op if omp_atomic_lhs even in more cases for >, < or ==. (c_parser_omp_atomic): Update function comment for OpenMP 5.1 atomics, parse OpenMP 5.1 atomics and fail, compare and weak clauses, allow acq_rel on atomic read/write and acq_rel/acquire clauses on update. * c-typeck.c (build_binary_op): For flag_openmp only handle MIN_EXPR/MAX_EXPR. gcc/cp/ * parser.c (cp_parser_omp_atomic): Allow acq_rel on atomic read/write and acq_rel/acquire clauses on update. * semantics.c (finish_omp_atomic): Adjust c_finish_omp_atomic caller. gcc/testsuite/ * c-c++-common/gomp/atomic-17.c (foo): Add tests for atomic read, write or update with acq_rel clause and atomic update with acquire clause. * c-c++-common/gomp/atomic-18.c (foo): Adjust expected diagnostics wording, remove tests moved to atomic-17.c. * c-c++-common/gomp/atomic-21.c: Expect only 2 omp atomic release and 2 omp atomic acq_rel directives instead of 4 omp atomic release. * c-c++-common/gomp/atomic-25.c: New test. * c-c++-common/gomp/atomic-26.c: New test. * c-c++-common/gomp/atomic-27.c: New test. * c-c++-common/gomp/atomic-28.c: New test. * c-c++-common/gomp/atomic-29.c: New test. * c-c++-common/gomp/atomic-30.c: New test. * c-c++-common/goacc-gomp/atomic.c: Expect 1 omp atomic release and 1 omp atomic_acq_rel instead of 2 omp atomic release directives. * gcc.dg/gomp/atomic-5.c: Adjust expected error diagnostic wording. * g++.dg/gomp/atomic-18.C:Expect 4 omp atomic release and 1 omp atomic_acq_rel instead of 5 omp atomic release directives. libgomp/ * testsuite/libgomp.c-c++-common/atomic-19.c: New test. * testsuite/libgomp.c-c++-common/atomic-20.c: New test. * testsuite/libgomp.c-c++-common/atomic-21.c: New test.	2021-09-10 20:41:33 +02:00
GCC Administrator	b2748138c0	Daily bump.	2021-09-08 00:16:23 +00:00
Tobias Burnus	ff7bc505b1	libgomp.texi: Extend OpenMP 5.0 Implementation Status libgomp/ * libgomp.texi (OpenMP Implementation Status): Extend OpenMP 5.0 section. (OpenACC Profiling Interface): Fix typo.	2021-09-07 18:30:25 +02:00
Tobias Burnus	cff72ef4e2	libgomp.texi: Add OpenMP Implementation Status libgomp/ * libgomp.texi (Enabling OpenMP): Refer to OMP spec in general not to 4.5; link to new section. (OpenMP Implementation Status): New.	2021-09-07 11:01:38 +02:00
GCC Administrator	9f99555f29	Daily bump.	2021-09-07 00:16:34 +00:00
Thomas Schwinge	086bb917d6	'libgomp.c/target-43.c': '-latomic' for nvptx offloading ... to avoid a regression with recent commit `090f0d78f1` "openmp: Improve expand_omp_atomic_pipeline": unresolved symbol __atomic_compare_exchange_1 collect2: error: ld returned 1 exit status mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status libgomp/ * testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.	2021-09-06 11:51:13 +02:00
GCC Administrator	7b7395409c	Daily bump.	2021-09-04 00:16:38 +00:00
Tobias Burnus	4ce90454c2	libgomp./error-1.{c,f90}: Fix dg-output newline pattern libgomp/ChangeLog: testsuite/libgomp.c-c++-common/error-1.c: Use \r\n not \n\r in dg-output. * testsuite/libgomp.fortran/error-1.f90: Likewise.	2021-09-03 15:27:00 +02:00
GCC Administrator	38b19c5b08	Daily bump.	2021-08-24 00:17:00 +00:00
Thomas Schwinge	29c355f76c	Add 'libgomp.c/address-space-1.c' Intel MIC (emulated) offloading execution failure remains to be analyzed. libgomp/ * testsuite/libgomp.c/address-space-1.c: New file. Co-authored-by: Jakub Jelinek <jakub@redhat.com>	2021-08-23 17:46:08 +02:00
Thomas Schwinge	bb75b22aba	Allow matching Intel MIC in OpenMP 'declare variant' ..., and use that to improve XFAILing for Intel MIC offloading execution instead of compilation in 'libgomp.c-c++-common/target-45.c', 'libgomp.fortran/target10.f90'. gcc/ * config/i386/i386-options.c (ix86_omp_device_kind_arch_isa) <omp_device_arch> [ACCEL_COMPILER]: Match "intel_mic". * config/i386/t-omp-device (omp-device-properties-i386) <arch>: Add "intel_mic". libgomp/ * testsuite/lib/libgomp.exp (check_effective_target_offload_target_intelmic): Remove 'proc'. (check_effective_target_offload_device_intel_mic): New 'proc'. * testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_intel_mic, on_device_arch_intel_mic): New. * testsuite/libgomp.c-c++-common/target-45.c: Use that for 'dg-xfail-run-if'. * testsuite/libgomp.fortran/target10.f90: Likewise.	2021-08-23 17:45:40 +02:00
Tobias Burnus	d4de7e32ef	Fortran/OpenMP: strict modifier on grainsize/num_tasks This patch adds support for the 'strict' modifier on grainsize/num_tasks clauses, an OpenMP 5.1 feature supported in C/C++ since commit r12-3066-g3bc75533d1f87f0617be6c1af98804f9127ec637 gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle 'strict' modifier on grainsize/num_tasks * gfortran.h (gfc_omp_clauses): Add grainsize_strict and num_tasks_strict. * trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses): Handle 'strict' modifier on grainsize/num_tasks. * openmp.c (gfc_match_omp_clauses): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/taskloop-4-a.f90: New test. * testsuite/libgomp.fortran/taskloop-4.f90: New test. * testsuite/libgomp.fortran/taskloop-5-a.f90: New test. * testsuite/libgomp.fortran/taskloop-5.f90: New test.	2021-08-23 15:15:30 +02:00
Jakub Jelinek	3bc75533d1	openmp: Add support for strict modifier on grainsize/num_tasks clauses With strict: modifier on these clauses, the standard is explicit about how many iterations (and which) each generated task of taskloop directive should contain. For num_tasks it actually matches what we were already implementing, but for grainsize it does not (and even violates the old rule - without strict it requires that the number of iterations (unspecified which exactly) handled by each generated task is >= grainsize argument and < 2 * grainsize argument, with strict: it requires that each generated task handles exactly == grainsize argument iterations, except for the generated task handling the last iteration which can handles <= grainsize iterations). The following patch implements it for C and C++. 2021-08-23 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_GRAINSIZE_STRICT): Define. (OMP_CLAUSE_NUM_TASKS_STRICT): Define. * tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_GRAINSIZE, case OMP_CLAUSE_NUM_TASKS>: Print strict: modifier. * omp-expand.c (expand_task_call): Use GOMP_TASK_FLAG_STRICT in iflags if either grainsize or num_tasks clause has the strict modifier. gcc/c/ * c-parser.c (c_parser_omp_clause_num_tasks, c_parser_omp_clause_grainsize): Parse the optional strict: modifier. gcc/cp/ * parser.c (cp_parser_omp_clause_num_tasks, cp_parser_omp_clause_grainsize): Parse the optional strict: modifier. include/ * gomp-constants.h (GOMP_TASK_FLAG_STRICT): Define. libgomp/ * taskloop.c (GOMP_taskloop): Handle GOMP_TASK_FLAG_STRICT. * testsuite/libgomp.c-c++-common/taskloop-4.c (main): Fix up comment. * testsuite/libgomp.c-c++-common/taskloop-5.c: New test.	2021-08-23 10:16:24 +02:00
GCC Administrator	5b2876f96c	Daily bump.	2021-08-23 00:16:28 +00:00
Thomas Schwinge	a5416bf369	Make the OpenMP 'error' directive work for nvptx offloading ... and add a minimum amount of offloading testing. (Leaving aside that 'fwrite' to 'stderr' probably wouldn't work anyway) the 'fwrite' calls in 'libgomp/error.c:GOMP_warning', 'libgomp/error.c:GOMP_error' drag in 'isatty', which isn't provided by my nvptx newlib build at present, so we get, for example: [...] FAIL: libgomp.c/../libgomp.c-c++-common/declare_target-1.c (test for excess errors) Excess errors: unresolved symbol isatty mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status [...] ..., and many more. Fix up for recent commit `0d973c0a0d` "openmp: Implement the error directive". libgomp/ * config/nvptx/error.c (fwrite, exit): Override, too. * testsuite/libgomp.c-c++-common/error-1.c: Add a minimum amount of offloading testing. * testsuite/libgomp.fortran/error-1.f90: Likewise.	2021-08-22 11:08:26 +02:00
GCC Administrator	7c9e164583	Daily bump.	2021-08-21 00:16:29 +00:00
Tobias Burnus	77167196fe	Fortran: Add OpenMP's error directive Fortran part to the C/C++ implementation of commit r12-3040-g0d973c0a0d90a0a302e7eda1a4d9709be3c5b102 gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle 'at', 'severity' and 'message' clauses. (show_omp_node, show_code_node): Handle EXEC_OMP_ERROR. * gfortran.h (gfc_statement): Add ST_OMP_ERROR. (gfc_omp_severity_type, gfc_omp_at_type): New. (gfc_omp_clauses): Add 'at', 'severity' and 'message' clause; use more bitfields + ENUM_BITFIELD. (gfc_exec_op): Add EXEC_OMP_ERROR. * match.h (gfc_match_omp_error): New. * openmp.c (enum omp_mask1): Add OMP_CLAUSE_(AT,SEVERITY,MESSAGE). (gfc_match_omp_clauses): Handle new clauses. (OMP_ERROR_CLAUSES, gfc_match_omp_error): New. (resolve_omp_clauses): Resolve new clauses. (omp_code_to_statement, gfc_resolve_omp_directive): Handle EXEC_OMP_ERROR. * parse.c (decode_omp_directive, next_statement, gfc_ascii_statement): Handle 'omp error'. * resolve.c (gfc_resolve_blocks): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_omp_error): Likewise. (gfc_trans_omp_directive): Likewise. * trans.c (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/error-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/error-1.f90: New test. * gfortran.dg/gomp/error-2.f90: New test. * gfortran.dg/gomp/error-3.f90: New test.	2021-08-20 12:12:51 +02:00
Jakub Jelinek	0d973c0a0d	openmp: Implement the error directive This patch implements the error directive. Depending on clauses it is either a compile time diagnostics (in that case diagnosed right away) or runtime diagnostics (libgomp API call that diagnoses at runtime), and either fatal or warning (error or warning at compile time or fatal error vs. error at runtime) and either has no message or user supplied message (this kind of e.g. deprecated attribute). The directive is also stand-alone directive when at runtime while utility (thus disappears from the IL as if it wasn't there for parsing like nothing directive) at compile time. There are some clarifications in the works ATM, so this patch doesn't yet require that for compile time diagnostics the user message must be a constant string literal, there are uncertainities on what exactly is valid argument of message clause (whether just const char * type, convertible to const char , qualified/unqualified const char or char * or what else) and what to do in templates. Currently even in templates it is diagnosed right away for compile time diagnostics, if we'll need to substitute it, we'd need to queue something into the IL, have pt.c handle it and diagnose only later. 2021-08-20 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-builtins.def (BUILT_IN_GOMP_WARNING, BUILT_IN_GOMP_ERROR): New builtins. gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ERROR. * c-pragma.c (omp_pragmas): Add error directive. * c-omp.c (omp_directives): Uncomment error directive entry. gcc/c/ * c-parser.c (c_parser_omp_error): New function. (c_parser_pragma): Handle PRAGMA_OMP_ERROR. gcc/cp/ * parser.c (cp_parser_handle_statement_omp_attributes): Determine if PRAGMA_OMP_ERROR directive is C_OMP_DIR_STANDALONE. (cp_parser_omp_error): New function. (cp_parser_pragma): Handle PRAGMA_OMP_ERROR. gcc/fortran/ * types.def (BT_FN_VOID_CONST_PTR_SIZE): New DEF_FUNCTION_TYPE_2. * f95-lang.c (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST): Define. gcc/testsuite/ * c-c++-common/gomp/error-1.c: New test. * c-c++-common/gomp/error-2.c: New test. * c-c++-common/gomp/error-3.c: New test. * g++.dg/gomp/attrs-1.C (bar): Add error directive test. * g++.dg/gomp/attrs-2.C (bar): Add error directive test. * g++.dg/gomp/attrs-13.C: New test. * g++.dg/gomp/error-1.C: New test. libgomp/ * libgomp.map (GOMP_5.1): Add GOMP_error and GOMP_warning. * libgomp_g.h (GOMP_warning, GOMP_error): Declare. * error.c (GOMP_warning, GOMP_error): New functions. * testsuite/libgomp.c-c++-common/error-1.c: New test.	2021-08-20 11:36:52 +02:00
GCC Administrator	6e529985d8	Daily bump.	2021-08-19 00:16:42 +00:00
Tobias Burnus	76bb3c50dd	Fortran/OpenMP: Add memory routines existing for C/C++ This patch adds the Fortran interface for omp_alloc/omp_free and the omp_target_* memory routines, which were added in OpenMP 5.0 for C/C++ but only OpenMP 5.1 added them for Fortran. Those functions use BIND(C), i.e. on the libgomp side, the same interface as for C/C++ is used. Note: By using BIND(C) in omp_lib.h, files including this file no longer compiler with -std=f95 but require at least -std=f2003. libgomp/ChangeLog: * omp_lib.f90.in (omp_alloc, omp_free, omp_target_alloc, omp_target_free. omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr): Add interface. * omp_lib.h.in (omp_alloc, omp_free, omp_target_alloc, omp_target_free. omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr): Add interface. * testsuite/libgomp.fortran/alloc-1.F90: Remove local interface block for omp_alloc + omp_free. * testsuite/libgomp.fortran/alloc-4.f90: Likewise. * testsuite/libgomp.fortran/refcount-1.f90: New test. * testsuite/libgomp.fortran/target-12.f90: New test.	2021-08-18 11:15:47 +02:00
Jakub Jelinek	5079b7781a	openmp: Add nothing directive support As has been clarified, it is intentional that nothing directive is accepted in substatements of selection and looping statements and after labels and is handled as if the directive just isn't there, so that void foo (int x) { if (x) #pragma omp metadirective when (...:nothing) when (...:parallel) bar (); } behaves consistently; declarative and stand-alone directives aren't allowed at that point, but constructs are parsed with the following statement as the construct body and nothing or missing default on metadirective therefore should handle the following statement as part of the if substatement instead of having nothing as the substatement and bar done unconditionally after the if. 2021-08-18 Jakub Jelinek <jakub@redhat.com> gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_NOTHING. * c-pragma.c (omp_pragmas): Add nothing directive. * c-omp.c (omp_directives): Uncomment nothing directive entry. gcc/c/ * c-parser.c (c_parser_omp_nothing): New function. (c_parser_pragma): Handle PRAGMA_OMP_NOTHING. gcc/cp/ * parser.c (cp_parser_omp_nothing): New function. (cp_parser_pragma): Handle PRAGMA_OMP_NOTHING. gcc/testsuite/ * c-c++-common/gomp/nothing-1.c: New test. * g++.dg/gomp/attrs-1.C (bar): Add nothing directive test. * g++.dg/gomp/attrs-2.C (bar): Likewise. * g++.dg/gomp/attrs-9.C: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/nothing-1.c: New test.	2021-08-18 11:10:43 +02:00
GCC Administrator	2d14d64bf2	Daily bump.	2021-08-18 00:16:48 +00:00
Tobias Burnus	f8d535f3fe	Fortran: Implement OpenMP 5.1 scope construct Fortran version to commit `e45483c7c4`, which implemented OpenMP's scope construct for C and C++. Most testcases are based on the C testcases; it also contains some testcases which existed previously but had no Fortran equivalent. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OMP_SCOPE. * gfortran.h (enum gfc_statement): Add ST_OMP_(END_)SCOPE. (enum gfc_exec_op): Add EXEC_OMP_SCOPE. * match.h (gfc_match_omp_scope): New. * openmp.c (OMP_SCOPE_CLAUSES): Define (gfc_match_omp_scope): New. (gfc_match_omp_cancellation_point, gfc_match_omp_end_nowait): Improve error diagnostic. (omp_code_to_statement): Handle ST_OMP_SCOPE. (gfc_resolve_omp_directive): Handle EXEC_OMP_SCOPE. * parse.c (decode_omp_directive, next_statement, gfc_ascii_statement, parse_omp_structured_block, parse_executable): Handle OpenMP's scope construct. * resolve.c (gfc_resolve_blocks): Likewise * st.c (gfc_free_statement): Likewise * trans-openmp.c (gfc_trans_omp_scope): New. (gfc_trans_omp_directive): Call it. * trans.c (trans_code): handle EXEC_OMP_SCOPE. libgomp/ChangeLog: * testsuite/libgomp.fortran/scope-1.f90: New test. * testsuite/libgomp.fortran/task-reduction-16.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/scan-1.f90: * gfortran.dg/gomp/cancel-1.f90: New test. * gfortran.dg/gomp/cancel-4.f90: New test. * gfortran.dg/gomp/loop-4.f90: New test. * gfortran.dg/gomp/nesting-1.f90: New test. * gfortran.dg/gomp/nesting-2.f90: New test. * gfortran.dg/gomp/nesting-3.f90: New test. * gfortran.dg/gomp/nowait-1.f90: New test. * gfortran.dg/gomp/reduction-task-1.f90: New test. * gfortran.dg/gomp/reduction-task-2.f90: New test. * gfortran.dg/gomp/reduction-task-2a.f90: New test. * gfortran.dg/gomp/reduction-task-3.f90: New test. * gfortran.dg/gomp/scope-1.f90: New test. * gfortran.dg/gomp/scope-2.f90: New test.	2021-08-17 15:51:03 +02:00
Jakub Jelinek	e45483c7c4	openmp: Implement OpenMP 5.1 scope construct This patch implements the OpenMP 5.1 scope construct, which is similar to worksharing constructs in many regards, but isn't one of them. The body of the construct is encountered by all threads though, it can be nested in itself or intermixed with taskgroup and worksharing etc. constructs can appear inside of it (but it can't be nested in worksharing etc. constructs). The main purpose of the construct is to allow reductions (normal and task ones) without the need to close the parallel and reopen another one. If it doesn't have task reductions, it can be implemented without any new library support, with nowait it just does the privatizations at the start if any and reductions before the end of the body, with without nowait emits a normal GOMP_barrier{,_cancel} at the end too. For task reductions, we need to ensure only one thread initializes the task reduction library data structures and other threads copy from that, so a new GOMP_scope_start routine is added to the library for that. It acts as if the start of the scope construct is a nowait worksharing construct (that is ok, it can't be nested in other worksharing constructs and all threads need to encounter the start in the same order) which does the task reduction initialization, but as the body can have other scope constructs and/or worksharing constructs, that is all where we use this dummy worksharing construct. With task reductions, the construct must not have nowait and ends with a GOMP_barrier{,_cancel}, followed by task reductions followed by GOMP_workshare_task_reduction_unregister. Only C/C++ FE support is done. 2021-08-17 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.def (OMP_SCOPE): New tree code. * tree.h (OMP_SCOPE_BODY, OMP_SCOPE_CLAUSES): Define. * tree-nested.c (convert_nonlocal_reference_stmt, convert_local_reference_stmt, convert_gimple_call): Handle GIMPLE_OMP_SCOPE. * tree-pretty-print.c (dump_generic_node): Handle OMP_SCOPE. * gimple.def (GIMPLE_OMP_SCOPE): New gimple code. * gimple.c (gimple_build_omp_scope): New function. (gimple_copy): Handle GIMPLE_OMP_SCOPE. * gimple.h (gimple_build_omp_scope): Declare. (gimple_has_substatements): Handle GIMPLE_OMP_SCOPE. (gimple_omp_scope_clauses, gimple_omp_scope_clauses_ptr, gimple_omp_scope_set_clauses): New inline functions. (CASE_GIMPLE_OMP): Add GIMPLE_OMP_SCOPE. * gimple-pretty-print.c (dump_gimple_omp_scope): New function. (pp_gimple_stmt_1): Handle GIMPLE_OMP_SCOPE. * gimple-walk.c (walk_gimple_stmt): Likewise. * gimple-low.c (lower_stmt): Likewise. * gimplify.c (is_gimple_stmt): Handle OMP_MASTER. (gimplify_scan_omp_clauses): For task reductions, handle OMP_SCOPE like ORT_WORKSHARE constructs. Adjust diagnostics for %<scope%> allowing task reductions. Reject inscan reductions on scope. (omp_find_stores_stmt): Handle GIMPLE_OMP_SCOPE. (gimplify_omp_workshare, gimplify_expr): Handle OMP_SCOPE. * tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_SCOPE. (estimate_num_insns): Likewise. * omp-low.c (build_outer_var_ref): Look through GIMPLE_OMP_SCOPE contexts if var isn't privatized there. (check_omp_nesting_restrictions): Handle GIMPLE_OMP_SCOPE. (scan_omp_1_stmt): Likewise. (maybe_add_implicit_barrier_cancel): Look through outer scope constructs. (lower_omp_scope): New function. (lower_omp_task_reductions): Handle OMP_SCOPE. (lower_omp_1): Handle GIMPLE_OMP_SCOPE. (diagnose_sb_1, diagnose_sb_2): Likewise. * omp-expand.c (expand_omp_single): Support also GIMPLE_OMP_SCOPE. (expand_omp): Handle GIMPLE_OMP_SCOPE. (omp_make_gimple_edges): Likewise. * omp-builtins.def (BUILT_IN_GOMP_SCOPE_START): New built-in. gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_SCOPE. * c-pragma.c (omp_pragmas): Add scope construct. * c-omp.c (omp_directives): Uncomment scope directive entry. gcc/c/ * c-parser.c (OMP_SCOPE_CLAUSE_MASK): Define. (c_parser_omp_scope): New function. (c_parser_omp_construct): Handle PRAGMA_OMP_SCOPE. gcc/cp/ * parser.c (OMP_SCOPE_CLAUSE_MASK): Define. (cp_parser_omp_scope): New function. (cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_SCOPE. * pt.c (tsubst_expr): Handle OMP_SCOPE. gcc/testsuite/ * c-c++-common/gomp/nesting-2.c (foo): Add scope and masked construct tests. * c-c++-common/gomp/scan-1.c (f3): Add scope construct test.. * c-c++-common/gomp/cancel-1.c (f2): Add scope and masked construct tests. * c-c++-common/gomp/reduction-task-2.c (bar): Add scope construct test. Adjust diagnostics for the addition of scope. * c-c++-common/gomp/loop-1.c (f5): Add master, masked and scope construct tests. * c-c++-common/gomp/clause-dups-1.c (f1): Add scope construct test. * gcc.dg/gomp/nesting-1.c (f1, f2, f3): Add scope construct tests. * c-c++-common/gomp/scope-1.c: New test. * c-c++-common/gomp/scope-2.c: New test. * g++.dg/gomp/attrs-1.C (bar): Add scope construct tests. * g++.dg/gomp/attrs-2.C (bar): Likewise. * gfortran.dg/gomp/reduction4.f90: Adjust expected diagnostics. * gfortran.dg/gomp/reduction7.f90: Likewise. libgomp/ * Makefile.am (libgomp_la_SOURCES): Add scope.c * Makefile.in: Regenerated. * libgomp_g.h (GOMP_scope_start): Declare. * libgomp.map: Add GOMP_scope_start@@GOMP_5.1. * scope.c: New file. * testsuite/libgomp.c-c++-common/scope-1.c: New test. * testsuite/libgomp.c-c++-common/task-reduction-16.c: New test.	2021-08-17 09:30:09 +02:00
GCC Administrator	9d1d9fc8b4	Daily bump.	2021-08-17 00:16:32 +00:00
Thomas Schwinge	a2ab2f0dfb	Address '?:' issues in 'libgomp.oacc-c-c++-common/mode-transitions.c' [...]/libgomp.oacc-c-c++-common/mode-transitions.c: In function ‘t3’: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:127:43: warning: ‘?:’ using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context] 127 \| assert (arr[i] == ((i % 64) < 32) ? 1 : -1); \| ^ [...]/libgomp.oacc-c-c++-common/mode-transitions.c: In function ‘t9’: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:359:46: warning: ‘?:’ using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context] 359 \| assert (arr[i] == ((i % 3) == 0) ? 1 : 2); \| ^ ..., and PR101862 "[C, C++] Potential '?:' diagnostic for always-true expressions in boolean context". libgomp/ * testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Address '?:' issues.	2021-08-16 12:12:09 +02:00
Tobias Burnus	53d5b59cb3	Fortran/OpenMP: Add support for OpenMP 5.1 masked construct Commit r12-2891-gd0befed793b94f3f407be44e6f69f81a02f5f073 added C/C++ support for the masked construct. This patch extends it to Fortran. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle 'filter' clause. (show_omp_node, show_code_node): Handle (combined) omp masked construct. * frontend-passes.c (gfc_code_walker): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP__MASKED. (enum gfc_exec_op): Add EXEC_OMP__MASKED. * match.h (gfc_match_omp_masked, gfc_match_omp_masked_taskloop, gfc_match_omp_masked_taskloop_simd, gfc_match_omp_parallel_masked, gfc_match_omp_parallel_masked_taskloop, gfc_match_omp_parallel_masked_taskloop_simd): New prototypes. * openmp.c (enum omp_mask1): Add OMP_CLAUSE_FILTER. (gfc_match_omp_clauses): Match it. (OMP_MASKED_CLAUSES, gfc_match_omp_parallel_masked, gfc_match_omp_parallel_masked_taskloop, gfc_match_omp_parallel_masked_taskloop_simd, gfc_match_omp_masked, gfc_match_omp_masked_taskloop, gfc_match_omp_masked_taskloop_simd): New. (resolve_omp_clauses): Resolve filter clause. (gfc_resolve_omp_parallel_blocks, resolve_omp_do, omp_code_to_statement, gfc_resolve_omp_directive): Handle omp masked constructs. * parse.c (decode_omp_directive, case_exec_markers, gfc_ascii_statement, parse_omp_do, parse_omp_structured_block, parse_executable): Likewise. * resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_omp_clauses): Handle filter clause. (GFC_OMP_SPLIT_MASKED, GFC_OMP_MASK_MASKED): New enum values. (gfc_trans_omp_masked): New. (gfc_split_omp_clauses): Handle combined masked directives. (gfc_trans_omp_master_taskloop): Rename to ... (gfc_trans_omp_master_masked_taskloop): ... this; handle also combined masked directives. (gfc_trans_omp_parallel_master): Rename to ... (gfc_trans_omp_parallel_master_masked): ... this; handle combined masked directives. (gfc_trans_omp_directive): Handle EXEC_OMP__MASKED. * trans.c (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/masked-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/masked-1.f90: New test. * gfortran.dg/gomp/masked-2.f90: New test. * gfortran.dg/gomp/masked-3.f90: New test. * gfortran.dg/gomp/masked-combined-1.f90: New test. * gfortran.dg/gomp/masked-combined-2.f90: New test.	2021-08-16 09:26:26 +02:00
GCC Administrator	261512fa6d	Daily bump.	2021-08-14 00:16:29 +00:00
Thomas Schwinge	2cc65fcbd4	Adjust 'libgomp.oacc-c-c++-common/static-variable-1.c' ... for 'gcc/gimplify.c:gimplify_scan_omp_clauses' changes in recent commit `d0befed793` "openmp: Add support for OpenMP 5.1 masked construct". libgomp/ * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Adjust.	2021-08-13 22:53:58 +02:00
GCC Administrator	72be20e202	Daily bump.	2021-08-13 00:16:43 +00:00
Jakub Jelinek	d0befed793	openmp: Add support for OpenMP 5.1 masked construct This construct has been introduced as a replacement for master construct, but unlike that construct is slightly more general, has an optional clause which allows to choose which thread will be the one running the region, it can be some other thread than the master (primary) thread with number 0, or it could be no threads or multiple threads (then of course one needs to be careful about data races). It is way too early to deprecate the master construct though, we don't even have OpenMP 5.0 fully implemented, it has been deprecated in 5.1, will be also in 5.2 and removed in 6.0. But even then it will likely be a good idea to just -Wdeprecated warn about it and still accept it. The patch also contains something I should have done much earlier, for clauses that accept some integral expression where we only care about the value, forces during gimplification that value into either a min invariant (as before), SSA_NAME or a fresh temporary, but never e.g. a user VAR_DECL, so that for those clauses we don't need to worry about adjusting it. 2021-08-12 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.def (OMP_MASKED): New tree code. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FILTER. * tree.h (OMP_MASKED_BODY, OMP_MASKED_CLAUSES, OMP_MASKED_COMBINED, OMP_CLAUSE_FILTER_EXPR): Define. * tree.c (omp_clause_num_ops): Add OMP_CLAUSE_FILTER entry. (omp_clause_code_name): Likewise. (walk_tree_1): Handle OMP_CLAUSE_FILTER. * tree-nested.c (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Handle OMP_CLAUSE_FILTER. (convert_nonlocal_reference_stmt, convert_local_reference_stmt, convert_gimple_call): Handle GIMPLE_OMP_MASTER. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_FILTER. (dump_generic_node): Handle OMP_MASTER. * gimple.def (GIMPLE_OMP_MASKED): New gimple code. * gimple.c (gimple_build_omp_masked): New function. (gimple_copy): Handle GIMPLE_OMP_MASKED. * gimple.h (gimple_build_omp_masked): Declare. (gimple_has_substatements): Handle GIMPLE_OMP_MASKED. (gimple_omp_masked_clauses, gimple_omp_masked_clauses_ptr, gimple_omp_masked_set_clauses): New inline functions. (CASE_GIMPLE_OMP): Add GIMPLE_OMP_MASKED. * gimple-pretty-print.c (dump_gimple_omp_masked): New function. (pp_gimple_stmt_1): Handle GIMPLE_OMP_MASKED. * gimple-walk.c (walk_gimple_stmt): Likewise. * gimple-low.c (lower_stmt): Likewise. * gimplify.c (is_gimple_stmt): Handle OMP_MASTER. (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FILTER. For clauses that take one expression rather than decl or constant, force gimplification of that into a SSA_NAME or temporary unless min invariant. (gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_FILTER. (gimplify_expr): Handle OMP_MASKED. * tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_MASKED. (estimate_num_insns): Likewise. * omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FILTER. (check_omp_nesting_restrictions): Handle GIMPLE_OMP_MASKED. Adjust diagnostics for existence of masked construct. (scan_omp_1_stmt, lower_omp_master, lower_omp_1, diagnose_sb_1, diagnose_sb_2): Handle GIMPLE_OMP_MASKED. * omp-expand.c (expand_omp_synch, expand_omp, omp_make_gimple_edges): Likewise. gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_MASKED. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FILTER. * c-pragma.c (omp_pragmas_simd): Add masked construct. * c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_MASKED enumerator. (c_finish_omp_masked): Declare. * c-omp.c (c_finish_omp_masked): New function. (c_omp_split_clauses): Handle combined masked constructs. gcc/c/ * c-parser.c (c_parser_omp_clause_name): Parse filter clause name. (c_parser_omp_clause_filter): New function. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER. (OMP_MASKED_CLAUSE_MASK): Define. (c_parser_omp_masked): New function. (c_parser_omp_parallel): Handle parallel masked. (c_parser_omp_construct): Handle PRAGMA_OMP_MASKED. * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FILTER. gcc/cp/ * parser.c (cp_parser_omp_clause_name): Parse filter clause name. (cp_parser_omp_clause_filter): New function. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER. (OMP_MASKED_CLAUSE_MASK): Define. (cp_parser_omp_masked): New function. (cp_parser_omp_parallel): Handle parallel masked. (cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_MASKED. * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FILTER. * pt.c (tsubst_omp_clauses): Likewise. (tsubst_expr): Handle OMP_MASKED. gcc/testsuite/ * c-c++-common/gomp/clauses-1.c (bar): Add tests for combined masked constructs with clauses. * c-c++-common/gomp/clauses-5.c (foo): Add testcase for filter clause. * c-c++-common/gomp/clause-dups-1.c (f1): Likewise. * c-c++-common/gomp/masked-1.c: New test. * c-c++-common/gomp/masked-2.c: New test. * c-c++-common/gomp/masked-combined-1.c: New test. * c-c++-common/gomp/masked-combined-2.c: New test. * c-c++-common/goacc/uninit-if-clause.c: Remove xfails. * g++.dg/gomp/block-11.C: New test. * g++.dg/gomp/tpl-masked-1.C: New test. * g++.dg/gomp/attrs-1.C (bar): Add tests for masked construct and combined masked constructs with clauses in attribute syntax. * g++.dg/gomp/attrs-2.C (bar): Likewise. * gcc.dg/gomp/nesting-1.c (f1, f2): Add tests for masked construct nesting. * gfortran.dg/goacc/host_data-tree.f95: Allow also SSA_NAMEs in if clause. * gfortran.dg/goacc/kernels-tree.f95: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/masked-1.c: New test.	2021-08-12 22:41:17 +02:00
Tobias Burnus	432de08498	OpenMP 5.1: Add proc-bind 'primary' support In OpenMP 5.1 "master thread" was changed to "primary thread" and the proc_bind clause and the OMP_PROC_BIND environment variable now take 'primary' as argument as alias for 'master', while the latter is deprecated. This commit accepts 'primary' and adds the named constant omp_proc_bind_primary and changes 'master thread' in the documentation; however, given that not even OpenMP 5.0 is fully supported, omp_display_env and the dumps currently still output 'master' and there is no deprecation warning when using the 'master' in the proc_bind clause. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_proc_bind): Accept 'primary' as alias for 'master'. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_clause_proc_bind): Accept 'primary' as alias for 'master'. gcc/fortran/ChangeLog: * gfortran.h (gfc_omp_proc_bind_kind): Add OMP_PROC_BIND_PRIMARY. * dump-parse-tree.c (show_omp_clauses): Add TODO comment to change 'master' to 'primary' in proc_bind for OpenMP 5.1. * intrinsic.texi (OMP_LIB): Mention OpenMP 5.1; add omp_proc_bind_primary. * openmp.c (gfc_match_omp_clauses): Accept 'primary' as alias for 'master'. * trans-openmp.c (gfc_trans_omp_clauses): Handle OMP_PROC_BIND_PRIMARY. gcc/ChangeLog: * tree-core.h (omp_clause_proc_bind_kind): Add OMP_CLAUSE_PROC_BIND_PRIMARY. * tree-pretty-print.c (dump_omp_clause): Add TODO comment to change 'master' to 'primary' in proc_bind for OpenMP 5.1. libgomp/ChangeLog: * env.c (parse_bind_var): Accept 'primary' as alias for 'master'. (omp_display_env): Add TODO comment to change 'master' to 'primary' in proc_bind for OpenMP 5.1. * libgomp.texi: Change 'master thread' to 'primary thread' in line with OpenMP 5.1. (omp_get_proc_bind): Add omp_proc_bind_primary and note that omp_proc_bind_master is an alias of it. (OMP_PROC_BIND): Mention 'PRIMARY'. * omp.h.in (__GOMP_DEPRECATED_5_1): Define. (omp_proc_bind_primary): Add. (omp_proc_bind_master): Deprecate for OpenMP 5.1. * omp_lib.f90.in (omp_proc_bind_primary): Add. (omp_proc_bind_master): Deprecate for OpenMP 5.1. * omp_lib.h.in (omp_proc_bind_primary): Add. * testsuite/libgomp.c/affinity-1.c: Check that 'primary' works and is identical to 'master'. gcc/testsuite/ChangeLog: * c-c++-common/gomp/pr61486-2.c: Duplicate one proc_bind(master) testcase and test proc_bind(primary) instead. * gfortran.dg/gomp/affinity-1.f90: Likewise.	2021-08-12 15:49:49 +02:00
GCC Administrator	377681505f	Daily bump.	2021-08-10 00:16:28 +00:00
Julian Brown	c408512e1f	amdgcn: Enable OpenACC worker partitioning for AMD GCN gcc/ * config/gcn/gcn.c (gcn_init_builtins): Override decls for BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START, BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER. (gcn_goacc_validate_dims): Turn on worker partitioning unconditionally. (gcn_fork_join): Update comment. * config/gcn/gcn.opt (flag_worker_partitioning): Remove. (macc_experimental_workers): Remove unused option. libgomp/ * plugin/plugin-gcn.c (gcn_exec): Change default number of workers to 16. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c [acc_device_radeon]: Update. * testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c [ACC_DEVICE_TYPE_radeon]: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c [acc_device_radeon]: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c [ACC_DEVICE_TYPE_radeon]: Likewise. * testsuite/libgomp.oacc-fortran/optional-reduction.f90: XFAIL for 'openacc_radeon_accel_selected' and '-O0'. * testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise. Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>	2021-08-09 15:08:44 +02:00
GCC Administrator	8ebf4fb54a	Daily bump.	2021-08-06 00:16:29 +00:00
Chung-Lin Tang	0bac793ed6	openmp: Implement omp_get_device_num routine This patch implements the omp_get_device_num library routine, specified in OpenMP 5.0. GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number" variable, is defined on the device-side libgomp, has it's address returned to host-side libgomp during device initialization, and the host libgomp then sets its value to the designated device number. libgomp/ChangeLog: * icv-device.c (omp_get_device_num): New API function, host side. * fortran.c (omp_get_device_num_): New interface function. * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol. * libgomp.map (OMP_5.0.2): New version space with omp_get_device_num, omp_get_device_num_. * libgomp.texi (omp_get_device_num): Add documentation for new API function. * omp.h.in (omp_get_device_num): Add declaration. * omp_lib.f90.in (omp_get_device_num): Likewise. * omp_lib.h.in (omp_get_device_num): Likewise. * target.c (gomp_load_image_to_device): If additional entry for device number exists at end of returned entries from 'load_image_func' hook, copy the assigned device number over to the device variable. * config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-gcn.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-nvptx.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * testsuite/lib/libgomp.exp (check_effective_target_offload_target_intelmic): New function for testing for intelmic offloading. * testsuite/libgomp.c-c++-common/target-45.c: New test. * testsuite/libgomp.fortran/target10.f90: New test.	2021-08-05 23:29:03 +08:00
Martin Liska	872c1a56e3	ChangeLog: add problematic commit `2e96b5f14e`. gcc/ChangeLog: * ChangeLog: Add manually. libgomp/ChangeLog: * ChangeLog: Add manually. gcc/testsuite/ChangeLog: * ChangeLog: Add manually.	2021-08-03 09:57:21 +02:00
GCC Administrator	4d17ca1bc7	Daily bump.	2021-08-03 07:49:16 +00:00
Thomas Schwinge	28665ddc7e	[libgomp] Restore offloading 'libgomp/fortran.c' GCN: ld: error: undefined symbol: gomp_ialias_omp_display_env >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744) >>> fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a >>> referenced by fortran.c:744 ([...]/source-gcc/libgomp/fortran.c:744) >>> fortran.o:(omp_display_env_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750) >>> fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a >>> referenced by fortran.c:750 ([...]/source-gcc/libgomp/fortran.c:750) >>> fortran.o:(omp_display_env_8_) in archive [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libgomp/.libs/libgomp.a collect2: error: ld returned 1 exit status mkoffload: fatal error: build-gcc/gcc/x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc returned 1 exit status nvptx: unresolved symbol omp_display_env collect2: error: ld returned 1 exit status mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status Fix-up for commit `7123ae2455` "Implement OpenMP 5.1 section 3.15: omp_display_env". libgomp/ * fortran.c (omp_display_env_, omp_display_env_8_): Only '#ifndef LIBGOMP_OFFLOADED_ONLY'. Co-Authored-By: Ulrich Drepper <drepper@redhat.com>	2021-07-30 12:02:15 +02:00
Thomas Schwinge	0829ab79d3	[OpenACC] Extract 'pass_oacc_loop_designation' out of 'pass_oacc_device_lower' This really is a separate step -- and another pass to be added between the two, later on. gcc/ * omp-offload.c (oacc_loop_xform_head_tail, oacc_loop_process): 'update_stmt' after modification. (pass_oacc_loop_designation): New function, extracted out of... (pass_oacc_device_lower): ... this. (pass_data_oacc_loop_designation, pass_oacc_loop_designation) (make_pass_oacc_loop_designation): New * passes.def: Add it. * tree-parloops.c (create_parallel_loop): Adjust. * tree-pass.h (make_pass_oacc_loop_designation): New. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: 's%oaccdevlow%oaccloops%g'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine-nohost.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/routine-nohost-1.c: Likewise. * g++.dg/goacc/template.C: Likewise. * gcc.dg/goacc/loop-processing-1.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine-nohost.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: 's%oaccdevlow%oaccloops%g'. * testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise. * testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise. Co-Authored-By: Julian Brown <julian@codesourcery.com> Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>	2021-07-29 09:19:44 +02:00
Aldy Hernandez	2e96b5f14e	Backwards jump threader rewrite with ranger. This is a rewrite of the backwards threader with a ranger based solver. The code is divided into two parts: the path solver in gimple-range-path., and the path discovery bits in tree-ssa-threadbackward.c. The legacy code is still available with --param=threader-mode=legacy, but will be removed shortly after. gcc/ChangeLog: Makefile.in (tree-ssa-loop-im.o-warn): New. * flag-types.h (enum threader_mode): New. * params.opt: Add entry for --param=threader-mode. * tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New. (class back_threader): New. (back_threader::back_threader): New. (back_threader::~back_threader): New. (back_threader::maybe_register_path): New. (back_threader::find_taken_edge): New. (back_threader::find_taken_edge_switch): New. (back_threader::find_taken_edge_cond): New. (back_threader::resolve_def): New. (back_threader::resolve_phi): New. (back_threader::find_paths_to_names): New. (back_threader::find_paths): New. (dump_path): New. (debug): New. (thread_jumps::find_jump_threads_backwards): Call ranger threader. (thread_jumps::find_jump_threads_backwards_with_ranger): New. (pass_thread_jumps::execute): Abstract out code... (try_thread_blocks): ...here. * tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges): Abstract out threading candidate code to... (single_succ_to_potentially_threadable_block): ...here. * tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block): New. * tree-ssa-threadupdate.c (register_jump_thread): Return boolean. * tree-ssa-threadupdate.h (class jump_thread_path_registry): Return bool from register_jump_thread. libgomp/ChangeLog: * testsuite/libgomp.graphite/force-parallel-4.c: Adjust for threader. * testsuite/libgomp.graphite/force-parallel-8.c: Same. gcc/testsuite/ChangeLog: * g++.dg/debug/dwarf2/deallocator.C: Adjust for threader. * gcc.c-torture/compile/pr83510.c: Same. * dg.dg/analyzer/pr94851-2.c: Same. * gcc.dg/loop-unswitch-2.c: Same. * gcc.dg/old-style-asm-1.c: Same. * gcc.dg/pr68317.c: Same. * gcc.dg/pr97567-2.c: Same. * gcc.dg/predict-9.c: Same. * gcc.dg/shrink-wrap-loop.c: Same. * gcc.dg/sibcall-1.c: Same. * gcc.dg/tree-ssa/builtin-sprintf-3.c: Same. * gcc.dg/tree-ssa/pr21001.c: Same. * gcc.dg/tree-ssa/pr21294.c: Same. * gcc.dg/tree-ssa/pr21417.c: Same. * gcc.dg/tree-ssa/pr21458-2.c: Same. * gcc.dg/tree-ssa/pr21563.c: Same. * gcc.dg/tree-ssa/pr49039.c: Same. * gcc.dg/tree-ssa/pr61839_1.c: Same. * gcc.dg/tree-ssa/pr61839_3.c: Same. * gcc.dg/tree-ssa/pr77445-2.c: Same. * gcc.dg/tree-ssa/split-path-4.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same. * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. * gcc.dg/tree-ssa/ssa-fre-48.c: Same. * gcc.dg/tree-ssa/ssa-thread-11.c: Same. * gcc.dg/tree-ssa/ssa-thread-12.c: Same. * gcc.dg/tree-ssa/ssa-thread-14.c: Same. * gcc.dg/tree-ssa/vrp02.c: Same. * gcc.dg/tree-ssa/vrp03.c: Same. * gcc.dg/tree-ssa/vrp05.c: Same. * gcc.dg/tree-ssa/vrp06.c: Same. * gcc.dg/tree-ssa/vrp07.c: Same. * gcc.dg/tree-ssa/vrp09.c: Same. * gcc.dg/tree-ssa/vrp19.c: Same. * gcc.dg/tree-ssa/vrp20.c: Same. * gcc.dg/tree-ssa/vrp33.c: Same. * gcc.dg/uninit-pred-9_b.c: Same. * gcc.dg/uninit-pr61112.c: Same. * gcc.dg/vect/bb-slp-16.c: Same. * gcc.target/i386/avx2-vect-aggressive.c: Same. * gcc.dg/tree-ssa/ranger-threader-1.c: New test. * gcc.dg/tree-ssa/ranger-threader-2.c: New test. * gcc.dg/tree-ssa/ranger-threader-3.c: New test. * gcc.dg/tree-ssa/ranger-threader-4.c: New test. * gcc.dg/tree-ssa/ranger-threader-5.c: New test.	2021-07-29 08:24:50 +02:00

1 2 3 4 5 ...

1717 Commits