mirrors/gcc

mirror of https://gcc.gnu.org/git/gcc.git synced 2024-11-23 19:03:59 +08:00

Author	SHA1	Message	Date
Thomas Schwinge	7ab75a6e6d	Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation Fix-up for recent commit `0b1ce70a81` "libgomp: Fix reverse offload issues". libgomp/ * testsuite/libgomp.fortran/reverse-offload-6.f90: Fix nvptx offloading compilation.	2023-02-07 23:44:33 +01:00
GCC Administrator	49e52115b0	Daily bump.	2023-02-04 00:16:24 +00:00
Tobias Burnus	0b1ce70a81	libgomp: Fix reverse offload issues If there is nothing to map, skip the mapping and avoid attempting to copy 0 bytes from addrs, sizes and kinds. Additionally, it could happen that a non-allocated address was deallocated, such as a pointer set, leading to a free for the actual data. libgomp/ * target.c (gomp_target_rev): Handle mapnum == 0 and avoid freeing not allocated memory. * testsuite/libgomp.fortran/reverse-offload-6.f90: New test.	2023-02-03 11:31:53 +01:00
Tobias Burnus	f84fdb134d	libgomp: enable reverse offload for AMDGCN libgomp/ChangeLog: * libgomp.texi (5.0 Impl. Status, gcn specifics): Update for reverse offload. * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Accept reverse-offload requirement.	2023-02-03 08:33:17 +01:00
GCC Administrator	a37a0cb303	Daily bump.	2023-02-03 00:16:44 +00:00
Andrew Stubbs	f6fff8a6fc	amdgcn, libgomp: Manually allocated stacks Switch from using stacks in the "private segment" to using a memory block allocated on the host side. The primary reason is to permit the reverse offload implementation to access values located on the device stack, but there may also be performance benefits, especially with repeated kernel invocations. This implementation unifies the stacks with the "team arena" optimization feature, and now allows both to have run-time configurable sizes. A new ABI is needed, so all libraries must be rebuilt, and newlib must be version 4.3.0.20230120 or newer. gcc/ChangeLog: * config/gcn/gcn-run.cc: Include libgomp-gcn.h. (struct kernargs): Replace the common content with kernargs_abi. (struct heap): Delete. (main): Read GCN_STACK_SIZE envvar. Allocate space for the device stacks. Write the new kernargs fields. * config/gcn/gcn.cc (gcn_option_override): Remove stack_size_opt. (default_requested_args): Remove PRIVATE_SEGMENT_BUFFER_ARG and PRIVATE_SEGMENT_WAVE_OFFSET_ARG. (gcn_addr_space_convert): Mask the QUEUE_PTR_ARG content. (gcn_expand_prologue): Move the TARGET_PACKED_WORK_ITEMS to the top. Set up the stacks from the values in the kernargs, not private. (gcn_expand_builtin_1): Match the stack configuration in the prologue. (gcn_hsa_declare_function_name): Turn off the private segment. (gcn_conditional_register_usage): Ensure QUEUE_PTR is fixed. * config/gcn/gcn.h (FIXED_REGISTERS): Fix the QUEUE_PTR register. * config/gcn/gcn.opt (mstack-size): Change the description. include/ChangeLog: * gomp-constants.h (GOMP_VERSION_GCN): Bump. libgomp/ChangeLog: * config/gcn/libgomp-gcn.h (DEFAULT_GCN_STACK_SIZE): New define. (DEFAULT_TEAM_ARENA_SIZE): New define. (struct heap): Move to this file. (struct kernargs_abi): Likewise. * config/gcn/team.c (gomp_gcn_enter_kernel): Use team arena size from the kernargs. * libgomp.h: Include libgomp-gcn.h. (TEAM_ARENA_SIZE): Remove. (team_malloc): Update the error message. * plugin/plugin-gcn.c (struct kernargs): Move common content to struct kernargs_abi. (struct agent_info): Rename team arenas to ephemeral memories. (struct team_arena_list): Rename .... (struct ephemeral_memories_list): to this. (struct heap): Delete. (team_arena_size): New variable. (stack_size): New variable. (print_kernel_dispatch): Update debug messages. (init_environment_variables): Read GCN_TEAM_ARENA_SIZE. Read GCN_STACK_SIZE. (get_team_arena): Rename ... (configure_ephemeral_memories): ... to this, and set up stacks. (release_team_arena): Rename ... (release_ephemeral_memories): ... to this. (destroy_team_arenas): Rename ... (destroy_ephemeral_memories): ... to this. (create_kernel_dispatch): Add num_threads parameter. Adjust for kernargs_abi refactor and ephemeral memories. (release_kernel_dispatch): Adjust for ephemeral memories. (run_kernel): Pass thread-count to create_kernel_dispatch. (GOMP_OFFLOAD_init_device): Adjust for ephemeral memories. (GOMP_OFFLOAD_fini_device): Adjust for ephemeral memories. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr47237.c: Xfail on amdgcn. * gcc.dg/builtin-apply3.c: Xfail for amdgcn. * gcc.dg/builtin-apply4.c: Xfail for amdgcn. * gcc.dg/torture/stackalign/builtin-apply-3.c: Xfail for amdgcn. * gcc.dg/torture/stackalign/builtin-apply-4.c: Xfail for amdgcn.	2023-02-02 11:47:03 +00:00
Tobias Burnus	8da7476c5f	libgomp.texi (OpenMP TR11 impl. status): Fix 'strict' item Fix the 'strict' modifier status: it is already listed (as 'Y') for OpenMP 5.1 for num_task and grainsize; only strict on num_threads is new with TR11. libgomp/ * libgomp.texi (OpenMP TR11): Fix item for 'strict' modifier.	2023-02-02 12:05:58 +01:00
GCC Administrator	0a251e7497	Daily bump.	2023-02-02 00:17:43 +00:00
Tobias Burnus	bf2cf6f3f1	Fortran: Extend align-clause checks of OpenMP's allocate directive gcc/fortran/ChangeLog: * openmp.cc (resolve_omp_clauses): Check also for power of two. libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-3.f90: Fix ALIGN usage, remove unused -fdump-tree-original. * testsuite/libgomp.fortran/allocate-4.f90: New.	2023-02-01 14:51:00 +01:00
Tobias Burnus	eda38850a7	libgomp.texi: Reverse-offload updates libgomp/ * libgomp.texi (5.0 Impl. Status): Update 'requires' and 'ancestor'. (GCN): Add item about 'omp requires'. (nvptx): Likewise; add item about reverse offload.	2023-02-01 12:19:27 +01:00
GCC Administrator	338eb0f04b	Daily bump.	2023-01-28 00:16:39 +00:00
Tobias Burnus	2325c8920b	OpenMP/Fortran: Fix has_device_addr clause splitting [PR108558] gcc/fortran/ChangeLog: PR fortran/108558 * trans-openmp.cc (gfc_split_omp_clauses): Handle has_device_addr. libgomp/ChangeLog: PR fortran/108558 * testsuite/libgomp.fortran/has_device_addr.f90: New test.	2023-01-27 11:33:46 +01:00
GCC Administrator	607f278a35	Daily bump.	2023-01-24 00:17:23 +00:00
Tobias Burnus	20552407ae	libgomp.texi: Impl. status - non-rect loop nest only partial libgomp/ * libgomp.texi (OpenMP 5.0): Set non-rectangular loop nest back to 'P' as Fortran support is incomplete.	2023-01-23 09:40:41 +01:00
GCC Administrator	0846336de5	Daily bump.	2023-01-20 00:17:40 +00:00
Jakub Jelinek	46644ec99c	openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459] expand_omp_for_init_counts was using for the case where collapse(2) inner loop has init expression dependent on non-constant multiple of the outer iterator and the condition upper bound expression doesn't depend on the outer iterator fold_unary (NEGATE_EXPR, ...). This will just return NULL if it can't be folded, we need fold_build1 instead. 2023-01-19 Jakub Jelinek <jakub@redhat.com> PR middle-end/108459 * omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather than fold_unary for NEGATE_EXPR. * testsuite/libgomp.c/pr108459.c: New test.	2023-01-19 21:00:08 +01:00
GCC Administrator	8d07b193d7	Daily bump.	2023-01-18 00:17:21 +00:00
Martin Liska	42bf66e4a7	Regenerate Makefile.in files. libbacktrace/ChangeLog: * Makefile.in: Regenerate. libgomp/ChangeLog: * Makefile.in: Regenerate. * configure: Regenerate. libphobos/ChangeLog: * Makefile.in: Regenerate. * libdruntime/Makefile.in: Regenerate. libstdc++-v3/ChangeLog: * src/libbacktrace/Makefile.in: Regenerate.	2023-01-17 12:20:05 +01:00
Jakub Jelinek	83ffe9cde7	Update copyright years.	2023-01-16 11:52:17 +01:00
GCC Administrator	d901bf8a44	Daily bump.	2023-01-08 00:16:59 +00:00
LIU Hao	902c755930	Always define `WIN32_LEAN_AND_MEAN` before <windows.h> Recently, mingw-w64 has got updated <msxml.h> from Wine which is included indirectly by <windows.h> if `WIN32_LEAN_AND_MEAN` is not defined. The `IXMLDOMDocument` class has a member function named `abort()`, which gets affected by our `abort()` macro in "system.h". `WIN32_LEAN_AND_MEAN` should, nevertheless, always be defined. This can exclude 'APIs such as Cryptography, DDE, RPC, Shell, and Windows Sockets' [1], and speed up compilation of these files a bit. [1] https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers gcc/ PR middle-end/108300 * config/xtensa/xtensa-dynconfig.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. * diagnostic-color.cc: Likewise. * plugin.cc: Likewise. * prefix.cc: Likewise. gcc/ada/ PR middle-end/108300 * adaint.c: Define `WIN32_LEAN_AND_MEAN` before `#include <windows.h>`. * cio.c: Likewise. * ctrl_c.c: Likewise. * expect.c: Likewise. * gsocket.h: Likewise. * mingw32.h: Likewise. * mkdir.c: Likewise. * rtfinal.c: Likewise. * rtinit.c: Likewise. * seh_init.c: Likewise. * sysdep.c: Likewise. * terminals.c: Likewise. * tracebak.c: Likewise. gcc/jit/ PR middle-end/108300 * jit-w32.h: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libatomic/ PR middle-end/108300 * config/mingw/lock.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libffi/ PR middle-end/108300 * src/aarch64/ffi.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libgcc/ PR middle-end/108300 * config/i386/enable-execute-stack-mingw32.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. * libgcc2.c: Likewise. * unwind-generic.h: Likewise. libgfortran/ PR middle-end/108300 * intrinsics/sleep.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libgomp/ PR middle-end/108300 * config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libiberty/ PR middle-end/108300 * make-temp-file.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. * pex-win32.c: Likewise. libssp/ PR middle-end/108300 * ssp.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. libstdc++-v3/ PR middle-end/108300 * src/c++11/system_error.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. * src/c++11/thread.cc: Likewise. * src/c++17/fs_ops.cc: Likewise. * src/filesystem/ops.cc: Likewise. libvtv/ PR middle-end/108300 * vtv_malloc.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>. * vtv_rts.cc: Likewise. * vtv_utils.cc: Likewise.	2023-01-07 06:51:06 +00:00
GCC Administrator	53ef7c1d9a	Daily bump.	2023-01-06 00:17:35 +00:00
Jakub Jelinek	29c3218618	openmp: Fix up finish_omp_target_clauses [PR108286] The comment in the loop says that we shouldn't add a map clause if such a clause exists already, but the loop was actually using OMP_CLAUSE_DECL on any clause. Target construct can have various clauses which don't have OMP_CLAUSE_DECL at all (e.g. nowait, device or if) or clause where it means something different (e.g. privatization clauses, allocate, depend). So, only check OMP_CLAUSE_DECL on OMP_CLAUSE_MAP clauses. 2023-01-05 Jakub Jelinek <jakub@redhat.com> PR c++/108286 * semantics.cc (finish_omp_target_clauses): Ignore clauses other than OMP_CLAUSE_MAP. * testsuite/libgomp.c++/pr108286.C: New test.	2023-01-05 11:57:30 +01:00
GCC Administrator	fee53a3194	Daily bump.	2023-01-03 00:17:09 +00:00
Jakub Jelinek	74d5206fb6	Update copyright dates. Manual part of copyright year updates. 2023-01-02 Jakub Jelinek <jakub@redhat.com> gcc/ * gcc.cc (process_command): Update copyright notice dates. * gcov-dump.cc (print_version): Ditto. * gcov.cc (print_version): Ditto. * gcov-tool.cc (print_version): Ditto. * gengtype.cc (create_file): Ditto. * doc/cpp.texi: Bump @copying's copyright year. * doc/cppinternals.texi: Ditto. * doc/gcc.texi: Ditto. * doc/gccint.texi: Ditto. * doc/gcov.texi: Ditto. * doc/install.texi: Ditto. * doc/invoke.texi: Ditto. gcc/ada/ * gnat_ugn.texi: Bump @copying's copyright year. * gnat_rm.texi: Likewise. gcc/d/ * gdc.texi: Bump @copyrights-d year. gcc/fortran/ * gfortranspec.cc (lang_specific_driver): Update copyright notice dates. * gfc-internals.texi: Bump @copying's copyright year. * gfortran.texi: Ditto. * intrinsic.texi: Ditto. * invoke.texi: Ditto. gcc/go/ * gccgo.texi: Bump @copyrights-go year. libgomp/ * libgomp.texi: Bump @copying's copyright year. libitm/ * libitm.texi: Bump @copying's copyright year. libquadmath/ * libquadmath.texi: Bump @copying's copyright year.	2023-01-02 09:26:59 +01:00
Jakub Jelinek	68127a8e87	Update Copyright year in ChangeLog files 2022 -> 2023	2023-01-02 09:23:36 +01:00
GCC Administrator	de282a2012	Daily bump.	2022-12-22 00:17:29 +00:00
Chung-Lin Tang	fdc7469cf5	nvptx: reimplement libgomp barriers [PR99555] Instead of trying to have the GPU do CPU-with-OS-like things, this new barriers implementation for NVPTX uses simplistic bar.* synchronization instructions. Tasks are processed after threads have joined, and only if team->task_count != 0 It is noted that: there might be a little bit of performance forfeited for cases where earlier arriving threads could've been used to process tasks ahead of other threads, but that has the requirement of implementing complex futex-wait/wake like behavior, which is what we're try to avoid with this patch. It is deemed that task processing is not what GPU target offloading is usually used for. Implementation highlight notes: 1. gomp_team_barrier_wake() is now an empty function (threads never "wake" in the usual manner) 2. gomp_team_barrier_cancel() now uses the "exit" PTX instruction. 3. gomp_barrier_wait_last() now is implemented using "bar.arrive" 4. gomp_team_barrier_wait_end()/gomp_team_barrier_wait_cancel_end(): The main synchronization is done using a 'bar.red' instruction. This reduces across all threads the condition (team->task_count != 0), to enable the task processing down below if any thread created a task. (this bar.red usage means that this patch is dependent on the prior NVPTX bar.red GCC patch) PR target/99555 libgomp/ChangeLog: * config/nvptx/bar.c (generation_to_barrier): Remove. (futex_wait,futex_wake,do_spin,do_wait): Remove. (GOMP_WAIT_H): Remove. (#include "../linux/bar.c"): Remove. (gomp_barrier_wait_end): New function. (gomp_barrier_wait): Likewise. (gomp_barrier_wait_last): Likewise. (gomp_team_barrier_wait_end): Likewise. (gomp_team_barrier_wait): Likewise. (gomp_team_barrier_wait_final): Likewise. (gomp_team_barrier_wait_cancel_end): Likewise. (gomp_team_barrier_wait_cancel): Likewise. (gomp_team_barrier_cancel): Likewise. * config/nvptx/bar.h (gomp_barrier_t): Remove waiters, lock fields. (gomp_barrier_init): Remove init of waiters, lock fields. (gomp_team_barrier_wake): Remove prototype, add new static inline function.	2022-12-21 05:58:49 -08:00
Jakub Jelinek	1119902b6c	openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180] DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR of this->field used just during gimplification and omp lowering/expansion to privatize individual fields in methods when needed. As the following testcase shows, when not in templates, they were handled right, but in templates we actually called cp_finish_decl on them and that can result in their destruction, which is obviously undesirable, we should only destruct the privatized copies of them created in omp lowering. Fixed thusly. 2022-12-21 Jakub Jelinek <jakub@redhat.com> PR c++/108180 * pt.cc (tsubst_expr): Don't call cp_finish_decl on DECL_OMP_PRIVATIZED_MEMBER vars. * testsuite/libgomp.c++/pr108180.C: New test.	2022-12-21 09:10:03 +01:00
GCC Administrator	5fb1e67453	Daily bump.	2022-12-17 00:17:56 +00:00
Tobias Burnus	18af26fc37	Remove libgomp/testsuite/libgomp.fortran/allocate-4.f90 [PR108056] Commit r13-4716-ge205ec03f0794aeac3e8a89e947c12624d5a274e accidentally included a testcase of another patch that is pending review: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html libgomp/ PR libfortran/108056 * testsuite/libgomp.fortran/allocate-4.f90: Remove accidentally added file.	2022-12-16 08:56:03 +01:00
GCC Administrator	c8f767b2c0	Daily bump.	2022-12-16 00:17:46 +00:00
Tobias Burnus	e205ec03f0	libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056] Since GCC 12, the conversion between the array descriptors formats - the internal (GFC) and the C binding one (CFI) - moved to the compiler itself such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only used with older code (GCC 9 to 11). The newly added checks caused asserts as older code did not pass the proper values (e.g. real(4) as effective argument arrived as BT_ASSUME type as the effective type got lost inbetween). As proposed in the PR, revert to the GCC 11 version - known bugs is better than some fixes and new issues. Still, GCC 12 is much better in terms of TS29113 support and should really be used. This patch uses the current libgomp version of the GCC 11 branch, except it fixes the GFC version number (which is 0), uses calloc instead of malloc, and sets the lower bound to 1 instead of keeping it as is for CFI_attribute_other. libgfortran/ChangeLog: PR libfortran/108056 * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc, gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for those backward-compatiblity-only functions.	2022-12-15 12:26:06 +01:00
GCC Administrator	26f4aefaeb	Daily bump.	2022-12-15 00:17:29 +00:00
Julian Brown	9316ad3b43	OpenMP/Fortran: Combined directives with map/firstprivate of same symbol This patch fixes a case where a combined directive (e.g. "!$omp target parallel ...") contains both a map and a firstprivate clause for the same variable. When the combined directive is split into two nested directives, the outer "target" gets the "map" clause, and the inner "parallel" gets the "firstprivate" clause, like so: !$omp target parallel map(x) firstprivate(x) --> !$omp target map(x) !$omp parallel firstprivate(x) ... When there is no map of the same variable, the firstprivate is distributed to both directives, e.g. for 'y' in: !$omp target parallel map(x) firstprivate(y) --> !$omp target map(x) firstprivate(y) !$omp parallel firstprivate(y) ... This is not a recent regression, but appear to fix a long-standing ICE. (The included testcase is based on one by Tobias.) 2022-12-06 Julian Brown <julian@codesourcery.com> gcc/fortran/ * trans-openmp.cc (gfc_add_firstprivate_if_unmapped): New function. (gfc_split_omp_clauses): Call above. libgomp/ * testsuite/libgomp.fortran/combined-directive-splitting-1.f90: New test.	2022-12-14 14:11:45 +00:00
GCC Administrator	c6b12b802c	Daily bump.	2022-12-11 00:17:43 +00:00
Tobias Burnus	ea4b23d9c8	libgomp: Handle OpenMP's reverse offloads This commit enabled reverse offload for nvptx such that gomp_target_rev actually gets called. And it fills the latter function to do all of the following: finding the host function to the device func ptr and copying the arguments to the host, processing the mapping/firstprivate, calling the host function, copying back the data and freeing as needed. The data handling is made easier by assuming that all host variables either existed before (and are in the mapping) or that those are devices variables not yet available on the host. Thus, the reverse mapping can do without refcounts etc. Note that the spec disallows inside a target region device-affecting constructs other than target plus ancestor device-modifier and it also limits the clauses permitted on this construct. For the function addresses, an additional splay tree is used; for the lookup of mapped variables, the existing splay-tree is used. Unfortunately, its data structure requires a full walk of the tree; Additionally, the just mapped variables are recorded in a separate data structure an extra lookup. While the lookup is slow, assuming that only few variables get mapped in each reverse offload construct and that reverse offload is the exception and not performance critical, this seems to be acceptable. libgomp/ChangeLog: * libgomp.h (struct target_mem_desc): Predeclare; move below after 'reverse_splay_tree_node' and add rev_array member. (struct reverse_splay_tree_key_s, reverse_splay_compare): New. (reverse_splay_tree_node, reverse_splay_tree, reverse_splay_tree_key): New typedef. (struct gomp_device_descr): Add mem_map_rev member. * oacc-host.c (host_dispatch): NULL init .mem_map_rev. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support for GOMP_REQUIRES_REVERSE_OFFLOAD. * splay-tree.h (splay_tree_callback_stop): New typedef; like splay_tree_callback but returning int not void. (splay_tree_foreach_lazy): Define; like splay_tree_foreach but taking splay_tree_callback_stop as argument. * splay-tree.c (splay_tree_foreach_internal_lazy, splay_tree_foreach_lazy): New; but early exit if callback returns nonzero. * target.c: Instatiate splay_tree_c with splay_tree_prefix 'reverse'. (gomp_map_lookup_rev): New. (gomp_load_image_to_device): Handle reverse-offload function lookup table. (gomp_unload_image_from_device): Free devicep->mem_map_rev. (struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup, gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int, gomp_map_cdata_lookup): New auxiliary structs and functions for gomp_target_rev. (gomp_target_rev): Implement reverse offloading and its mapping. (gomp_target_init): Init current_device.mem_map_rev.root. * testsuite/libgomp.fortran/reverse-offload-2.f90: New test. * testsuite/libgomp.fortran/reverse-offload-3.f90: New test. * testsuite/libgomp.fortran/reverse-offload-4.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without mapping of on-device allocated variables.	2022-12-10 13:42:08 +01:00
GCC Administrator	40ce6485f3	Daily bump.	2022-12-10 00:17:39 +00:00
Tobias Burnus	b2e1c49b4a	Fortran/OpenMP: align/allocator modifiers to the allocate clause gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE output. * gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'. (gfc_free_omp_namelist): Add bool arg. * match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'. * openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction, gfc_match_omp_flush): Update call. (gfc_match_omp_clauses): Match 'align/allocate modifers in 'allocate' clause. (resolve_omp_clauses): Resolve align. * st.cc (gfc_free_statement): Update call * trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'. libgomp/ChangeLog: * libgomp.texi (5.1 Impl. Status): Split allocate clause/directive item about 'align'; mark clause as 'Y' and directive as 'N'. * testsuite/libgomp.fortran/allocate-2.f90: New test. * testsuite/libgomp.fortran/allocate-3.f90: New test.	2022-12-09 21:45:37 +01:00
GCC Administrator	3fe66f7f9f	Daily bump.	2022-12-07 00:18:44 +00:00
Marcel Vollweiler	81476bc4f4	OpenMP: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices This patch adds support for omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of device-specific ICV values (specified as environment variables or changed on a device). In order to reuse device-specific ICV values, a copy back mechanism is implemented that copies ICV values back from device to the host. Additionally, a limitation of the number of teams on gcn offload devices is implemented. The number of teams is limited by twice the number of compute units (one team is executed on one compute unit). This avoids queueing unnessecary many teams and a corresponding allocation of large amounts of memory. Without that limitation the memory allocation for a large number of user-specified teams can result in an "memory access fault". A limitation of the number of teams is already also implemented for nvptx devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c). gcc/ChangeLog: * gimplify.cc (optimize_target_teams): Set initial num_teams_upper to "-2" instead of "1" for non-existing num_teams clause in order to disambiguate from the case of an existing num_teams clause with value 1. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to allow processing of device-specific values. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * icv-device.c (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. (omp_set_teams_thread_limit): Likewise. * icv.c (omp_set_teams_thread_limit): Removed. (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. * libgomp.texi: Updated documentation for nvptx and gcn corresponding to the limitation of the number of teams. * plugin/plugin-gcn.c (limit_teams): New helper function that limits the number of teams by twice the number of compute units. (parse_target_attributes): Limit the number of teams on gcn offload devices. * target.c (get_gomp_offload_icvs): Added teams_thread_limit_var handling. (gomp_load_image_to_device): Added a size check for the ICVs struct variable. (gomp_copy_back_icvs): New function that is used in GOMP_target_ext to copy back the ICV values from device to host. (GOMP_target_ext): Update the number of teams and threads in the kernel args also considering device-specific values. * testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading of OMP_TEAMS_THREAD_LIMIT from the environment. * testsuite/libgomp.c-c++-common/icv-5.c: Extended. * testsuite/libgomp.c-c++-common/icv-6.c: Extended. * testsuite/libgomp.c-c++-common/icv-7.c: Extended. * testsuite/libgomp.c-c++-common/icv-9.c: New test. * testsuite/libgomp.fortran/icv-5.f90: New test. * testsuite/libgomp.fortran/icv-6.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-teams-1.c: Adapt expected values for num_teams from "1" to "-2" in cases without num_teams clause. * g++.dg/gomp/target-teams-1.C: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise.	2022-12-06 06:03:50 -08:00
Tobias Burnus	9f80367e53	libgomp.texi: Fix a OpenMP 5.2 and a TR11 impl-status item libgomp/ * libgomp.texi (OpenMP 5.2): Add missing 'the'. (TR11): Add missing '@tab N @tab'.	2022-12-06 09:51:12 +01:00
GCC Administrator	6eea85a95e	Daily bump.	2022-12-01 00:17:51 +00:00
Tobias Burnus	e0b95c2e8b	libgomp.texi: List GCN's 'gfx803' under OpenMP Context Selectors libgomp/ChangeLog: * libgomp.texi (OpenMP Context Selectors): Add 'gfx803' to gcn's isa.	2022-11-30 11:23:41 +01:00
Paul-Antoine Arras	1fd508744e	amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors Add support for gfx803 as an alias for fiji. Add test cases for all supported 'isa' values. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_omp_device_kind_arch_isa): Add gfx803. * config/gcn/t-omp-device: Add gfx803. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-fiji.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx803.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx900.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx906.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx908.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: New test. * testsuite/libgomp.c/declare-variant-4.h: New header file.	2022-11-30 10:51:42 +01:00
GCC Administrator	b774853514	Daily bump.	2022-11-29 00:18:09 +00:00
Tobias Burnus	091b6dbc48	OpenMP/Fortran: Permit end-clause on directive gcc/fortran/ChangeLog: * openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES, OMP_SECTIONS_CLAUSES): Add 'nowait'. (OMP_SINGLE_CLAUSES): Add 'nowait' and 'copyprivate'. (gfc_match_omp_distribute_parallel_do, gfc_match_omp_distribute_parallel_do_simd, gfc_match_omp_parallel_do, gfc_match_omp_parallel_do_simd, gfc_match_omp_parallel_sections, gfc_match_omp_teams_distribute_parallel_do, gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'. (gfc_match_omp_workshare): Match 'nowait' clause. (gfc_match_omp_end_single): Use clause matcher for 'nowait'. (resolve_omp_clauses): Reject 'nowait' + 'copyprivate'. * parse.cc (decode_omp_directive): Break too long line. (parse_omp_do, parse_omp_structured_block): Diagnose duplicated 'nowait' clause. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.2): Mark end-directive as Y. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/copyprivate-1.f90: New test. * gfortran.dg/gomp/copyprivate-2.f90: New test. * gfortran.dg/gomp/nowait-2.f90: Move dg-error tests ... * gfortran.dg/gomp/nowait-4.f90: ... to this new file. * gfortran.dg/gomp/nowait-5.f90: New test. * gfortran.dg/gomp/nowait-6.f90: New test. * gfortran.dg/gomp/nowait-7.f90: New test. * gfortran.dg/gomp/nowait-8.f90: New test.	2022-11-28 11:10:31 +01:00
GCC Administrator	d769c50408	Daily bump.	2022-11-26 00:17:08 +00:00
Sandra Loosemore	309e2d95e3	OpenMP: Generate SIMD clones for functions with "declare target" This patch causes the IPA simdclone pass to generate clones for functions with the "omp declare target" attribute as if they had "omp declare simd", provided the function appears to be suitable for SIMD execution. The filter is conservative, rejecting functions that write memory or that call other functions not known to be safe. A new option -fopenmp-target-simd-clone is added to control this transformation; it's enabled for offload processing at -O2 and higher. gcc/ChangeLog: * common.opt (fopenmp-target-simd-clone): New option. (target_simd_clone_device): New enum to go with it. * doc/invoke.texi (-fopenmp-target-simd-clone): Document. * flag-types.h (enum omp_target_simd_clone_device_kind): New. * omp-simd-clone.cc (auto_simd_fail): New function. (auto_simd_check_stmt): New function. (plausible_type_for_simd_clone): New function. (ok_for_auto_simd_clone): New function. (simd_clone_create): Add force_local argument, make the symbol have internal linkage if it is true. (expand_simd_clones): Also check for cloneable functions with "omp declare target". Pass explicit_p argument to simd_clone.compute_vecsize_and_simdlen target hook. * opts.cc (default_options_table): Add -fopenmp-target-simd-clone. * target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): Add bool explicit_p argument. * doc/tm.texi: Regenerated. * config/aarch64/aarch64.cc (aarch64_simd_clone_compute_vecsize_and_simdlen): Update. * config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): Update. * config/i386/i386.cc (ix86_simd_clone_compute_vecsize_and_simdlen): Update. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-simd-clone-1.C: New. * g++.dg/gomp/target-simd-clone-2.C: New. * gcc.dg/gomp/target-simd-clone-1.c: New. * gcc.dg/gomp/target-simd-clone-2.c: New. * gcc.dg/gomp/target-simd-clone-3.c: New. * gcc.dg/gomp/target-simd-clone-4.c: New. * gcc.dg/gomp/target-simd-clone-5.c: New. * gcc.dg/gomp/target-simd-clone-6.c: New. * gcc.dg/gomp/target-simd-clone-7.c: New. * gcc.dg/gomp/target-simd-clone-8.c: New. * lib/scanoffloadipa.exp: New. libgomp/ChangeLog: * testsuite/lib/libgomp.exp: Load scanoffloadipa.exp library. * testsuite/libgomp.c/target-simd-clone-1.c: New. * testsuite/libgomp.c/target-simd-clone-2.c: New. * testsuite/libgomp.c/target-simd-clone-3.c: New.	2022-11-25 18:13:22 +00:00
Tobias Burnus	9f9d128f45	libgomp: Add no-target-region rev offload test + fix plugin-nvptx OpenMP permits that a 'target device(ancestor:1)' is called without being enclosed in a target region - using the current device (i.e. the host) in that case. This commit adds a testcase for this. In case of nvptx, the missing on-device 'GOMP_target_ext' call causes that it and also the associated on-device GOMP_REV_OFFLOAD_VAR variable are not linked in from nvptx's libgomp.a. Thus, handle the failing cuModuleGetGlobal gracefully by disabling reverse offload and assuming that the failure is fine. libgomp/ChangeLog: * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Use unsigned int for 'i' to match 'fn_entries'; regard absent GOMP_REV_OFFLOAD_VAR as valid and the code having no reverse-offload code. * testsuite/libgomp.c-c++-common/reverse-offload-2.c: New test.	2022-11-25 13:48:17 +01:00

1 2 3 4 5 ...

1983 Commits