1741 Commits

Author SHA1 Message Date
GCC Administrator
a4ae11d935 Daily bump. 2022-05-31 00:19:02 +00:00
Jakub Jelinek
93e6064289 libgomp: Don't define GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC for _aligned_malloc [PR105745]
since apparently _aligned_malloc requires freeing with _aligned_free and:
 /* Defined if gomp_aligned_alloc doesn't use fallback version
    and free can be used instead of gomp_aligned_free.  */
 #define GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC 1
so the second condition isn't satisfied.  For uses inside of the OpenMP
allocators we can still use _aligned_malloc but we need to call _aligned_free
in gomp_aligned_free.

2022-05-28  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/105745
	* libgomp.h (GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC): Don't define for
	defined(HAVE__ALIGNED_MALLOC) case.
	* alloc.c (gomp_aligned_alloc): Move defined(HAVE__ALIGNED_MALLOC)
	handling as last option before fallback instead of first.
	(gomp_aligned_free): For defined(HAVE__ALIGNED_MALLOC) call
	_aligned_free.

(cherry picked from commit 42fd2cd932)
2022-05-30 05:36:32 +02:00
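
The pairing rule described in the commit above (memory from _aligned_malloc must be released with _aligned_free, not free) can be illustrated with a minimal sketch. The sketch_* names are hypothetical and this is not libgomp's actual alloc.c; the non-Windows branch assumes posix_memalign is available, whose memory may be released with plain free.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>	/* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: allocate SIZE bytes aligned to AL, a power of two.  */
void *
sketch_aligned_alloc (size_t al, size_t size)
{
  assert (al != 0 && (al & (al - 1)) == 0);
  if (al < sizeof (void *))
    al = sizeof (void *);
#ifdef _WIN32
  /* Memory obtained here must NOT be passed to free.  */
  return _aligned_malloc (size, al);
#else
  void *p = NULL;
  if (posix_memalign (&p, al, size) != 0)
    return NULL;
  return p;
#endif
}

/* Hypothetical wrapper: release memory using the matching deallocator.  */
void
sketch_aligned_free (void *p)
{
#ifdef _WIN32
  _aligned_free (p);
#else
  free (p);
#endif
}
```

Because the deallocator differs per platform, callers in this sketch always go through sketch_aligned_free; a "free can be used instead" shortcut is only valid on the non-Windows branch.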
GCC Administrator
a048e606e6 Daily bump. 2022-05-18 00:19:18 +00:00
Jakub Jelinek
978036a016 libgomp: Clarify that omp_display_env is fully implemented
OpenMP 5.2 added
"When called from within a target region the effect is unspecified."
restriction to omp_display_env, so it is ok not to support it in
target regions (worst case we could add an empty implementation
or one with __builtin_trap in there).

2022-05-17  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi (OpenMP 5.1): Remove "Not inside target regions"
	comment for omp_display_env feature.

(cherry picked from commit 741478ed3e)
2022-05-17 17:03:52 +02:00
Jakub Jelinek
1ea978e306 Update ChangeLog and version files for release 2022-05-06 07:07:53 +00:00
GCC Administrator
ab73f5b0bc Daily bump. 2022-04-29 00:19:02 +00:00
Thomas Schwinge
2a570f11a2 Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717]
That got broken by recent commit b220243191
"fortran: Fix up gfc_trans_oacc_construct [PR104717]".

	PR fortran/104717
	libgomp/
	* testsuite/libgomp.oacc-fortran/print-1.f90: Add OpenACC
	privatization scanning.  For GCN offloading compilation, raise
	'-mgang-private-size'.
2022-04-28 15:15:29 +02:00
GCC Administrator
01ad093b5f Daily bump. 2022-04-27 00:16:46 +00:00
Jakub Jelinek
57a957cb71 libgomp: Fix up two non-GOMP_USE_ALIGNED_WORK_SHARES related issues [PR105358]
Last fall I changed struct gomp_work_share so that it doesn't have an
__attribute__((aligned (64))) lock member in the middle unless the target has
a non-emulated aligned allocator; otherwise it just makes sure the first and
second halves are 64 bytes apart for cache line reasons, but doesn't make
the struct 64-byte aligned itself, and so we can use normal allocators for it.

When the struct isn't 64-byte aligned, the amount of tail padding significantly
decreases, to 0 or 4 bytes or so.  The library uses that tail padding, when it
is large enough, for the ordered_team_ids array (array of uints) and/or the
memory for lastprivate conditional temporaries (the latter wants to guarantee
long long alignment).
The problem with it on ia32 darwin9 is that while the struct contains
long long members, long long is just 4 byte aligned while __alignof__(long long)
is 8.  That causes problems in gomp_init_work_share, where we currently rely on
if offsetof (struct gomp_work_share, inline_ordered_team_ids) is long long
aligned, then that tail array will be aligned at runtime and so no extra
memory for dynamic realignment will be needed (that is false when the whole
struct doesn't have long long alignment).  The remaining hunks address
another problem: we compute INLINE_ORDERED_TEAM_IDS_OFF
as the above offsetof aligned up to a long long boundary and subtract it
from sizeof (struct gomp_work_share).
When unlucky, the sizeof isn't a multiple of 8 and INLINE_ORDERED_TEAM_IDS_OFF
is 4 bigger than it; as the subtraction is done in size_t, we end up with
(size_t) -4, so the comparison doesn't really work.

The fixes add additional conditions to make it work properly, but all of them
should be evaluated at compile time when optimizing and so shouldn't slow
anything.

2022-04-26  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/105358
	* work.c (gomp_init_work_share): Don't mask off adjustment for
	dynamic long long realignment if struct gomp_work_share has smaller
	alignof than long long.
	* loop.c (GOMP_loop_start): Don't use inline_ordered_team_ids if
	struct gomp_work_share has smaller alignof than long long or if
	sizeof (struct gomp_work_share) is smaller than
	INLINE_ORDERED_TEAM_IDS_OFF.
	* loop_ull.c (GOMP_loop_ull_start): Likewise.
	* sections.c (GOMP_sections2_start): Likewise.
2022-04-26 08:57:17 +02:00
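
The size_t wrap-around described in the commit above can be reproduced in isolation. This is a minimal sketch with hypothetical function names and made-up sizes (60 and 64); it is not the code from loop.c, only an illustration of why an extra "is the struct at least as large as the offset" condition is needed before subtracting.

```c
#include <stddef.h>

/* Does NEED bytes fit into the tail space [OFF, TOTAL)?
   Naive version: TOTAL - OFF wraps around when OFF > TOTAL.  */
int
tail_fits_naive (size_t total, size_t off, size_t need)
{
  return need <= total - off;
}

/* Checked version: rule out OFF > TOTAL before subtracting.  */
int
tail_fits_checked (size_t total, size_t off, size_t need)
{
  return total >= off && need <= total - off;
}
```

With total = 60 and off = 64 the naive subtraction yields (size_t) -4, so the naive check accepts any request, while the checked version rejects it.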
GCC Administrator
98de0da603 Daily bump. 2022-04-26 00:16:51 +00:00
Jakub Jelinek
b220243191 fortran: Fix up gfc_trans_oacc_construct [PR104717]
So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs
for which that function is invoked need an extra artificial BIND_EXPR
around their body so that we move all variables of the bodies.

The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK
or OMP_TARGET and for OpenACC constructs that behave similarly to
OMP_TARGET, but the Fortran FE only does that for OpenMP constructs.

The following patch does that for OpenACC constructs too.

	PR fortran/104717
	gcc/fortran/
	* trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body
	in an extra BIND_EXPR.
	gcc/testsuite/
	* gfortran.dg/goacc/pr104717.f90: New test.
	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Adjust.
	libgomp/
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Adjust.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-04-25 23:14:02 +02:00
GCC Administrator
247bbed1b8 Daily bump. 2022-04-14 00:16:40 +00:00
Jakub Jelinek
db14bb4c6b libgomp: Fix a documentation typo
This fixes a typo in the 5.0 feature support table.

2022-04-13  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi: Fix a typo - mutexinouset -> mutexinoutset.
2022-04-13 09:46:53 +02:00
GCC Administrator
80eb8ec672 Daily bump. 2022-04-07 00:16:45 +00:00
Thomas Schwinge
5e431ae4cc Move 'libgomp/plugin/cuda/cuda.h' to 'include/cuda/cuda.h'
... so that it may be used by other projects that inherit GCC's 'include'
directory.

	include/
	* cuda/cuda.h: New file.
	libgomp/
	* plugin/cuda/cuda.h: Remove file.
	* plugin/plugin-nvptx.c [PLUGIN_NVPTX_DYNAMIC]: Include
	"cuda/cuda.h" instead of <cuda.h>.
	* plugin/configfrag.ac <PLUGIN_NVPTX_DYNAMIC>: Don't set
	'PLUGIN_NVPTX_CPPFLAGS'.
	* configure: Regenerate.
2022-04-06 22:30:14 +02:00
GCC Administrator
9d84ed6812 Daily bump. 2022-04-06 00:16:22 +00:00
Chung-Lin Tang
b0af8e3a50 OpenMP: Fix nested use_device_ptr
This patch fixes a bug in lower_omp_target, where for Fortran arrays,
the expanded sender assignment is wrongly using the variable in the
current ctx, instead of the one looked-up outside, which is causing
use_device_ptr/addr to fail to work when used inside an omp-parallel
(where the omp child_fn is split away from the original).

The fix is inside omp-low.cc, though because the omp_array_data langhook
is used only by Fortran, this is essentially Fortran-specific.

2022-04-05  Chung-Lin Tang  <cltang@codesourcery.com>

gcc/ChangeLog:

	* omp-low.cc (lower_omp_target): Use outer context looked-up 'var' as
	argument to lang_hooks.decls.omp_array_data, instead of 'ovar' from
	current clause.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/use_device_ptr-4.f90: New testcase.
2022-04-05 08:31:34 -07:00
GCC Administrator
5e09bb1b2e Daily bump. 2022-04-05 00:16:20 +00:00
Tom de Vries
88cffa1a07 [libgomp/testsuite] Fix libgomp.fortran/examples-4/declare_target-{1,2}.f90
The test-cases libgomp.fortran/examples-4/declare_target-{1,2}.f90 mean to
set an nvptx-specific limit using offload_target_nvptx, but also change
behaviour for amd.

That is, there is now a difference in behaviour between:
- a compiler configured for GCN offloading, and
- a compiler configured for both GCN and nvptx offloading.

Fix this by using on_device_arch_nvptx instead.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-04-04  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Use
	on_device_arch_nvptx instead of offload_target_nvptx.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
2022-04-04 13:37:19 +02:00
GCC Administrator
8af4270d3f Daily bump. 2022-04-04 08:00:40 +00:00
Tom de Vries
bfa9f660d2 [libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90
When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
into:
...
FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
  -DGOMP_NVPTX_JIT=-O0 execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
  -DGOMP_NVPTX_JIT=-O0 execution test
...

Fix this by further limiting recursion depth in the test-cases for nvptx.

Furthermore, make the recursion depth limiting nvptx-specific.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-04-01  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Define
	and use REC_DEPTH.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
2022-04-01 13:23:16 +02:00
Tom de Vries
065e25f633 [libgomp, testsuite, nvptx] Fix dg-output test in vector-length-128-7.c
When running test-case libgomp.oacc-c-c++-common/vector-length-128-7.c on an
RTX A2000 (sm_86) with driver 510.60.02 I run into:
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-7.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  \
  output pattern test
...

The failing check verifies the launch dimensions:
...
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: \
                launch gangs=1, workers=8, vectors=128" } */
...
which fails because (as we can see with GOMP_DEBUG=1) the actual num_workers
is 6:
...
  nvptx_exec: kernel main$_omp_fn$0: launch gangs=1, workers=6, vectors=128
...

This is due to the result of cuOccupancyMaxPotentialBlockSize (which suggests
'a launch configuration with reasonable occupancy') printed just before:
...
cuOccupancyMaxPotentialBlockSize: grid = 52, block = 768
...
[ Note: 6 * 128 == 768. ]

Fix this by updating the check to allow num_workers in the range 1 to 8.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-04-01  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Fix
	num_workers check.
2022-04-01 13:22:07 +02:00
GCC Administrator
9f774626c0 Daily bump. 2022-03-30 00:16:49 +00:00
chenglulu
34024b7150 LoongArch Port: libgomp
2022-03-29  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

libgomp/ChangeLog:

	* configure.tgt: Add LoongArch triplet.
2022-03-29 17:43:35 +08:00
GCC Administrator
aab0127dae Daily bump. 2022-03-29 00:17:13 +00:00
Tom de Vries
52f42dce15 [libgomp, testsuite] Fix hardcoded libexec in plugin/configfrag.ac
When building an nvptx offloading configuration on openSUSE Leap 15.3, the
site script /usr/share/site/x86_64-unknown-linux-gnu is activated, setting
libexecdir to ${exec_prefix}/lib rather than ${exec_prefix}/libexec:
...
| # If user did not specify libexecdir, set the correct target:
| # Nor FHS nor openSUSE allow prefix/libexec. Let's default to prefix/lib.
|
| if test "$libexecdir" = '${exec_prefix}/libexec' ; then
|       libexecdir='${exec_prefix}/lib'
| fi
...

However, in libgomp libgomp/plugin/configfrag.ac we hardcode libexec:
...
    # Configure additional search paths.
    if test x"$tgt_dir" != x; then
      offload_additional_options="$offload_additional_options \
        -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) \
	-B$tgt_dir/bin"
...

Fix this by using /$(libexecdir:\$(exec_prefix)/%=%)/ instead of /libexec/.

Tested on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

2022-03-28  Tom de Vries  <tdevries@suse.de>

	* plugin/configfrag.ac: Use /$(libexecdir:\$(exec_prefix)/%=%)/
	instead of /libexec/.
	* configure: Regenerate.
2022-03-28 14:09:02 +02:00
GCC Administrator
31e989a278 Daily bump. 2022-03-26 10:22:39 +00:00
Tom de Vries
8570cce7c7 [libgomp, testsuite] Scale down some OpenACC test-cases
When a display manager is running on an nvidia card, all CUDA kernel launches
get a 5 seconds watchdog timer.

Consequently, when running the libgomp testsuite with nvptx accelerator and
GOMP_NVPTX_JIT=-O0 we run into a few FAILs like this:
...
libgomp: cuStreamSynchronize error: the launch timed out and was terminated
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 \
  execution test
...

Fix this by scaling down the failing test-cases by default, and reverting to
the original behaviour for GCC_TEST_RUN_EXPENSIVE=1.

Tested on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

2022-03-25  Tom de Vries  <tdevries@suse.de>

	PR libgomp/105042
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Reduce
	execution time.
	* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Same.
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Same.
2022-03-25 13:51:48 +01:00
GCC Administrator
d1ca63a1b7 Daily bump. 2022-03-24 00:16:44 +00:00
Tobias Burnus
1002a7ace1 LTO: Fixes for renaming issues with offload/OpenMP [PR104285]
gcc/lto/ChangeLog:

	PR middle-end/104285
	* lto-partition.cc (maybe_rewrite_identifier): Use get_identifier
	for the returned string to be usable as hash key.
	(validize_symbol_for_target): Hence, use return value directly.
	(privatize_symbol_name_1): Track maybe_rewrite_identifier renames.
	* lto.cc (offload_handle_link_vars): Move function up before ...
	(do_whole_program_analysis): Call it after static renamings.
	(lto_main): Move call after static renamings.

libgomp/ChangeLog:

	PR middle-end/104285
	* testsuite/libgomp.c++/target-same-name-2-a.C: New test.
	* testsuite/libgomp.c++/target-same-name-2-b.C: New test.
	* testsuite/libgomp.c++/target-same-name-2.C: New test.
	* testsuite/libgomp.c-c++-common/target-same-name-1-a.c: New test.
	* testsuite/libgomp.c-c++-common/target-same-name-1-b.c: New test.
	* testsuite/libgomp.c-c++-common/target-same-name-1.c: New test.
2022-03-23 09:44:39 +01:00
GCC Administrator
a2287813b1 Daily bump. 2022-03-23 00:16:45 +00:00
Tom de Vries
a624388b95 [nvptx] Add warp sync at simt exit
Consider this code (with N defined to 1024):
...
  float v = 0.0;
  #pragma omp target map(tofrom: v)
  #pragma omp parallel for simd
  for (int i = 0 ; i < N; i++)
    {
      #pragma omp atomic update
      v = v + 1.0;
    }
...

It hangs when executing on target board unix/-foffload=-misa=sm_75, using
drivers 470.103.01 and 510.54 on a T400 board (sm_75).

I'm tentatively identifying the problem as a bug in -muniform-simt for
architectures that support Independent Thread Scheduling (sm_70 and later).

The problem -muniform-simt is trying to address is to make sure that a
register produced outside an openmp simd region is available when used in any
lane inside an simd region.

The solution is to, outside an simd region, execute in all warp lanes, thus
producing consistent values in result registers in each warp thread.

This approach doesn't work when executing in all warp lanes multiplies the
side effects from 1 to 32 separate side effects, which is the case for atomic
insns.  So atomic insns are rewritten to execute only in lane 0, and if
there are any results, those are propagated to the other threads in the warp.
[ And likewise for system calls malloc, free, vprintf. ]

Now, consider a non-atomic update: ld, add, store.  The store has side
effects; are those multiplied or not?

Pre-sm_70 we can assume that at the end of an SIMT region, any divergent
control flow has reconverged, and we have a uniform warp, executing in lock
step.  So:
- the load will load the same value into the result register across the warp,
- the add will write the same value into the result register across the warp,
- the store will write the same value to the same memory location, 32 times,
  at once, having the result of a single store.
So, no side-effect multiplication (well, at least that's the observation).

Starting sm_70, the threads in a warp are no longer guaranteed to reconverge
after divergence.  There's a "Convergence Optimizer" that can identify
that it is safe for a warp to reconverge, but that works only as long as the
code does not contain "synchronizing operations".

Consequently, the ld, add, store sequence can be executed by a non-uniform
warp, which means the side effects can have multiplied, and the registers are
no longer guaranteed to be in sync.

The atomic update in the example above is translated using an atom.cas loop,
which means that we have divergence (because only one thread is allowed to
succeed at a time) and the "Convergence Optimizer" doesn't reconverge probably
because the atom.cas counts as a "synchronizing operation".  So, it seems
plausible that the root cause for the mentioned hang is the problem described
above.

Fix this by adding an explicit warp sync at simt exit.

Note that we're assuming here that the warp will stay uniform until the next
SIMT region entry.

Tested on x86_64 with nvptx accelerator.

gcc/ChangeLog:

2022-03-09  Tom de Vries  <tdevries@suse.de>

	PR target/104916
	PR target/104783
	* config/nvptx/nvptx.md (define_expand "omp_simt_exit"): Emit warp
	sync (or uniform warp check for mptx < 6.0).

libgomp/ChangeLog:

2022-03-15  Tom de Vries  <tdevries@suse.de>

	PR target/104916
	PR target/104783
	* testsuite/libgomp.c/pr104783-2.c: New test.
2022-03-22 14:35:34 +01:00
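
The commit above mentions that the atomic update of a float is translated using an atom.cas loop. As a host-side illustration only, here is a minimal C11 sketch of such a compare-and-swap loop; the function names are hypothetical, it assumes a 32-bit IEEE float, and it is not the code GCC emits for nvptx.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bits");

/* Add INC to the float stored as raw bits in *CELL, retrying until the
   compare-and-swap succeeds.  Only one contender can succeed per round,
   which is the source of the divergence discussed above.  */
void
sketch_atomic_add_float (_Atomic uint32_t *cell, float inc)
{
  uint32_t old_bits = atomic_load (cell);
  for (;;)
    {
      float old_val, new_val;
      uint32_t new_bits;
      memcpy (&old_val, &old_bits, sizeof old_val);
      new_val = old_val + inc;
      memcpy (&new_bits, &new_val, sizeof new_bits);
      /* On failure, old_bits is refreshed with the current contents.  */
      if (atomic_compare_exchange_weak (cell, &old_bits, new_bits))
	break;
    }
}

/* Read the current float value out of *CELL.  */
float
sketch_load_float (_Atomic uint32_t *cell)
{
  uint32_t bits = atomic_load (cell);
  float val;
  memcpy (&val, &bits, sizeof val);
  return val;
}
```

Starting from a cell holding the bits of 0.0 and adding 1.0 a total of N = 1024 times, as in the example loop, leaves 1024.0 in the cell.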
GCC Administrator
8ca61ad148 Daily bump. 2022-03-19 00:16:22 +00:00
Tobias Burnus
c133bdfa9e Fortran/OpenMP: Fix privatization of associated names
gfc_omp_predetermined_sharing causes the associate-name pointer variable
to be OMP_CLAUSE_DEFAULT_FIRSTPRIVATE, which is fine. However, the associated
selector is shared. Thus, the target of associate-name pointer should not get
copied. (It was before but because of gfc_omp_privatize_by_reference returning
false, the selector was not only wrongly copied but this was also not done
properly.)

gcc/fortran/ChangeLog:

	PR fortran/103039
	* trans-openmp.cc (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor):
	Only privatize pointer for associate names.

libgomp/ChangeLog:

	PR fortran/103039
	* testsuite/libgomp.fortran/associate4.f90: New test.
2022-03-18 17:40:22 +01:00
Tom de Vries
093cdadbce [openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR
Consider test-case pr104952-1.c, included in this commit, containing:
...
  #pragma omp target map(tofrom:result) map(to:arr)
  #pragma omp simd reduction(||: result)
...

When run on x86_64 with nvptx accelerator, the test-case either aborts or
hangs.

The reduction clause is translated by the SIMT code (active for nvptx) as a
butterfly reduction loop with this butterfly shuffle / update pair:
...
  D.2163 = D.2163 || .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164)
...
in the loop body.

The problem is that the butterfly shuffle is possibly not executed, while it
needs to be executed unconditionally.

Fix this by translating instead as:
...
  D.tmp_bfly = .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164)
  D.2163 = D.2163 || D.tmp_bfly
...

Tested on x86_64-linux with nvptx accelerator.

gcc/ChangeLog:

2022-03-17  Tom de Vries  <tdevries@suse.de>

	PR target/104952
	* omp-low.cc (lower_rec_input_clauses): Make sure GOMP_SIMT_XCHG_BFLY
	is executed unconditionally.

libgomp/ChangeLog:

2022-03-17  Tom de Vries  <tdevries@suse.de>

	PR target/104952
	* testsuite/libgomp.c/pr104952-1.c: New test.
	* testsuite/libgomp.c/pr104952-2.c: New test.
2022-03-18 15:45:13 +01:00
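
The short-circuit problem fixed by the commit above can be shown in plain C. In this minimal sketch, fake_xchg_bfly is a hypothetical stand-in for .GOMP_SIMT_XCHG_BFLY that merely counts its invocations; the real shuffle is a cross-lane operation that every lane has to execute.

```c
/* Counts how often the stand-in shuffle was executed.  */
int shuffle_calls;

/* Hypothetical stand-in for the butterfly shuffle.  */
int
fake_xchg_bfly (int value)
{
  shuffle_calls++;
  return value;
}

/* Mirrors 'acc = acc || shuffle (...)': the shuffle is skipped whenever
   ACC is already nonzero, because || short-circuits.  */
int
reduce_step_short_circuit (int acc, int other)
{
  return acc || fake_xchg_bfly (other);
}

/* Mirrors the fix: run the shuffle into a temporary first, then combine.  */
int
reduce_step_unconditional (int acc, int other)
{
  int tmp = fake_xchg_bfly (other);
  return acc || tmp;
}
```

Both steps return the same value; they differ only in whether the shuffle runs when acc is already true.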
Jakub Jelinek
c0009a3b98 openmp: Fix up gomp_affinity_init_numa_domains
On Thu, Nov 11, 2021 at 02:14:05PM +0100, Thomas Schwinge wrote:
> There appears to be yet another issue: there still are quite a number of
> 'FAIL: libgomp.c/places-10.c execution test' reports on
> <gcc-testresults@gcc.gnu.org>.  Also in my testing, on a system
> where '/sys/devices/system/node/online' contains '0-1', I get a FAIL:
>
>     [...]
>     OPENMP DISPLAY ENVIRONMENT BEGIN
>       _OPENMP = '201511'
>       OMP_DYNAMIC = 'FALSE'
>       OMP_NESTED = 'FALSE'
>       OMP_NUM_THREADS = '8'
>       OMP_SCHEDULE = 'DYNAMIC'
>       OMP_PROC_BIND = 'TRUE'
>       OMP_PLACES = '{0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30},{FAIL: libgomp.c/places-10.c execution test

I've finally managed to debug this (by dumping the /sys/ files used on
an affected system in the Fedora build system, replacing /sys/ with /tmp/
in the gcc sources and populating those files there); I think the following
patch ought to fix it.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

	* config/linux/affinity.c (gomp_affinity_init_numa_domains): Move seen
	variable next to pl variable.
2022-03-18 11:02:13 +01:00
GCC Administrator
e9ea30165b Daily bump. 2022-03-18 00:16:27 +00:00
Thomas Schwinge
c43cb355f2 Enhance further testcases to verify OpenACC 'kernels' decomposition
gcc/testsuite/
	* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
	* c-c++-common/goacc/kernels-loop-g.c: Likewise.
	* c-c++-common/goacc/nesting-1.c: Likewise.
	* gcc.dg/goacc/nested-function-1.c: Likewise.
	* gfortran.dg/goacc/common-block-3.f90: Likewise.
	* gfortran.dg/goacc/nested-function-1.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
	Enhance.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
2022-03-17 08:51:32 +01:00
Thomas Schwinge
004fc4f2fc Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
As originally introduced in commit 11b8286a83
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	gcc/testsuite/
	* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
	* gfortran.dg/goacc/common-block-3.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Enhance.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
2022-03-17 08:47:09 +01:00
GCC Administrator
9fc8f278eb Daily bump. 2022-03-17 00:17:00 +00:00
Marcel Vollweiler
be093b8dcc OpenMP, Fortran: Bugfix for omp_set_num_teams.
This patch fixes a small bug in the omp_set_num_teams implementation.

libgomp/ChangeLog:

	* fortran.c (omp_set_num_teams_8_): Call omp_set_num_teams instead of
	omp_set_max_active_levels.
	* testsuite/libgomp.fortran/icv-8.f90: New test.
2022-03-16 07:38:54 -07:00
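
The kind of bug fixed by the commit above, an integer(8) Fortran entry point forwarding to the wrong routine, can be sketched as follows. All sketch_* names and the two global variables are hypothetical stand-ins, and the clamping to int is an assumption for illustration; this is not the code from fortran.c.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for two independent runtime settings.  */
int sketch_num_teams_icv;
int sketch_max_active_levels_icv;

void
sketch_set_num_teams (int n)
{
  if (n >= 0)
    sketch_num_teams_icv = n;
}

void
sketch_set_max_active_levels (int n)
{
  if (n >= 0)
    sketch_max_active_levels_icv = n;
}

/* 64-bit entry point taking its argument by reference: narrow to int and
   forward to the matching setter.  The bug pattern would be forwarding to
   sketch_set_max_active_levels here instead.  */
void
sketch_set_num_teams_8 (const int64_t *num_teams)
{
  int64_t v = *num_teams;
  int n = v > INT_MAX ? INT_MAX : v < INT_MIN ? INT_MIN : (int) v;
  sketch_set_num_teams (n);
}
```

A test along the lines of icv-8.f90 would then set the value through the 64-bit entry point and check that the number of teams changed while the maximum number of active levels did not.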
Thomas Schwinge
ab46fc7c3b OpenACC privatization diagnostics vs. 'assert' [PR102841]
It's an orthogonal concern why these diagnostics do appear at all for
non-offloaded OpenACC constructs (where they're not relevant at all); PR90115.

Depending on how 'assert' is implemented, it may cause temporaries to be
created, and/or may lower into 'COND_EXPR's, and
'gcc/gimplify.cc:gimplify_cond_expr' uses 'create_tmp_var (type, "iftmp")'.

Fix-up for commit 11b8286a83
"[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]".

	PR testsuite/102841
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Adjust.
2022-03-16 10:12:09 +01:00
GCC Administrator
b9756c0858 Daily bump. 2022-03-14 00:16:20 +00:00
Tobias Burnus
1b85638aff texi + c-target.def: Fix typos
gcc/c-family/ChangeLog:

	* c-target.def (check_string_object_format_arg): Fix description typo.

gcc/ChangeLog:

	* doc/invoke.texi: Fix typos.
	* doc/tm.texi.in: Remove duplicated word.
	* doc/tm.texi: Regenerate.

libgomp/ChangeLog:

	* libgomp.texi: Fix typo.
2022-03-13 10:23:07 +01:00
GCC Administrator
57eeedda23 Daily bump. 2022-03-13 00:16:20 +00:00
Thomas Schwinge
a07b8f4fb7 OpenACC 'kernels' decomposition: resolve wrong-code cases unless manually making certain variables addressable [PR100280, PR104892]
Currently in OpenACC 'kernels' decomposition, there is special handling of
'GOMP_MAP_FORCE_TOFROM', documented to be done to avoid "internal compiler
errors in later passes".  For performance reasons, the current repetitive
to/from device copying for every region is not ideal, compared to using
'present' clauses, as done for almost all other 'GOMP_MAP_*'.  Also, the
current special handling (incomplete, evidently) is the reason for the PR104892
misbehavior.  For PR100280 etc. we've resolved all such known ICEs -- removing
the special handling for 'GOMP_MAP_FORCE_TOFROM' now resolves PR104892.

	PR middle-end/100280
	PR middle-end/104892
	gcc/
	* omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1):
	Remove special handling of 'GOMP_MAP_FORCE_TOFROM'.
	gcc/testsuite/
	* c-c++-common/goacc/kernels-decompose-2.c: Adjust.
	* c-c++-common/goacc/kernels-decompose-pr100400-1-1.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr100400-1-2.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr100400-1-3.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr100400-1-4.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104061-1-1.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104774-1.c: Likewise.
	* gfortran.dg/goacc/classify-kernels.f95: Likewise.
	* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust.
	* testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90:
	Likewise.
2022-03-12 15:37:27 +01:00
Thomas Schwinge
535afbd959 OpenACC 'kernels' decomposition: wrong-code cases unless manually making certain variables addressable [PR104892]
Document a few examples of the status quo.

	PR middle-end/104892
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Point
	to PR104892.
	* testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise,
	enable '--param=openacc-kernels=decompose' and adjust.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90:
	Likewise.
2022-03-12 15:37:27 +01:00
Thomas Schwinge
2e53fa7bb2 Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
As originally introduced in commit 11b8286a83
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/default-1.c: Enhance.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12 14:00:46 +01:00
Thomas Schwinge
337ed336d7 OpenACC 'kernels' decomposition: Mark variables used in 'present' clauses as addressable [PR100280, PR104086]
... like in recent commit 9b32c1669a
"OpenACC 'kernels' decomposition: Mark variables used in synthesized
data clauses as addressable [PR100280]".  Otherwise, we may run into
'gcc/omp-low.cc:lower_omp_target':

    13125                       else if (is_gimple_reg (var))
    13126                         {
    13127                           gcc_assert (offloaded);

	PR middle-end/100280
	PR middle-end/104086
	gcc/
	* omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1):
	Mark variables used in 'present' clauses as addressable.
	* omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Gracefully
	handle duplicate 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'.
	gcc/testsuite/
	* c-c++-common/goacc/kernels-decompose-pr104086-1.c: Adjust,
	extend.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
	Merge this...
	* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c:
	..., and this...
	* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: ... into
	this, and adjust.
	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
	Extend.
2022-03-12 13:02:55 +01:00
GCC Administrator
5e28be8966 Daily bump. 2022-03-11 00:16:39 +00:00