With yesterday's commit 9f2bc5077d "[gcn]
Work-around libgomp 'error: array subscript 0 is outside array bounds of
‘__lds struct gomp_thread * __lds[0]’ [-Werror=array-bounds]' [PR101484]",
I did defuse the "unexpected" '-Werror=array-bounds' diagnostics that we see
as of commit a110855667 "Correct handling of
variable offset minus constant in -Warray-bounds [PR100137]". However, these
'#pragma GCC diagnostic [...]' directives cause some code generation changes
(that seems unexpected, problematic!), which results in a lot (ten thousands)
of 'GCN team arena exhausted' run-time diagnostics, also leading to a few
FAILs:
PASS: libgomp.c/../libgomp.c-c++-common/for-11.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-11.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/for-12.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-12.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/for-3.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-3.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/for-5.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-5.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/for-6.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-6.c execution test
PASS: libgomp.c/../libgomp.c-c++-common/for-9.c (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-9.c execution test
Same for 'libgomp.c++'.
It remains to be analyzed how '#pragma GCC diagnostic [...]' directives can
cause code generation changes; for now I'm working around the "unexpected"
'-Werror=array-bounds' diagnostics differently.
Overall, still awaiting a different solution, of course.
libgomp/
PR target/101484
* configure.tgt [amdgcn*-*-*] (XCFLAGS): Add
'-Wno-error=array-bounds'.
* config/gcn/team.c: Remove '-Werror=array-bounds' work-around.
* libgomp.h [__AMDGCN__]: Likewise.
... seen as of commit a110855667 "Correct
handling of variable offset minus constant in -Warray-bounds [PR100137]".
Awaiting a different solution, of course.
libgomp/
PR target/101484
* config/gcn/team.c: Apply '-Werror=array-bounds' work-around.
* libgomp.h [__AMDGCN__]: Likewise.
sem.h is included in between # pragma GCC visibility push(hidden)
and # pragma GCC visibility pop and includes limits.h there, which
since the introduction of sysconf declaration in recent glibcs
in there causes trouble. libgomp assumes it is compiled by gcc,
so we don't really need to include limits.h there and can use
-__INT_MAX__ - 1 instead (which clang and icc support too for years).
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Florian Weimer <fweimer@redhat.com>
* config/linux/sem.h: Don't include limits.h.
(SEM_WAIT): Define to -__INT_MAX__ - 1 instead of INT_MIN.
* config/linux/affinity.c: Include limits.h.
As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values multiple of 2 into 0 rather than 1.
This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.
2021-07-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94366
gcc/
* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
use boolean_type_node instead of integer_type_node as NE_EXPR type.
(lower_reduction_clauses): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/pr94366.c: New test.
As -foffload={options,targets,targets=options} is very convoluted,
it has been split into -foffload=targets (supporting the old syntax
for backward compatibilty) and -foffload-options={options,target=options}.
Only the new syntax is documented.
Additionally, -foffload=default is supported, which can reset the
devices after -foffload=disable / -foffload=targets to the default,
if needed.
gcc/ChangeLog:
PR other/67300
* common.opt (-foffload=): Update description.
(-foffload-options=): New.
* doc/invoke.texi (C Language Options): Document
-foffload and -foffload-options.
* gcc.c (check_offload_target_name): New, split off from
handle_foffload_option.
(check_foffload_target_names): New.
(handle_foffload_option): Handle -foffload=default.
(driver_handle_option): Update for -foffload-options.
* lto-opts.c (lto_write_options): Use -foffload-options
instead of -foffload.
* lto-wrapper.c (merge_and_complain, append_offload_options):
Likewise.
* opts.c (common_handle_option): Likewise.
libgomp/ChangeLog:
PR other/67300
* testsuite/libgomp.c-c++-common/reduction-16.c: Replace
-foffload=nvptx-none= by -foffload-options=nvptx-none= to
avoid disabling other offload targets.
* testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.
* testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.
* testsuite/libgomp.c/target-44.c: Likewise.
Disable some more parts of the test as firstprivate does not work yet
due to PR fortran/90742.
libgomp/
* testsuite/libgomp.fortran/defaultmap-8.f90 (bar): Determine whether
target has shared memory and disable some scalar pointer/allocatable
checks if not as firstprivate does not work.
The dg-shouldfail testcase libgomp.c-c++-common/struct-elem-5.c does not
properly fail for non-shared address space offloading. Adjust testcase
to limit testing only for "target offload_device_nonshared_as".
libgomp/ChangeLog:
PR testsuite/101114
* testsuite/libgomp.c-c++-common/struct-elem-5.c:
Add "target offload_device_nonshared_as" condition for enabling test.
This patch adds support for in_reduction clause on target construct, though
for now only for synchronous targets (without nowait clause).
The encountering thread in that case runs the target task and blocks until
the target region ends, so it is implemented by remapping it before entering
the target, initializing the private copy if not yet initialized for the
current thread and then using the remapped addresses for the mapping
addresses.
For nowait combined with in_reduction the patch contains a hack where the
nowait clause is ignored. To implement it correctly, I think we would need
to create a new private variable for the in_reduction and initialize it before
doing the async target and adjust the map addresses to that private variable
and then pass a function pointer to the library routine with code where the callback
would remap the address to the current threads private variable and use in_reduction
combiner to combine the private variable we've created into the thread's copy.
The library would then need to make sure that the routine is called in some thread
participating in the parallel (and not in an unshackeled thread).
2021-06-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): Document meaning for OpenMP.
* gimplify.c (gimplify_scan_omp_clauses): For OpenMP map clauses
with OMP_CLAUSE_MAP_IN_REDUCTION flag partially defer gimplification
of non-decl OMP_CLAUSE_DECL. For OMP_CLAUSE_IN_REDUCTION on
OMP_TARGET user outer_ctx instead of ctx for placeholders and
initializer/combiner gimplification.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_MAP_IN_REDUCTION
on target constructs.
(lower_rec_input_clauses): Likewise.
(lower_omp_target): Likewise.
* omp-expand.c (expand_omp_target): Temporarily ignore nowait clause
on target if in_reduction is present.
gcc/c-family/
* c-common.h (enum c_omp_region_type): Add C_ORT_TARGET and
C_ORT_OMP_TARGET.
* c-omp.c (c_omp_split_clauses): For OMP_CLAUSE_IN_REDUCTION on
combined target constructs also add map (always, tofrom:) clause.
gcc/c/
* c-parser.c (omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(c_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to c_finish_omp_clauses.
* c-typeck.c (handle_omp_array_sections): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(c_finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
gcc/cp/
* parser.c (cp_omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
C_ORT_OMP for clauses on target construct.
(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
(cp_parser_omp_target): For non-combined target add
map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass
C_ORT_OMP_TARGET to finish_omp_clauses.
* semantics.c (handle_omp_array_sections_1): Adjust ort handling
for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
never present on C_ORT_*DECLARE_SIMD.
(handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION
on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
corresponding map clauses.
* pt.c (tsubst_expr): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for
clauses on target construct.
gcc/testsuite/
* c-c++-common/gomp/target-in-reduction-1.c: New test.
* c-c++-common/gomp/clauses-1.c: Add in_reduction clauses on
target or combined target constructs.
libgomp/
* testsuite/libgomp.c-c++-common/target-in-reduction-1.c: New test.
* testsuite/libgomp.c-c++-common/target-in-reduction-2.c: New test.
* testsuite/libgomp.c++/target-in-reduction-1.C: New test.
* testsuite/libgomp.c++/target-in-reduction-2.C: New test.
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.
2021-06-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101167
* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.
* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
This patch implement OpenMP 5.0 requirements of incrementing/decrementing
the reference count of a mapped structure at most once (across all elements)
on a construct.
This is implemented by pulling in libgomp/hashtab.h and using htab_t as a
pointer set. Structure element list siblings also have pointers-to-refcounts
linked together, to naturally achieve uniform increment/decrement without
repeating.
There are still some questions on whether using such a htab_t based set is
faster/slower than using a sorted pointer array based implementation. This
is to be researched on later.
libgomp/ChangeLog:
* hashtab.h (htab_clear): New function with initialization code
factored out from...
(htab_create): ...here, adjust to use htab_clear function.
* libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of
special refcount values, add comments.
(REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL.
(REFCOUNT_LINK): Likewise.
(REFCOUNT_STRUCTELEM): New special refcount range for structure
element siblings.
(REFCOUNT_STRUCTELEM_P): Macro for testing for structure element
sibling maps.
(REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling.
(REFCOUNT_STRUCTELEM_FLAG_LAST): Flag to indicate last sibling.
(REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag.
(REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag.
(struct splay_tree_key_s): Add structelem_refcount and
structelem_refcount_ptr fields into a union with dynamic_refcount.
Add comments.
(gomp_map_vars): Delete declaration.
(gomp_map_vars_async): Likewise.
(gomp_unmap_vars): Likewise.
(gomp_unmap_vars_async): Likewise.
(goacc_map_vars): New declaration.
(goacc_unmap_vars): Likewise.
* oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars.
(goacc_enter_datum): Likewise.
(goacc_enter_data_internal): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars
and goacc_unmap_vars.
(GOACC_data_start): Adjust to use goacc_map_vars.
(GOACC_data_end): Adjust to use goacc_unmap_vars.
* target.c (hash_entry_type): New typedef.
(htab_alloc): New function hook for hashtab.h.
(htab_free): Likewise.
(htab_hash): Likewise.
(htab_eq): Likewise.
(hashtab.h): Add file include.
(gomp_increment_refcount): New function.
(gomp_decrement_refcount): Likewise.
(gomp_map_vars_existing): Add refcount_set parameter, adjust to use
gomp_increment_refcount.
(gomp_map_fields_existing): Add refcount_set parameter, adjust calls
to gomp_map_vars_existing.
(gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p
variable to guard OpenMP specific paths, adjust calls to
gomp_map_vars_existing, add structure element sibling splay_tree_key
sequence creation code, adjust Fortran map case to avoid increment
under OpenMP.
(gomp_map_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_map_vars_internal.
(gomp_map_vars_async): Adjust and rename into...
(goacc_map_vars): ...this new function, adjust call to
gomp_map_vars_internal.
(gomp_remove_splay_tree_key): New function with code factored out from
gomp_remove_var_internal.
(gomp_remove_var_internal): Add code to handle removing multiple
splay_tree_key sequence for structure elements, adjust code to use
gomp_remove_splay_tree_key for splay-tree key removal.
(gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use
gomp_decrement_refcount.
(gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_unmap_vars_internal.
(gomp_unmap_vars_async): Adjust and rename into...
(goacc_unmap_vars): ...this new function, adjust call to
gomp_unmap_vars_internal.
(GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and
gomp_unmap_vars.
(GOMP_target_ext): Likewise.
(gomp_target_data_fallback): Adjust call to gomp_map_vars.
(GOMP_target_data): Likewise.
(GOMP_target_data_ext): Likewise.
(GOMP_target_end_data): Adjust call to gomp_unmap_vars.
(gomp_exit_data): Add refcount_set parameter, adjust to use
gomp_decrement_refcount, adjust to queue splay-tree keys for removal
after main loop.
(GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to
gomp_map_vars and gomp_exit_data.
(gomp_target_task_fn): Likewise.
* testsuite/libgomp.c-c++-common/refcount-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.
Move the OpenACC enter and exit data directives from using a single builtin to
having one each. For most purposes it was easy to tell which was which, from
the clauses given, but it's overhead we can easily avoid, and there may be
future uses where that isn't possible.
gcc/
* omp-builtins.def (BUILT_IN_GOACC_ENTER_EXIT_DATA): Split into...
(BUILT_IN_GOACC_ENTER_DATA, BUILT_IN_GOACC_EXIT_DATA): ... these.
* gimple.h (enum gf_mask): Split
'GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA' into
'GF_OMP_TARGET_KIND_OACC_ENTER_DATA' and
'GF_OMP_TARGET_KIND_OACC_EXIT_DATA'.
(is_gimple_omp_oacc): Update.
* gimple-pretty-print.c (dump_gimple_omp_target): Likewise.
* gimplify.c (gimplify_omp_target_update): Likewise.
* omp-expand.c (expand_omp_target, build_omp_regions_1)
(omp_make_gimple_edges): Likewise.
* omp-low.c (check_omp_nesting_restrictions, lower_omp_target):
Likewise.
gcc/testsuite/
* c-c++-common/goacc-gomp/nesting-fail-1.c: Adjust patterns.
* c-c++-common/goacc/finalize-1.c: Likewise.
* c-c++-common/goacc/mdc-1.c: Likewise.
* c-c++-common/goacc/nesting-fail-1.c: Likewise.
* c-c++-common/goacc/struct-enter-exit-data-1.c: Likewise.
* gfortran.dg/goacc/attach-descriptor.f90: Likewise.
* gfortran.dg/goacc/finalize-1.f: Likewise.
* gfortran.dg/goacc/mapping-tests-3.f90: Likewise.
libgomp/
* libgomp.map (GOACC_2.0.2): New symbol version.
* libgomp_g.h (GOACC_enter_data, GOACC_exit_data) New prototypes.
* oacc-mem.c (GOACC_enter_data, GOACC_exit_data) New functions.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* oacc-mem.c (goacc_enter_exit_data_internal): New function,
extracted from...
(GOACC_enter_exit_data): ... here.
(GOACC_declare): Use it.
Co-Authored-By: Andrew Stubbs <ams@codesourcery.com>
This deals with data management, after all.
Small fix-up for r230275 (commit 6e232ba424)
"[OpenACC] declare directive".
libgomp/
* oacc-parallel.c (GOACC_declare): Move...
* oacc-mem.c: ... here.
* libgomp_g.h: Adjust.
Given that we 'continue' for 'GOMP_MAP_POINTER', we cannot possibly encounter
it afterwards.
Small fix-up for r230275 (commit 6e232ba424)
"[OpenACC] declare directive".
libgomp/
* oacc-parallel.c (GOACC_declare): Clean up 'GOMP_MAP_POINTER'
handling.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
The dsdotr and dsdoti variables uninitialized and the testcase fails e.g.
on i686-linux. Fixed by zero initialization.
2021-06-10 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/100981
* testsuite/libgomp.fortran/pr100981-2.f90 (cdcdot): Initialize
dsdotr and dsdoti to 0.
Don't add -march=i486 if atomic compare-and-swap is supported on 'int'.
This fixes libgomp tests with "-march=x86-64 -m32 -fcf-protection".
* testsuite/lib/libgomp.exp (libgomp_init): Don't add -march=i486
if atomic compare-and-swap is supported on 'int'.
The following fixes the SLP FMA patterns to preserve reduction
info and the reduction vectorization to consider internal function
call defs for the reduction stmt.
2021-06-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/100981
gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
gimple_get_lhs to also handle calls.
* tree-vect-slp-patterns.c (complex_pattern::build): Transfer
reduction info.
gcc/testsuite/
* gfortran.dg/vect/pr100981-1.f90: New testcase.
libgomp/
* testsuite/libgomp.fortran/pr100981-2.f90: New testcase.
... which currently has *not* been forced to 'num_workers (1)'.
In addition to the testcases modified here, this also fixes:
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/mode-transitions.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test
[Etc.]
mode-transitions.exe: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:702: t17: Assertion `arr_b[i] == (i ^ 31) * 8' failed.
libgomp/
* plugin/plugin-gcn.c (gcn_exec): Force 'num_workers (1)'
unconditionally.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
Update.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.
..., by simplifying 'libgomp.oacc-c-c++-common/parallel-dims.c', and updating
the former correspondingly. '__builtin_goacc_parlevel_id' does the right thing
for all 'acc_device_*'.
Follow-up to commit 09e0ad6253 "Update OpenACC
tests for amdgcn".
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Simplify.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: Update.
... on top of r279378 (commit 26b74ed022)
"Update OpenACC tests for amdgcn".
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Fix
for 'acc_device_radeon'.
That is, re-enable it for host-fallback, and enable it for GCN offloading.
Fix-up for r279378 (commit 26b74ed022)
"Update OpenACC tests for amdgcn".
libgomp/
* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Don't
require 'openacc_nvidia_accel_selected'. Fix up for
'ACC_DEVICE_TYPE_radeon'.
The GCN support that got added in r278935 (commit
83caa34e2a) "Enable OpenACC GCN testing" was
forked before my r269107 (commit ee332b4a9a)
"[libgomp] Clarify difference between offload target, offload plugin, and
OpenACC device type", and didn't later pick up these changes.
No functional change.
libgomp/
* testsuite/lib/libgomp.exp
(check_effective_target_openacc_radeon_accel_selected):
Streamline.
This problem has been fixed long ago, in r267934 (commit
d41d952c9b) "[nvptx] Handle assignment to
gang-level reduction variable".
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Revert
PR80547 workaround.
Small fix-up for r267889 (commit 2b9d9e3937)
"[nvptx] Enable large vectors":
> * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Expect vector
> length 2097152 to be reduced to 1024 instead of 32.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
<acc_device_nvidia>: Update comment.
When gcc is configured for nvptx offloading with --without-cuda-driver
and full CUDA isn't installed, many libgomp.oacc-*/* tests fail,
some of them because cuda.h header can't be found, others because
the tests can't be linked against -lcuda, -lcudart or -lcublas.
I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms
installed, so libcuda.so.1 can be dlopened and the offloading works,
but linking against those libraries isn't possible nor are the
headers around (for the plugin itself there is the fallback
libgomp/plugin/cuda/cuda.h).
The following patch adds 3 new effective targets and uses them in tests that
needs those.
2021-05-27 Jakub Jelinek <jakub@redhat.com>
* testsuite/lib/libgomp.exp (check_effective_target_openacc_cuda,
check_effective_target_openacc_cublas,
check_effective_target_openacc_cudart): New.
* testsuite/libgomp.oacc-fortran/host_data-4.f90: Require effective
target openacc_cublas.
* testsuite/libgomp.oacc-fortran/host_data-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/host_data-3.f: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-91.c: Require effective
target openacc_cuda.
* testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr87835.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/host_data-1.c: Require effective
targets openacc_cublas and openacc_cudart.
* testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-nvptx.c:
Require effective target openacc_cudart.
* testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Add -DUSE_CUDA_H
for effective target openacc_cuda and add && defined USE_CUDA_H to
preprocessor conditionals. Guard -lcuda also on openacc_cuda
effective target.