gcc/libgomp/icv-device.c

121 lines
2.7 KiB
C
Raw Permalink Normal View History

2024-01-03 19:19:35 +08:00
/* Copyright (C) 2005-2024 Free Software Foundation, Inc.
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
Contributed by Richard Henderson <rth@redhat.com>.
This file is part of the GNU Offloading and Multi Processing Library
(libgomp).
Libgomp is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
Under Section 7 of GPL version 3, you are granted additional
permissions described in the GCC Runtime Library Exception, version
3.1, as published by the Free Software Foundation.
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
/* This file defines OpenMP API entry points that accelerator targets are
expected to replace. */
#include "libgomp.h"
OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in OpenMP 5.2 it was clarified/extended by having implications on the default-device-var; additionally, omp_initial_device and omp_invalid_device enum values/PARAMETERs were added; support for it was added in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and non-conforming device numbers. Only the mandatory handling was missing. Namely, while the default-device-var is usually initialized to value 0, with 'mandatory' it must have the value 'omp_invalid_device' if and only if zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var overrides this as it comes semantically after the initialization.) To achieve this, default-device-var is now initialized to MIN_INT. If there is no 'mandatory', it is set to 0 directly after env var parsing. Otherwise, it is updated in gomp_target_init to either 0 or omp_invalid_device. To ensure INT_MIN is never seen by the user, both the omp_get_default_device API routine and omp_display_env (user call and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case. libgomp/ChangeLog: * env.c (gomp_default_icv_values): Init default_device_var to an nonconforming value - INT_MIN. (initialize_env): After env-var parsing, set default_device_var to device 0 unless OMP_TARGET_OFFLOAD=mandatory. (omp_display_env): If default_device_var is INT_MIN, call gomp_init_targets_once. * icv-device.c (omp_get_default_device): Likewise. * libgomp.texi (OMP_DEFAULT_DEVICE): Update init description. (OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'. * target.c (resolve_device): Improve error message device-num < 0 with 'mandatory' and no no-host devices available. (gomp_target_init): Set default-device-var if INT_MIN. * testsuite/libgomp.c/target-48.c: New test. * testsuite/libgomp.c/target-49.c: New test. * testsuite/libgomp.c/target-50.c: New test. * testsuite/libgomp.c/target-50a.c: New test. * testsuite/libgomp.c/target-51.c: New test. * testsuite/libgomp.c/target-52.c: New test. * testsuite/libgomp.c/target-53.c: New test. * testsuite/libgomp.c/target-54.c: New test.
2023-06-14 13:53:02 +08:00
#include <limits.h>
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
void
omp_set_default_device (int device_num)
{
struct gomp_task_icv *icv = gomp_icv (true);
openmp: Conforming device numbers and omp_{initial,invalid}_device OpenMP 5.2 changed once more what device numbers are allowed. In 5.1, valid device numbers were [0, omp_get_num_devices()]. 5.2 makes also -1 valid (calls it omp_initial_device), which is equivalent in behavior to omp_get_num_devices() number but has the advantage that it is a constant. And it also introduces omp_invalid_device which is also a constant with implementation defined value < -1. That value should act like sNaN, any time any device construct (GOMP_target*) or OpenMP runtime API routine is asked for such a device, the program is terminated. And if OMP_TARGET_OFFLOAD=mandatory, all non-conforming device numbers (which is all but [-1, omp_get_num_devices()] other than omp_invalid_device) must be treated like omp_invalid_device. For device constructs, we have a compatibility problem, we've historically used 2 magic negative values to mean something special. GOMP_DEVICE_ICV (-1) means device clause wasn't present, pick the omp_get_default_device () number GOMP_DEVICE_FALLBACK (-2) means the host device (this is used e.g. for #pragma omp target if (cond) where if cond is false, we pass -2 But 5.2 requires that omp_initial_device is -1 (there were discussions about it, advantage of -1 is that one can say iterate over the [-1, omp_get_num_devices()-1] range to get all devices starting with the host/initial one. And also, if user passes -2, unless it is omp_invalid_device, we need to treat it like non-conforming with OMP_TARGET_OFFLOAD=mandatory. So, the patch does on the compiler side some number remapping, user_device_num >= -2U ? user_device_num - 1 : user_device_num. This remapping is done at compile time if device clause has constant argument, otherwise at runtime, and means that for user -1 (omp_initial_device) we pass -2 to GOMP_* in the runtime library where it treats it like host fallback, while -2 is remapped to -3 (one of the non-conforming device numbers, for those it doesn't matter which one is which). omp_invalid_device is then -4. For the OpenMP device runtime APIs, no remapping is done. This patch doesn't deal with the initial default-device-var for OMP_TARGET_OFFLOAD=mandatory , the spec says that the inital ICV value for that should in that case depend on whether there are any offloading devices or not (if not, should be omp_invalid_device), but that means we can't determine the number of devices lazily (and let libraries have the possibility to register their offloading data etc.). 2022-06-13 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-expand.cc (expand_omp_target): Remap user provided device clause arguments, -1 to -2 and -2 to -3, either at compile time if constant, or at runtime. include/ * gomp-constants.h (GOMP_DEVICE_INVALID): Define. libgomp/ * omp.h.in (omp_initial_device, omp_invalid_device): New enumerators. * omp_lib.f90.in (omp_initial_device, omp_invalid_device): New parameters. * omp_lib.h.in (omp_initial_device, omp_invalid_device): Likewise. * target.c (resolve_device): Add remapped argument, handle GOMP_DEVICE_ICV only if remapped is true (and clear remapped), for negative values, treat GOMP_DEVICE_FALLBACK as fallback only if remapped, otherwise treat omp_initial_device that way. For omp_invalid_device, always emit gomp_fatal, even when OMP_TARGET_OFFLOAD isn't mandatory. (GOMP_target, GOMP_target_ext, GOMP_target_data, GOMP_target_data_ext, GOMP_target_update, GOMP_target_update_ext, GOMP_target_enter_exit_data): Pass true as remapped argument to resolve_device. (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy_check, omp_target_associate_ptr, omp_target_disassociate_ptr, omp_get_mapped_ptr, omp_target_is_accessible): Pass false as remapped argument to resolve_device. Treat omp_initial_device the same as gomp_get_num_devices (). Don't bypass resolve_device calls if device_num is negative. (omp_pause_resource): Treat omp_initial_device the same as gomp_get_num_devices (). Call resolve_device. * icv-device.c (omp_set_default_device): Always set to device_num even when it is negative. * libgomp.texi: Document that Conforming device numbers, omp_initial_device and omp_invalid_device is implemented. * testsuite/libgomp.c/target-41.c (main): Add test with omp_initial_device. * testsuite/libgomp.c/target-45.c: New test. * testsuite/libgomp.c/target-46.c: New test. * testsuite/libgomp.c/target-47.c: New test. * testsuite/libgomp.c-c++-common/target-is-accessible-1.c (main): Add test with omp_initial_device. Use -5 instead of -1 for negative value test. * testsuite/libgomp.fortran/target-is-accessible-1.f90 (main): Likewise. Reorder stop numbers.
2022-06-13 19:42:59 +08:00
icv->default_device_var = device_num;
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
}
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
ialias (omp_set_default_device)
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
int
omp_get_default_device (void)
{
struct gomp_task_icv *icv = gomp_icv (false);
OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in OpenMP 5.2 it was clarified/extended by having implications on the default-device-var; additionally, omp_initial_device and omp_invalid_device enum values/PARAMETERs were added; support for it was added in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and non-conforming device numbers. Only the mandatory handling was missing. Namely, while the default-device-var is usually initialized to value 0, with 'mandatory' it must have the value 'omp_invalid_device' if and only if zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var overrides this as it comes semantically after the initialization.) To achieve this, default-device-var is now initialized to MIN_INT. If there is no 'mandatory', it is set to 0 directly after env var parsing. Otherwise, it is updated in gomp_target_init to either 0 or omp_invalid_device. To ensure INT_MIN is never seen by the user, both the omp_get_default_device API routine and omp_display_env (user call and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case. libgomp/ChangeLog: * env.c (gomp_default_icv_values): Init default_device_var to an nonconforming value - INT_MIN. (initialize_env): After env-var parsing, set default_device_var to device 0 unless OMP_TARGET_OFFLOAD=mandatory. (omp_display_env): If default_device_var is INT_MIN, call gomp_init_targets_once. * icv-device.c (omp_get_default_device): Likewise. * libgomp.texi (OMP_DEFAULT_DEVICE): Update init description. (OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'. * target.c (resolve_device): Improve error message device-num < 0 with 'mandatory' and no no-host devices available. (gomp_target_init): Set default-device-var if INT_MIN. * testsuite/libgomp.c/target-48.c: New test. * testsuite/libgomp.c/target-49.c: New test. * testsuite/libgomp.c/target-50.c: New test. * testsuite/libgomp.c/target-50a.c: New test. * testsuite/libgomp.c/target-51.c: New test. * testsuite/libgomp.c/target-52.c: New test. * testsuite/libgomp.c/target-53.c: New test. * testsuite/libgomp.c/target-54.c: New test.
2023-06-14 13:53:02 +08:00
if (icv->default_device_var == INT_MIN)
/* This implies OMP_TARGET_OFFLOAD=mandatory. */
gomp_init_targets_once ();
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
return icv->default_device_var;
}
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
ialias (omp_get_default_device)
openmp: Change omp_get_initial_device () to match OpenMP 5.1 requirements > Therefore, I think until omp_get_initial_device () value is changed, we The following so far untested patch implements that change. OpenMP 4.5 said for omp_get_initial_device: The value of the device number is implementation defined. If it is between 0 and one less than omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is outside that range, then it is only valid for use with the device memory routines and not in the device clause. and OpenMP 5.0 similarly, but OpenMP 5.1 says: The value of the device number is the value returned by the omp_get_num_devices routine. As the new value is compatible with what has been required earlier, I think we can change it already now. 2020-10-22 Jakub Jelinek <jakub@redhat.com> * icv.c (omp_get_initial_device): Remove including corresponding ialias. * icv-device.c (omp_get_initial_device): New function. Return gomp_get_num_devices (). Add ialias. * target.c (resolve_device): Don't fail with OMP_TARGET_OFFLOAD=mandatory if device_id is equal to gomp_get_num_devices (). (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr, omp_pause_resource): Use gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the first use in the functions, in uses dominated by the gomp_get_num_devices call use num_devices_openmp instead. * libgomp.texi (omp_get_initial_device): Document. * config/gcn/icv-device.c (omp_get_initial_device): New function. Add ialias. * config/nvptx/icv-device.c (omp_get_initial_device): Likewise. * testsuite/libgomp.c/target-40.c: New test.
2020-10-22 15:31:01 +08:00
int
omp_get_initial_device (void)
{
return gomp_get_num_devices ();
}
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
ialias (omp_get_initial_device)
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
int
omp_get_num_devices (void)
{
return gomp_get_num_devices ();
}
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
ialias (omp_get_num_devices)
OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. * configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789
2016-11-24 02:36:41 +08:00
int
omp_is_initial_device (void)
{
/* Hardcoded to 1 on host, should be 0 on MIC, HSAIL, PTX. */
return 1;
}
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
ialias (omp_is_initial_device)
openmp: Implement omp_get_device_num routine This patch implements the omp_get_device_num library routine, specified in OpenMP 5.0. GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number" variable, is defined on the device-side libgomp, has it's address returned to host-side libgomp during device initialization, and the host libgomp then sets its value to the designated device number. libgomp/ChangeLog: * icv-device.c (omp_get_device_num): New API function, host side. * fortran.c (omp_get_device_num_): New interface function. * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol. * libgomp.map (OMP_5.0.2): New version space with omp_get_device_num, omp_get_device_num_. * libgomp.texi (omp_get_device_num): Add documentation for new API function. * omp.h.in (omp_get_device_num): Add declaration. * omp_lib.f90.in (omp_get_device_num): Likewise. * omp_lib.h.in (omp_get_device_num): Likewise. * target.c (gomp_load_image_to_device): If additional entry for device number exists at end of returned entries from 'load_image_func' hook, copy the assigned device number over to the device variable. * config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-gcn.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-nvptx.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * testsuite/lib/libgomp.exp (check_effective_target_offload_target_intelmic): New function for testing for intelmic offloading. * testsuite/libgomp.c-c++-common/target-45.c: New test. * testsuite/libgomp.fortran/target10.f90: New test.
2021-08-05 23:29:03 +08:00
int
omp_get_device_num (void)
{
/* By specification, this is equivalent to omp_get_initial_device
on the host. */
openmp: Avoid PLT relocations for omp_* symbols in libgomp This patch avoids the following relocations: readelf -Wr libgomp.so.1.0.0 | grep omp_ 00000000000470e0 0000020700000007 R_X86_64_JUMP_SLOT 000000000001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0 0000000000047170 000000b800000007 R_X86_64_JUMP_SLOT 000000000000e760 omp_display_env@@OMP_5.1 + 0 00000000000471e0 000000e800000007 R_X86_64_JUMP_SLOT 000000000000f910 omp_get_initial_device@@OMP_4.5 + 0 0000000000047280 0000019500000007 R_X86_64_JUMP_SLOT 0000000000015940 omp_get_active_level@@OMP_3.0 + 0 00000000000472c8 0000020d00000007 R_X86_64_JUMP_SLOT 0000000000035210 omp_get_team_num@@OMP_4.0 + 0 00000000000472f0 0000014700000007 R_X86_64_JUMP_SLOT 0000000000035200 omp_get_num_teams@@OMP_4.0 + 0 by using ialias{,_call,_redirect} macros as needed. We still have many acc_* PLT relocations, could somebody please fix those? readelf -Wr libgomp.so.1.0.0 | grep acc_ 0000000000046fb8 000001ed00000006 R_X86_64_GLOB_DAT 0000000000036350 acc_prof_unregister@@OACC_2.5.1 + 0 0000000000046fd8 000000a400000006 R_X86_64_GLOB_DAT 0000000000035f30 acc_prof_register@@OACC_2.5.1 + 0 0000000000046fe0 000001d100000006 R_X86_64_GLOB_DAT 0000000000035ee0 acc_prof_lookup@@OACC_2.5.1 + 0 0000000000047058 000001dd00000007 R_X86_64_JUMP_SLOT 0000000000031f40 acc_create_async@@OACC_2.5 + 0 0000000000047068 0000011500000007 R_X86_64_JUMP_SLOT 000000000002fc60 acc_get_property@@OACC_2.6 + 0 0000000000047070 000001fb00000007 R_X86_64_JUMP_SLOT 0000000000032ce0 acc_wait_all@@OACC_2.0 + 0 0000000000047080 0000006500000007 R_X86_64_JUMP_SLOT 000000000002f990 acc_on_device@@OACC_2.0 + 0 0000000000047088 000000ae00000007 R_X86_64_JUMP_SLOT 0000000000032140 acc_attach_async@@OACC_2.6 + 0 0000000000047090 0000021900000007 R_X86_64_JUMP_SLOT 000000000002f550 acc_get_device_type@@OACC_2.0 + 0 0000000000047098 000001cb00000007 R_X86_64_JUMP_SLOT 0000000000032090 acc_copyout_finalize@@OACC_2.5 + 0 00000000000470a8 0000005200000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_copyin@@OACC_2.0 + 0 00000000000470b8 000001ad00000007 R_X86_64_JUMP_SLOT 0000000000032030 acc_delete_finalize@@OACC_2.5 + 0 00000000000470e8 0000010900000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_create@@OACC_2.0 + 0 00000000000470f8 0000005900000007 R_X86_64_JUMP_SLOT 0000000000032b70 acc_wait_async@@OACC_2.0 + 0 0000000000047110 0000013100000007 R_X86_64_JUMP_SLOT 0000000000032860 acc_async_test@@OACC_2.0 + 0 0000000000047118 000001ff00000007 R_X86_64_JUMP_SLOT 000000000002f720 acc_get_device_num@@OACC_2.0 + 0 0000000000047128 0000019100000007 R_X86_64_JUMP_SLOT 0000000000032020 acc_delete_async@@OACC_2.5 + 0 0000000000047130 000001d200000007 R_X86_64_JUMP_SLOT 000000000002efa0 acc_shutdown@@OACC_2.0 + 0 0000000000047150 000000d000000007 R_X86_64_JUMP_SLOT 0000000000031f00 acc_present_or_create@@OACC_2.0 + 0 0000000000047188 0000019200000007 R_X86_64_JUMP_SLOT 0000000000031910 acc_is_present@@OACC_2.0 + 0 0000000000047190 000001aa00000007 R_X86_64_JUMP_SLOT 000000000002fca0 acc_get_property_string@@OACC_2.6 + 0 00000000000471d0 000001bf00000007 R_X86_64_JUMP_SLOT 0000000000032120 acc_update_self_async@@OACC_2.5 + 0 0000000000047200 0000020500000007 R_X86_64_JUMP_SLOT 0000000000032e00 acc_wait_all_async@@OACC_2.0 + 0 0000000000047208 000000a600000007 R_X86_64_JUMP_SLOT 0000000000031790 acc_deviceptr@@OACC_2.0 + 0 0000000000047218 0000007500000007 R_X86_64_JUMP_SLOT 0000000000032000 acc_delete@@OACC_2.0 + 0 0000000000047238 000001e900000007 R_X86_64_JUMP_SLOT 000000000002f3a0 acc_set_device_type@@OACC_2.0 + 0 0000000000047240 000001f600000007 R_X86_64_JUMP_SLOT 000000000002ef20 acc_init@@OACC_2.0 + 0 0000000000047248 0000018800000007 R_X86_64_JUMP_SLOT 0000000000032060 acc_copyout@@OACC_2.0 + 0 0000000000047258 0000021f00000007 R_X86_64_JUMP_SLOT 0000000000032a80 acc_wait@@OACC_2.0 + 0 0000000000047270 000001bc00000007 R_X86_64_JUMP_SLOT 0000000000032100 acc_update_self@@OACC_2.0 + 0 0000000000047288 0000011400000007 R_X86_64_JUMP_SLOT 0000000000032080 acc_copyout_async@@OACC_2.5 + 0 0000000000047290 0000013d00000007 R_X86_64_JUMP_SLOT 000000000002f850 acc_set_device_num@@OACC_2.0 + 0 00000000000472a8 000000c500000007 R_X86_64_JUMP_SLOT 00000000000320e0 acc_update_device_async@@OACC_2.5 + 0 00000000000472c0 0000014600000007 R_X86_64_JUMP_SLOT 0000000000031fc0 acc_copyin_async@@OACC_2.5 + 0 00000000000472f8 0000006a00000007 R_X86_64_JUMP_SLOT 000000000002f310 acc_get_num_devices@@OACC_2.0 + 0 0000000000047350 0000021700000007 R_X86_64_JUMP_SLOT 0000000000031f80 acc_present_or_copyin@@OACC_2.0 + 0 0000000000047360 0000020900000007 R_X86_64_JUMP_SLOT 00000000000320c0 acc_update_device@@OACC_2.0 + 0 0000000000047380 0000008400000007 R_X86_64_JUMP_SLOT 0000000000032950 acc_async_test_all@@OACC_2.0 + 0 2021-10-01 Jakub Jelinek <jakub@redhat.com> * affinity-fmt.c (omp_get_team_num, omp_get_num_teams): Add ialias_redirect. * env.c (handle_omp_display_env): Use ialias_call. * icv-device.c: Move ialias right below each function. (omp_get_device_num): Use ialias_call. * fortran.c (omp_fulfill_event): Add ialias_redirect. * icv.c (omp_get_active_level): Add ialias_redirect.
2021-10-01 16:42:07 +08:00
return ialias_call (omp_get_initial_device) ();
openmp: Implement omp_get_device_num routine This patch implements the omp_get_device_num library routine, specified in OpenMP 5.0. GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number" variable, is defined on the device-side libgomp, has it's address returned to host-side libgomp during device initialization, and the host libgomp then sets its value to the designated device number. libgomp/ChangeLog: * icv-device.c (omp_get_device_num): New API function, host side. * fortran.c (omp_get_device_num_): New interface function. * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol. * libgomp.map (OMP_5.0.2): New version space with omp_get_device_num, omp_get_device_num_. * libgomp.texi (omp_get_device_num): Add documentation for new API function. * omp.h.in (omp_get_device_num): Add declaration. * omp_lib.f90.in (omp_get_device_num): Likewise. * omp_lib.h.in (omp_get_device_num): Likewise. * target.c (gomp_load_image_to_device): If additional entry for device number exists at end of returned entries from 'load_image_func' hook, copy the assigned device number over to the device variable. * config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-gcn.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global. (omp_get_device_num): New API function, device side. * plugin/plugin-nvptx.c ("symcat.h"): Add include. (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR at end of returned 'target_table' entries. * testsuite/lib/libgomp.exp (check_effective_target_offload_target_intelmic): New function for testing for intelmic offloading. * testsuite/libgomp.c-c++-common/target-45.c: New test. * testsuite/libgomp.fortran/target10.f90: New test.
2021-08-05 23:29:03 +08:00
}
ialias (omp_get_device_num)
OpenMP, libgomp: Environment variable syntax extension This patch considers the environment variable syntax extension for device-specific variants of environment variables from OpenMP 5.1 (see OpenMP 5.1 specification, p. 75 and p. 639). An environment variable (e.g. OMP_NUM_TEAMS) can have different suffixes: _DEV (e.g. OMP_NUM_TEAMS_DEV): affects all devices but not the host. _DEV_<device> (e.g. OMP_NUM_TEAMS_DEV_42): affects only device with number <device>. no suffix (e.g. OMP_NUM_TEAMS): affects only the host. In future OpenMP versions also suffix _ALL will be introduced (see discussion https://github.com/OpenMP/spec/issues/3179). This is also considered in this patch: _ALL (e.g. OMP_NUM_TEAMS_ALL): affects all devices and the host. The precedence is as follows (descending). For the host: 1. no suffix 2. _ALL For devices: 1. _DEV_<device> 2. _DEV 3. _ALL That means, _DEV_<device> is used whenever available. Otherwise _DEV is used if available, and at last _ALL. If there is no value for any of the variable variants, default values are used as already implemented before. This patch concerns parsing (a), storing (b), output (c) and transmission to the device (d): (a) The actual number of devices and the numbering are not known when parsing the environment variables. Thus all environment variables are iterated and searched for device-specific ones. (b) Only configured device-specific variables are stored. Thus, a linked list is used. (c) The output is done in omp_display_env (see specification p. 468f). Global ICVs are tagged with [all], see https://github.com/OpenMP/spec/issues/3179. ICVs which are not global but aren't handled device-specific yet are tagged with [host]. omp_display_env outputs the initial values of the ICVs. That is why a dedicated data structure is introduced for the inital values only (gomp_initial_icv_list). (d) Device-specific ICVs are transmitted to the device via GOMP_ADDITIONAL_ICVS. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_default_device): Return device- specific ICV. (omp_get_max_teams): Added for GCN devices. (omp_set_num_teams): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_default_device): Return device- specific ICV. (omp_get_max_teams): Added for NVPTX devices. (omp_set_num_teams): Likewise. (ialias): Likewise. * env.c (struct gomp_icv_list): New struct to store entries of initial ICV values. (struct gomp_offload_icv_list): New struct to store entries of device- specific ICV values that are copied to the device and back. (struct gomp_default_icv_values): New struct to store default values of ICVs according to the OpenMP standard. (parse_schedule): Generalized for different variants of OMP_SCHEDULE. (print_env_var_error): Function that prints an error for invalid values for ICVs. (parse_unsigned_long_1): Removed getenv. Generalized. (parse_unsigned_long): Likewise. (parse_int_1): Likewise. (parse_int): Likewise. (parse_int_secure): Likewise. (parse_unsigned_long_list): Likewise. (parse_target_offload): Likewise. (parse_bind_var): Likewise. (parse_stacksize): Likewise. (parse_boolean): Likewise. (parse_wait_policy): Likewise. (parse_allocator): Likewise. (omp_display_env): Extended to output different variants of environment variables. (print_schedule): New helper function for omp_display_env which prints the values of run_sched_var. (print_proc_bind): New helper function for omp_display_env which prints the values of proc_bind_var. (enum gomp_parse_type): Collection of types used for parsing environment variables. (ENTRY): Preprocess string lengths of environment variables. (OMP_VAR_CNT): Preprocess table size. (OMP_HOST_VAR_CNT): Likewise. (INT_MAX_STR_LEN): Constant for the maximal number of digits of a device number. (gomp_get_icv_flag): Returns if a flag for a particular ICV is set. (gomp_set_icv_flag): Sets a flag for a particular ICV. (print_device_specific_icvs): New helper function for omp_display_env to print device specific ICV values. (get_device_num): New helper function for parse_device_specific. Extracts the device number from an environment variable name. (get_icv_member_addr): Gets the memory address for a particular member of an ICV struct. (gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list. (initialize_icvs): New function to initialize a gomp_initial_icvs struct. (add_initial_icv_to_list): Adds an ICV struct to gomp_initial_icv_list. (startswith): Checks if a string starts with a given prefix. (initialize_env): Extended to parse the new syntax of environment variables. * icv-device.c (omp_get_max_teams): Added. (ialias): Likewise. (omp_set_num_teams): Likewise. * icv.c (omp_set_num_teams): Moved to icv-device.c. (omp_get_max_teams): Likewise. (ialias): Likewise. * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Removed. (GOMP_ADDITIONAL_ICVS): New target-side struct that holds the designated ICVs of the target device. * libgomp.h (enum gomp_icvs): Collection of ICVs. (enum gomp_device_num): Definition of device numbers for _ALL, _DEV, and no suffix. (enum gomp_env_suffix): Collection of possible suffixes of environment variables. (struct gomp_initial_icvs): Contains all ICVs for which we need to store initial values. (struct gomp_default_icv):New struct to hold ICVs for which we need to store initial values. (struct gomp_icv_list): Definition of a linked list that is used for storing ICVs for the devices and also for _DEV, _ALL, and without suffix. (struct gomp_offload_icvs): New struct to hold ICVs that are copied to a device. (struct gomp_offload_icv_list): Definition of a linked list that holds device-specific ICVs that are copied to devices. (gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list. (gomp_get_icv_flag): Returns if a flag for a particular ICV is set. * libgomp.texi: Updated. * plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Extended to read further ICVs from the offload image. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise. * target.c (gomp_get_offload_icv_item): Get a list item of gomp_offload_icv_list. (get_gomp_offload_icvs): New. Returns the ICV values depending on the device num and the variable hierarchy. (gomp_load_image_to_device): Extended to copy further ICVs to a device. * testsuite/libgomp.c-c++-common/icv-5.c: New test. * testsuite/libgomp.c-c++-common/icv-6.c: New test. * testsuite/libgomp.c-c++-common/icv-7.c: New test. * testsuite/libgomp.c-c++-common/icv-8.c: New test. * testsuite/libgomp.c-c++-common/omp-display-env-1.c: New test. * testsuite/libgomp.c-c++-common/omp-display-env-2.c: New test.
2022-09-09 01:01:33 +08:00
int
omp_get_max_teams (void)
{
return gomp_nteams_var;
}
ialias (omp_get_max_teams)
void
omp_set_num_teams (int num_teams)
{
if (num_teams >= 0)
gomp_nteams_var = num_teams;
}
ialias (omp_set_num_teams)
OpenMP: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices This patch adds support for omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of device-specific ICV values (specified as environment variables or changed on a device). In order to reuse device-specific ICV values, a copy back mechanism is implemented that copies ICV values back from device to the host. Additionally, a limitation of the number of teams on gcn offload devices is implemented. The number of teams is limited by twice the number of compute units (one team is executed on one compute unit). This avoids queueing unnessecary many teams and a corresponding allocation of large amounts of memory. Without that limitation the memory allocation for a large number of user-specified teams can result in an "memory access fault". A limitation of the number of teams is already also implemented for nvptx devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c). gcc/ChangeLog: * gimplify.cc (optimize_target_teams): Set initial num_teams_upper to "-2" instead of "1" for non-existing num_teams clause in order to disambiguate from the case of an existing num_teams clause with value 1. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to allow processing of device-specific values. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * icv-device.c (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. (omp_set_teams_thread_limit): Likewise. * icv.c (omp_set_teams_thread_limit): Removed. (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. * libgomp.texi: Updated documentation for nvptx and gcn corresponding to the limitation of the number of teams. * plugin/plugin-gcn.c (limit_teams): New helper function that limits the number of teams by twice the number of compute units. (parse_target_attributes): Limit the number of teams on gcn offload devices. * target.c (get_gomp_offload_icvs): Added teams_thread_limit_var handling. (gomp_load_image_to_device): Added a size check for the ICVs struct variable. (gomp_copy_back_icvs): New function that is used in GOMP_target_ext to copy back the ICV values from device to host. (GOMP_target_ext): Update the number of teams and threads in the kernel args also considering device-specific values. * testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading of OMP_TEAMS_THREAD_LIMIT from the environment. * testsuite/libgomp.c-c++-common/icv-5.c: Extended. * testsuite/libgomp.c-c++-common/icv-6.c: Extended. * testsuite/libgomp.c-c++-common/icv-7.c: Extended. * testsuite/libgomp.c-c++-common/icv-9.c: New test. * testsuite/libgomp.fortran/icv-5.f90: New test. * testsuite/libgomp.fortran/icv-6.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-teams-1.c: Adapt expected values for num_teams from "1" to "-2" in cases without num_teams clause. * g++.dg/gomp/target-teams-1.C: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
2022-12-06 21:42:46 +08:00
int
omp_get_teams_thread_limit (void)
{
return gomp_teams_thread_limit_var;
}
ialias (omp_get_teams_thread_limit)
void
omp_set_teams_thread_limit (int thread_limit)
{
if (thread_limit >= 0)
gomp_teams_thread_limit_var = thread_limit;
}
ialias (omp_set_teams_thread_limit)