Commit Graph

1077 Commits

Author SHA1 Message Date
Kevin Sala 5acee7dd47 [OpenMP][libomptarget] Add hasQueue() function in NextGen plugin's AsyncInfoWrapperTy
This patch prepares the PluginInterface for the new AMDGPU NextGen plugin.

Differential Revision: https://reviews.llvm.org/D139263
2022-12-04 13:24:40 +01:00
Kevin Sala cea616f847 [OpenMP][libomptarget] Simplify resource managers in NextGen plugins
This patch removes the classes GenericStreamManagerTy and GenericEventManagerTy
from the PluginInterface header.

Differential Revision: https://reviews.llvm.org/D138769
2022-12-03 22:28:34 +01:00
Kevin Sala 2cb83cd288 [OpenMP][libomptarget] Improve NextGen plugin interface for initialization
This patch modifies the PluginInterface to define functions for initializing
and deinitializing GenericPluginTy instances instead of using the constructor
and destructor. This way, we can return errors from these functions. Also, it
defines some functions that each plugin should implement for creating
plugin-specific objects.
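
As a rough illustration of why explicit init/deinit functions help, the sketch below moves fallible work out of the constructor and destructor so the caller can observe the error. SketchPluginTy, ErrorTy, and the error strings are hypothetical stand-ins, not the real GenericPluginTy interface.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical error type for this sketch: empty means success.
using ErrorTy = std::optional<std::string>;

// Hypothetical plugin class; it only mirrors the init/deinit idea.
class SketchPluginTy {
public:
  // The constructor does nothing that can fail.
  SketchPluginTy() = default;

  // Fallible setup lives here, so the caller receives the error.
  ErrorTy init(int NumDevices) {
    if (NumDevices < 0)
      return std::string("invalid number of devices");
    Devices = NumDevices;
    Initialized = true;
    return std::nullopt;
  }

  // Fallible teardown, mirroring init().
  ErrorTy deinit() {
    if (!Initialized)
      return std::string("plugin was not initialized");
    Initialized = false;
    Devices = 0;
    return std::nullopt;
  }

  bool isInitialized() const { return Initialized; }

private:
  bool Initialized = false;
  int Devices = 0;
};
```

A caller would check the value returned by init() before using the plugin, which is not possible when the same work happens inside a constructor.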

This patch prepares the PluginInterface for the new AMDGPU NextGen plugin.

Differential Revision: https://reviews.llvm.org/D138625
2022-12-03 22:25:15 +01:00
Kevin Sala 73a6cd23a4 [OpenMP][libomptarget] Add minor fixes to NextGen plugins
List of fixes:
  - omptarget_device_environment symbol is not mandatory in device images
  - Do not synchronize in ~AsyncInfoWrapperTy() if the async info's queue is null
  - GenericDeviceResourceRef's create() and destroy() require the device as parameter

Differential Revision: https://reviews.llvm.org/D138619
2022-12-03 22:10:31 +01:00
Kevin Sala 4fde81679c [OpenMP][libomptarget] Allow overriding function that gets ELF symbol info
The OpenMP target's NextGen plugins retrieve symbol information in the ELF image
(i.e., address and size) through the ELF section and ELF symbol objects. However,
the images of CUDA programs compute the address differently from the images of
AMDGPU programs:

  - Address for CUDA symbols: image begin + section's offset + symbol's st_value
  - Address for AMDGPU symbols: image begin + symbol's st_value
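
The two address computations above can be sketched as follows. The struct and function names are hypothetical simplifications for illustration; the real plugins work with the ELF section and ELF symbol objects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the ELF section and symbol fields.
struct SketchSection {
  std::uint64_t Offset; // section's offset within the image
};
struct SketchSymbol {
  std::uint64_t StValue; // symbol's st_value
  std::uint64_t StSize;  // symbol's size
};

// CUDA images: image begin + section's offset + symbol's st_value.
std::uintptr_t cudaSymbolAddress(std::uintptr_t ImageBegin,
                                 const SketchSection &Sec,
                                 const SketchSymbol &Sym) {
  return ImageBegin + Sec.Offset + Sym.StValue;
}

// AMDGPU images: image begin + symbol's st_value.
std::uintptr_t amdgpuSymbolAddress(std::uintptr_t ImageBegin,
                                   const SketchSymbol &Sym) {
  return ImageBegin + Sym.StValue;
}
```

Because only the address formula differs, making this one function overridable lets each plugin keep the rest of the symbol lookup shared.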

Differential Revision: https://reviews.llvm.org/D138604
2022-12-03 21:51:09 +01:00
Dhruva Chakrabarti 4763e877f7 Revert "[OpenMP] [OMPT] [3/8] Implemented callback registration in libomptarget"
This reverts commit 2b234ce3f0.
2022-12-01 22:01:54 -08:00
Dhruva Chakrabarti 2b234ce3f0 [OpenMP] [OMPT] [3/8] Implemented callback registration in libomptarget
The purpose of this patch is to have tool-provided callbacks registered
in libomptarget. The overall design document is in
https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc

Defined a class OmptDeviceCallbacksTy that will be used by libomptarget
and a plugin for callbacks registered by a tool. Once the callbacks are
registered in libomp, a lookup function is passed to libomptarget that is
used to retrieve the callbacks and register them in libomptarget.

Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)

Reviewed By: jplehr

Differential Revision: https://reviews.llvm.org/D123974
2022-12-01 16:06:26 -08:00
Ron Lieberman b09a5e5cb3 Revert "Add mean_anyway to hpc config"
My bad, wrong repo, so sorry.

This reverts commit 0b9350f3da.
2022-11-29 15:20:23 -06:00
Ron Lieberman 0b9350f3da Add mean_anyway to hpc config 2022-11-29 15:11:57 -06:00
Joseph Huber 3458a2b737 [Libomptarget][NFC] Add missing LLVM header 2022-11-29 09:46:51 -06:00
Ron Lieberman a1066569b8 [check-openmp] fix bug49334 bot fails - temporary 2022-11-28 19:10:43 -06:00
Shilei Tian fa06d4d3e2 [OpenMP][Test] Fixed the issue that lit complains test doesn't have run line 2022-11-28 18:13:55 -05:00
Shilei Tian 3523f94bfa [OpenMP][Test] Disable bug49334.cpp because of its flaky failure 2022-11-28 18:08:14 -05:00
Vitaly Buka 98441fc9e4 [NFC][OpenMP] Remove unused label 2022-11-17 23:35:08 -08:00
Vitaly Buka a35ad711d9 [NFC][OpenMP] Fix const cast warning 2022-11-17 23:24:40 -08:00
Vitaly Buka e42080ae3f [NFC][OpenMP] Remove extra ";" 2022-11-17 23:24:40 -08:00
Vitaly Buka 143050f552 [NFC][OpenMP] Fix warning about non-virtual dtor 2022-11-17 23:24:39 -08:00
Joseph Huber 0e7e426c0c [OMPT] Fix debug prefix not being defined
Summary:
This header file uses the `DP` prefixes but does not define
`DEBUG_PREFIX`. This patch adds a simple fix, but realistically the `DP`
system isn't ideal. Now that we have access to LLVM libraries and other
utilities we should consider rewriting all of the debugging and error
handling glue.
2022-11-16 07:53:16 -06:00
Kevin Sala 6bacbea826 [Libomptarget] Build plugins-nextgen/common/PluginInterface with protected visibility
Summary:
This commit sets the default visibility of PluginInterface's symbols (in
nextgen plugins) as protected. This prevents symbols from a plugin
library to be preempted by another plugin library's symbol. It applies
the same fix introduced by D136365.

Issue reported by @ggeorgakoudis.

Differential Revision: https://reviews.llvm.org/D138002
2022-11-16 07:11:38 -06:00
Dhruva Chakrabarti 5b67bce787 [OpenMP] [OMPT] [2/8] Implemented a connector for communication of OMPT callbacks between libraries.
This is part of a set of patches implementing OMPT target callback support and has been split out of the originally submitted https://reviews.llvm.org/D113728. The overall design can be found in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc

The purpose of this patch is to provide a way to register tool-provided callbacks into libomp when libomptarget is loaded.

Introduced a cmake variable LIBOMPTARGET_OMPT_SUPPORT that can be used to control OMPT target support. It follows host OMPT support, controlled by LIBOMP_HAVE_OMPT_SUPPORT.

Added a connector that can be used to communicate between OMPT implementations in libomp and libomptarget or libomptarget and a plugin.

Added a global constructor in libomptarget that uses the connector to force registration of tool-provided callbacks in libomp. A pair of init and fini functions are provided to libomp as part of the connect process which will be used to register the tool-provided callbacks in libomptarget.

Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)

Reviewed By: dreachem, jhuber6

Differential Revision: https://reviews.llvm.org/D123572
2022-11-15 14:21:55 -08:00
Jennifer Yu eace13928b Back out test that failed.
But I cannot reproduce the problem on my local machine. My local run prints:

222 0x5a6780
222 0x7fffbef9400e
222 0x5a677e 0x5a6780 0x7fffbef936c8
222 0x376f8e 0x376f90 0x7fffbef94008
222 0x281f20
222 0x7fffbef9400e
PASSED
2022-11-04 17:23:05 -07:00
Jennifer Yu de14befa77 Remove redundant loads.
The redundant loads were caused by regenerating the captured variable's value
when processing has_device_addr. That value has already been generated in
GenerateOpenMPCapturedVars and passed as Arg to generateInfoForCapture.
The fix uses Arg instead of the regenerated value, the same as for is_device_ptr.
2022-11-04 15:22:25 -07:00
Ethan Stewart 85c2d92b9b [openmp][AMDGPU] - Correct getNumberOfBlocks calculation.
This patch fixes the 6 amdgpu buildbot lit test failures
introduced by https://reviews.llvm.org/D135444.
      libomptarget :: amdgcn-amd-amdhsa :: mapping/reduction_implicit_map.cpp
      libomptarget :: amdgcn-amd-amdhsa :: offloading/cuda_no_devices.c
      libomptarget :: amdgcn-amd-amdhsa :: offloading/target-teams-atomic.c
      libomptarget :: amdgcn-amd-amdhsa-LTO :: mapping/reduction_implicit_map.cpp
      libomptarget :: amdgcn-amd-amdhsa-LTO :: offloading/cuda_no_devices.c
      libomptarget :: amdgcn-amd-amdhsa-LTO :: offloading/target-teams-atomic.c

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D137261
2022-11-02 11:38:56 -05:00
Kevin Sala Penadés 59a41809d8 [OpenMP][libomptarget] Fix AsyncInfoTy object in omp_target_memcpy
The AsyncInfoTy should be created in the same device as the async operation will be issued. In omp_target_memcpy, the AsyncInfoTy for the host to destination device transfer was created referring to the source device.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D137225
2022-11-02 12:03:49 -04:00
Johannes Doerfert d0f9ddde99 [OpenMP] Utilize the "non-uniform-workgroup" to simplify DeviceRTL
OpenMP offloading always uses uniform workgroups, see
https://reviews.llvm.org/D135374. The runtime doesn't need to handle
non-uniform workgroups at all either.

Differential Revision: https://reviews.llvm.org/D135444
2022-11-01 20:37:52 -07:00
Dhruva Chakrabarti 88e557cbc9 Revert "[OpenMP] [OMPT] [2/8] Implemented a connector for communication of OMPT callbacks between libraries."
This reverts commit f94c2679cb.
2022-11-01 08:59:58 -07:00
Dhruva Chakrabarti f94c2679cb [OpenMP] [OMPT] [2/8] Implemented a connector for communication of OMPT callbacks between libraries.
This is part of a set of patches implementing OMPT target callback support and has been split out of the originally submitted https://reviews.llvm.org/D113728. The overall design can be found in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc

The purpose of this patch is to provide a way to register tool-provided callbacks into libomp when libomptarget is loaded.

Introduced a cmake variable LIBOMPTARGET_OMPT_SUPPORT that can be used to control OMPT target support. It follows host OMPT support, controlled by LIBOMP_HAVE_OMPT_SUPPORT.

Added a connector that can be used to communicate between OMPT implementations in libomp and libomptarget or libomptarget and a plugin.

Added a global constructor in libomptarget that uses the connector to force registration of tool-provided callbacks in libomp. A pair of init and fini functions are provided to libomp as part of the connect process which will be used to register the tool-provided callbacks in libomptarget.

Depends on D123429

Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)

Reviewed By: dreachem

Differential Revision: https://reviews.llvm.org/D123572
2022-10-31 10:33:23 -07:00
Lechen Yu b923c15d3c [libomptarget] Fix a race condition in checkDeviceAndCtors
When multiple threads invoke checkDeviceAndCtors, both of them may read true
from the shared variable Device.HasPendingGlobals, and then invoke initLibrary
redundantly. Therefore only protecting the access to Device.HasPendingGlobals
is not sufficient to guarantee that initLibrary is invoked just once.

To fix this race condition, we move the invocation of initLibrary into the
critical section, and remove the same lock inside initLibrary.
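
A minimal sketch of the fixed pattern, assuming a per-device mutex: the check of the flag and the call to initLibrary happen under the same lock, so a second caller cannot see a stale true. SketchDeviceTy, PendingGlobalsMtx, and InitCount are hypothetical names; only HasPendingGlobals, initLibrary, and checkDeviceAndCtors come from the description above.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical device type used only to illustrate the locking pattern.
struct SketchDeviceTy {
  std::mutex PendingGlobalsMtx;
  bool HasPendingGlobals = true;
  int InitCount = 0; // counts how often initLibrary() actually ran

  // No longer takes the lock itself; the caller already holds it.
  void initLibrary() { ++InitCount; }

  void checkDeviceAndCtors() {
    // Check and initialization form one critical section.
    std::lock_guard<std::mutex> Lock(PendingGlobalsMtx);
    if (HasPendingGlobals) {
      initLibrary();
      HasPendingGlobals = false;
    }
  }
};
```

With the lock held only around the read of the flag, two threads could both observe true before either cleared it; holding it across initLibrary closes that window.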

Differential Revision: https://reviews.llvm.org/D136952
2022-10-31 15:38:22 +01:00
Shilei Tian 4b0c285ef2 [NFC][OpenMP] Fix compile warnings introduced by D134396 2022-10-28 11:22:43 -04:00
Ye Luo 0911e57f1d [DeviceRTL] Fix incremental build
Need both add_custom_command to resolve file-level dependency and add_custom_target to resolve target-level dependency.
From CMake add_custom_command doc:
Do not list the output in more than one independent target that may build in parallel or the two instances of the rule may conflict (instead use the add_custom_target() command to drive the command and make the other targets depend on that one).

${CMAKE_CURRENT_BINARY_DIR}/${bclib_name} is used by multiple targets and thus requires a custom target to avoid racing.

Differential Revision: https://reviews.llvm.org/D136911
2022-10-27 22:20:17 -05:00
Kevin Sala 846904195b [OpenMP][libomptarget] New plugin infrastructure and new CUDA plugin
This patch adds a new infrastructure for OpenMP target plugins. It also implements the CUDA and GenericELF64bit plugins under this new infrastructure. We place the sources in a separate directory named plugins-nextgen, and we build the new plugins as different plugin libraries. The original plugins, which remain untouched, will be used by default. However, the user can change this behavior at run-time through the boolean envar LIBOMPTARGET_NEXTGEN_PLUGINS. If enabled, the libomptarget will try to load the NextGen version of each plugin, falling back to the original if they are not present or valid.

The idea of this new plugin infrastructure is to implement the common parts of target plugins in generic classes (defined in files inside the plugins-nextgen/common/PluginInterface folder); each specific plugin then defines its own classes inheriting from the common ones. In this way, most logic remains in the common interface while the plugin-specific source code shrinks. It also means that most code and behavior are now the same across the different plugins. As an example, we define classes for a plugin, a device, a device image, a stream manager, etc. The plugin object (a single instance per plugin library) holds different device objects (i.e., one per available device), while the latter are responsible for managing their own resources.
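
The layering described above might look roughly like the sketch below: generic classes carry the shared logic and a plugin overrides only a few hooks. All class and method names here are hypothetical and much simpler than the real PluginInterface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Generic device: shared logic lives here (hypothetical sketch).
class SketchGenericDeviceTy {
public:
  explicit SketchGenericDeviceTy(int Id) : Id(Id) {}
  virtual ~SketchGenericDeviceTy() = default;

  // Common behavior, identical for every plugin.
  std::string describe() const { return name() + ":" + std::to_string(Id); }

protected:
  // Plugin-specific hook.
  virtual std::string name() const = 0;

private:
  int Id;
};

// Generic plugin: a single instance holds one device object per device.
class SketchGenericPluginTy {
public:
  virtual ~SketchGenericPluginTy() = default;

  void initDevices(int NumDevices) {
    for (int I = 0; I < NumDevices; ++I)
      Devices.push_back(createDevice(I));
  }

  std::size_t numDevices() const { return Devices.size(); }
  const SketchGenericDeviceTy &getDevice(std::size_t I) const {
    return *Devices.at(I);
  }

protected:
  // Each plugin creates its own device type.
  virtual std::unique_ptr<SketchGenericDeviceTy> createDevice(int Id) = 0;

private:
  std::vector<std::unique_ptr<SketchGenericDeviceTy>> Devices;
};

// A plugin-specific pair of classes (hypothetical "cuda" flavor).
class SketchCudaDeviceTy : public SketchGenericDeviceTy {
public:
  using SketchGenericDeviceTy::SketchGenericDeviceTy;

protected:
  std::string name() const override { return "cuda"; }
};

class SketchCudaPluginTy : public SketchGenericPluginTy {
protected:
  std::unique_ptr<SketchGenericDeviceTy> createDevice(int Id) override {
    return std::make_unique<SketchCudaDeviceTy>(Id);
  }
};
```

Only name() and createDevice() differ per plugin in this sketch; everything else is inherited, which is the code-sharing effect the patch description aims for.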

Most code on this patch is based on the changes made by @jdoerfert (Johannes Doerfert)

Reviewed By: jhuber6, jdoerfert

Differential Revision: https://reviews.llvm.org/D134396
2022-10-27 18:10:14 +00:00
Joseph Huber 429d3d4e9d [Libomptarget] Build plugins with protected visibility by default
The plugins all define the same interface symbols. This is generally not
a problem when calling the plugin directly from the dynamic library's
handle. However, when calling from within the plugin itself it is
possible for another plugin's symbols to preempt the symbols. This was
observed with the `__tgt_rtl_is_valid_binary` call in the
`__tgt_rtl_is_valid_binary_info` function being mapped to the x86_64
plugin.

This patch changes the default visibility to `protected` instead. This
visibility ensures that these symbols are all externally visible from
the plugin, but ensures their definitions are fixed within the shared
library. Having protected visibility makes such symbol preemption
impossible.
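
For illustration only, the same effect can be requested per symbol with the GCC/Clang visibility attribute on ELF targets. The patch changes the default visibility for the whole plugin build rather than annotating functions, and the macro and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro for this sketch; assumes GCC or Clang on an ELF target.
#define SKETCH_PROTECTED __attribute__((visibility("protected")))

extern "C" {
// Exported from the library, but not preemptible by another library's symbol.
SKETCH_PROTECTED int sketch_is_valid_binary(const void *Image,
                                            std::size_t Size) {
  return Image != nullptr && Size > 0;
}

SKETCH_PROTECTED int sketch_is_valid_binary_info(const void *Image,
                                                 std::size_t Size) {
  // With protected visibility this call binds to the definition above,
  // even if another loaded library exports a symbol with the same name.
  return sketch_is_valid_binary(Image, Size);
}
}
```

With default visibility, the inner call could be resolved through the dynamic symbol table and land in a different plugin, which is the failure the commit message describes.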

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D136365
2022-10-20 11:12:18 -05:00
Joseph Huber 586fc5999b [Libomptarget][NFC] clang-format the libomptarget OpenMP tests
Summary:
Recent changes to clang-format improved the handling of OpenMP pragmas.
Clean up the existing libomptarget tests.
2022-10-19 08:57:27 -05:00
Joseph Huber 1af7541741 [Libomptarget] Fix missing semicolon in exports 2022-10-14 09:02:42 -05:00
Joseph Huber 619dced0fc [Libomptarget] Don't use full names for exported plugin symbols
Summary:
This patch changes the `exports` file to export all `__tgt_rtl`
functions. This is a better option as not each plugin implements all of
these functions, furthermore any new functions added will be
automatically included.
2022-10-14 08:57:57 -05:00
Slava Zakharin 88da0de14f Revert "[Libomp] Do not error on undefined version script symbols"
This reverts commit 096f93e73d.

Revert "[Libomptarget] Make the plugins ingore undefined exported symbols"

This reverts commit 3f62314c23.

Revert "[LLD] Enable --no-undefined-version by default."

This reverts commit 7ec8b0d162.

Three commits are reverted because of the current omp build fail
with GNU ld. See discussion here: https://reviews.llvm.org/rG096f93e73dc3
2022-10-13 14:12:07 -07:00
Joseph Huber 3f62314c23 [Libomptarget] Make the plugins ingore undefined exported symbols
Summary:
Recent changes made the default behaviour to error when given an
undefined symbol in a version script. A previous patch fixed this for
`libomptarget` by removing the single undefined symbol. However, the
plugins are expected to only define a subset of the available functions
so we shouldn't treat it as an error. This patch updates the build flags
to work appropriately.
2022-10-13 08:13:03 -05:00
Joseph Huber e801e8f3e7 [Libomptarget] Remove undefined 'omp_get_interop_rc_desc' symbol from exports list
Summary:
A recent patch made undefined symbols in version scripts cause errors by
default. The `omp_get_interop_rc_desc` function is declared but not
defined, so it is undefined in the final link unit. This patch removes
it from the exports list; it should be added back in when actually
defined and used.
2022-10-13 07:41:14 -05:00
Ye Luo 053e894106 [DeviceRTL] CMake fix using target-level dependency
File-level dependency should not be used on files generated during the build. The next command may execute before the generating command finishes writing the file. Use add_custom_target and use target-level dependency.

Differential Revision: https://reviews.llvm.org/D135630
2022-10-10 21:23:58 -05:00
Shilei Tian 395d261de7 [NFC] Remove trailing white space in openmp/libomptarget/src/CMakeLists.txt 2022-10-07 13:42:31 -04:00
Joseph Huber defe072010 [Libomptarget] Remove debug definitions DeviceRTL's CMake
These debugging definitions are no longer used in the new runtime. The
old runtime has been removed since Clang-14 so we can safely get rid of
these leftover variables.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D135452
2022-10-07 11:31:18 -05:00
Joseph Huber 1bddb0fc23 [Libomptarget] Clean up DeviceRTL CMake and remove unused flags
Summary:
This patch just cleans up the unused flags in the DeviceRTL. These
should no longer be necessary or are redundant. Also add the extract
tool and packager to the check and error message if not found. This will
make it easier to tell if they are not present.
2022-10-07 10:09:48 -05:00
Ye Luo deba92d6c2 [DeviceRTL] Fix a CMake multi-step compilation dependency issue.
caused by 9223315903
2022-10-06 19:07:39 -05:00
Shilei Tian 9dd0476293 [OpenMP][DeviceRTL] Fix build issue 2022-10-06 16:21:51 -04:00
Shilei Tian 32dc48094b [OpenMP][DeviceRTL] Fix an issue that thread array might be corrupted
The shared memory stack in the device runtime assumes no interleaved uses.
D135037 breaks the assumption, potentially causing the shared stack corruption.
This patch moves the thread array to heap memory. Since it is already the slow
path, it doesn't matter that much anyway.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D135391
2022-10-06 16:13:33 -04:00
Joseph Huber 9223315903 [DeviceRTL] Allow IsSPMDMode to be optimized out in LTO mode
A previous patch merged the static and bitcode versions of the
deviceRTL. We previously used the static library's separate compilation
to set a special flag that prevented `IsSPMDMode` from being put in the
used list, which would have kept it from being optimized out. When they were
merged we could no longer do this separate compilation that allowed
users of LTO to get more optimal code.

This patch rearranges the code. The `IsSPMDMode` global is now
transitively used by its inclusion in the changed `__keep_alive`
function. This allows us to then manually delete the `__keep_alive`
function from the module when building the static library via
`llvm-extract`. The result is that the bitcode library correctly will
maintain the needed shared state, while the static library will be able
to internalize it and optimize it out.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135280
2022-10-05 14:40:01 -05:00
Johannes Doerfert a3a741c0bb [OpenMP][FIX] Update device API to match recent changes 2022-10-05 08:07:38 -07:00
Johannes Doerfert f8ee045c6d [OpenMP] Eliminate the ThreadStates array in favor of indirection
If we have thread states, the program is going to be rather slow. If we
don't, we want to avoid wasting shared memory. This patch introduces a
slight penalty (malloc + indirection) for the slow path and reduces
resource usage for the fast path.
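
The trade-off can be sketched on the host as follows: the team keeps a single pointer instead of a full array, and storage is allocated only when a thread state is first needed. All names are hypothetical, and the sketch is single-threaded, whereas the real device runtime has to handle concurrent threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical per-thread state.
struct SketchThreadStateTy {
  int Level = 0;
};

// Hypothetical team state: one pointer of overhead on the fast path.
struct SketchTeamStateTy {
  SketchThreadStateTy **ThreadStates = nullptr;
  std::size_t NumThreads;

  explicit SketchTeamStateTy(std::size_t N) : NumThreads(N) {}
  SketchTeamStateTy(const SketchTeamStateTy &) = delete;
  SketchTeamStateTy &operator=(const SketchTeamStateTy &) = delete;

  ~SketchTeamStateTy() {
    if (!ThreadStates)
      return;
    for (std::size_t I = 0; I < NumThreads; ++I)
      delete ThreadStates[I];
    std::free(ThreadStates);
  }

  bool hasThreadStates() const { return ThreadStates != nullptr; }

  // Fast path: no thread states means no array and a cheap null check.
  SketchThreadStateTy *lookup(std::size_t Tid) const {
    return ThreadStates ? ThreadStates[Tid] : nullptr;
  }

  // Slow path: pay for malloc plus one extra indirection on first use.
  SketchThreadStateTy *getOrCreate(std::size_t Tid) {
    if (!ThreadStates)
      ThreadStates = static_cast<SketchThreadStateTy **>(
          std::calloc(NumThreads, sizeof(SketchThreadStateTy *)));
    assert(ThreadStates && "allocation failed");
    if (!ThreadStates[Tid])
      ThreadStates[Tid] = new SketchThreadStateTy();
    return ThreadStates[Tid];
  }
};
```

Programs that never create a thread state pay only for the null pointer, which matches the stated goal of reducing resource usage on the fast path.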

Differential Revision: https://reviews.llvm.org/D135037
2022-10-04 20:27:34 -07:00
Johannes Doerfert b113965073 [OpenMP] Introduce more atomic operations into the runtime
We should use OpenMP atomics but they don't take variable orderings.
Maybe we should expose all of this in the header but that solves only
part of the problem anyway.

Differential Revision: https://reviews.llvm.org/D135036
2022-10-04 20:20:55 -07:00
Johannes Doerfert f85c1f3b7c [OpenMP] Replace __ATOMIC_XYZ with atomic::xyz for style
Also fixes one ordering argument that was not used.
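
A rough sketch of the style change, assuming GCC/Clang atomic builtins: callers pass a named ordering from a namespace rather than a raw __ATOMIC_XYZ constant, and the ordering argument is forwarded to the builtin. The sketch namespace, OrderingTy, and the function signatures are hypothetical, not the DeviceRTL's actual declarations.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
namespace atomic {

// Named orderings mapped onto the compiler's __ATOMIC_* constants.
enum OrderingTy {
  relaxed = __ATOMIC_RELAXED,
  acquire = __ATOMIC_ACQUIRE,
  release = __ATOMIC_RELEASE,
  acq_rel = __ATOMIC_ACQ_REL,
  seq_cst = __ATOMIC_SEQ_CST,
};

inline std::uint32_t load(std::uint32_t *Addr, OrderingTy Ordering) {
  return __atomic_load_n(Addr, Ordering);
}

inline void store(std::uint32_t *Addr, std::uint32_t Val,
                  OrderingTy Ordering) {
  __atomic_store_n(Addr, Val, Ordering);
}

// Returns the value held before the addition.
inline std::uint32_t add(std::uint32_t *Addr, std::uint32_t Val,
                         OrderingTy Ordering) {
  return __atomic_fetch_add(Addr, Val, Ordering);
}

} // namespace atomic
} // namespace sketch
```

Taking the ordering as a typed parameter also makes it harder to accept an ordering argument and then silently ignore it.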

Differential Revision: https://reviews.llvm.org/D135035
2022-10-04 19:43:30 -07:00