llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	abbc3fa17b	[OpenMP] Replace pointer comparison with `isSharedMemPtr` check The pointer comparison was causing confusion for capture tracking, let's avoid confusion. Differential Revision: https://reviews.llvm.org/D135160	2022-10-04 19:24:22 -07:00
Jonathan Peyton	f8d081c1a5	[OpenMP][libomp] Allow unused-but-set warnings Only a few remaining which are taken care of by this patch. Differential Revision: https://reviews.llvm.org/D133528	2022-10-03 10:24:33 -05:00
Hansang Bae	772fb97c0b	[OpenMP] Ignore schedule modifier in static scheduling The modifier bits in the schedule type is not used/supported in the static scheduler, so it should be ignored. Differential Revision: https://reviews.llvm.org/D134983	2022-10-03 08:29:57 -05:00
Dhruva Chakrabarti	667af48179	[OpenMP] [OMPT] [1/8] Create separate categories for host, device, [no]emi events In preparation for OMPT target changes, create separate categories of events that will be used by OMPT target support. Split up existing macro FOREACH_OMPT_EVENT into new ones. There is no change to the original macro. Created new macros FOREACH_OMPT_HOST_EVENT, FOREACH_OMPT_DEVICE_EVENT, FOREACH_OMPT_NOEMI_EVENT, FOREACH_OMPT_EMI_EVENT, and a few other sub-categories that can be used as required. One such use is in D123974 which uses events selectively. Patch from John Mellor-Crummey <johnmc@rice.edu> Reviewed By: dreachem Differential Revision: https://reviews.llvm.org/D123429	2022-10-01 00:46:40 +00:00
Vitaly Buka	adf4eda004	[test][openmp] Tsan may report more warnings here	2022-09-28 18:53:09 -07:00
Jennifer Yu	30cc712eb6	[Clang][OpenMP] Fix run time crash when use_device_addr is used. It is a data mapping ordering problem. According to the OpenMP spec, if one or more map clauses are present, the list item conversions that are performed for any use_device_ptr or use_device_addr clause occur after all variables are mapped on entry to the region according to those map clauses. The change is to put the mapping data for use_device_addr at the end of the data mapping array. Differential Revision: https://reviews.llvm.org/D134556	2022-09-27 11:53:57 -07:00
Dan Palermo	db021abf33	[OpenMP][AMDGPU] Enable OpenMP device runtime build for gfx110[0123] Add OpenMP device runtime build support for the gfx1100, gfx1101, gfx1102, and gfx1103 targets. Differential Revision: https://reviews.llvm.org/D134465	2022-09-23 01:49:51 +00:00
Jennifer Yu	48ffd40ba2	[Clang][OpenMP] Codegen generation for has_device_addr clauses. This patch adds codegen support for the has_device_addr clause. It uses the same logic as is_device_ptr, but passes &var instead of a pointer to var to the kernel. Differential Revision: https://reviews.llvm.org/D134268	2022-09-20 21:12:30 -07:00
Ron Lieberman	d5b5289561	revert `684f76643` [Clang][OpenMP] Codegen generation for has_device_addr clauses. breaks amdgpu buildbot	2022-09-20 01:37:27 +00:00
Jennifer Yu	a1df13ecd6	Fix test case which is not working for AMDGPU. This is for the change of Differential Revision: https://reviews.llvm.org/D134186	2022-09-19 17:07:01 -07:00
Jennifer Yu	684f766431	[Clang][OpenMP] Codegen generation for has_device_addr clauses. Summary: This patch adds codegen support for the has_device_addr clause. It uses the same logic as is_device_ptr. Differential Revision: https://reviews.llvm.org/D134186	2022-09-19 16:14:57 -07:00
SignKirigami	6772987fc3	[OpenMP] Add LoongArch64 support GCC, glibc, binutils, and LLVM have added support for LoongArch64. This patch adds support for LLVM OpenMP following D59880 for RISCV64. Reviewed By: MaskRay, SixWeining Differential Revision: https://reviews.llvm.org/D132925	2022-09-19 22:49:15 +00:00
Joseph Huber	292cb114b0	[Libomptarget] Revert changes to AMDGPU plugin destructors These patches exposed a lot of problems in the AMD toolchain. Rather than keep it broken we should revert it to its old semi-functional state. This will prevent us from using device destructors but should remove some new bugs. In the future this interface should be changed once these problems are addressed more correctly. This reverts commit `ed0f218115`. This reverts commit `2b7203a359`. Fixes #57536 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133997	2022-09-16 06:55:51 -05:00
Joseph Huber	4b004a0b83	[Libomptarget] Embed bitcode library in static library instead. This patch changes the CMake to instead embed the already generated LLVM-IR bitcode library into an object file to create the static library. This is different from the previous method which generated them separately. This will make the build faster and allow us to perform the same internalization into a single library we do with the bitcode library. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133952	2022-09-15 14:05:18 -05:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `7539e9cf81`.	2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Joseph Huber	23bc343855	[Libomptarget] Change device free routines to accept the allocation kind Previous support for device memory allocators used a single free routine and did not provide the original kind of the allocation. This is problematic as some of these memory types required different handling. Previously this was worked around using a map in runtime to record the original kind of each pointer. Instead, this patch introduces new free routines similar to the existing allocation routines. This allows us to avoid a map traversal every time we free a device pointer. The only interfaces defined by the standard are `omp_target_alloc` and `omp_target_free`, these do not take a kind as `omp_alloc` does. The standard dictates the following: "The omp_target_alloc routine returns a device pointer that references the device address of a storage location of size bytes. The storage location is dynamically allocated in the device data environment of the device specified by device_num." Which suggests that these routines only allocate the default device memory for the kind. So this has been changed to reflect this. This change is somewhat breaking if users were using `omp_target_free` as previously shown in the tests. Reviewed By: JonChesterfield, tianshilei1992 Differential Revision: https://reviews.llvm.org/D133053	2022-09-14 12:14:07 -05:00
Joseph Huber	c2acb1e5d3	[Libomptarget][NFC] Remove unused variable	2022-09-09 15:26:02 -05:00
Joseph Huber	86587f2891	[Libomptarget] Fix compiling with asserts using the bitcode library Summary: A previous patch introduced an `exports` file which contains all the symbol names that are not internalized in the bitcode library. This is done to reduce the size of the bitcode library and only export needed functions. This export file must contain all the functions expected to be called from the device. Since its introduction the `__assert_fail` function used to be provided but was mistakenly not included. This patch adds it. Fixes #57656 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133594	2022-09-09 15:25:24 -05:00
Joseph Huber	83fcba82cc	[Libomptarget] Add proper LLVM libraries now that the AMDGPU plugin uses them Summary: The AMDGPU and CUDA plugins now rely on the Object and Support libraries. This patch adds them explicitly rather than hoping that they share the symbols loaded from the standard `libomptarget`.	2022-09-09 10:33:26 -05:00
serge-sans-paille	6f2ed8fd3f	[OpenMP] Install ompt-multiplex.h alongside omp.h The default install direction may not be in the compiler search path. Differential Revision: https://reviews.llvm.org/D133420	2022-09-09 09:42:08 +02:00
Jonathan Peyton	e5ac98fa01	[OpenMP][libomp] Cleanup __kmpc_flush() code Have it be simple KMP_MFENCE() which incorporates x86-specific logic and reduces to KMP_MB() for other architectures. Differential Revision: https://reviews.llvm.org/D130928	2022-09-08 16:17:20 -05:00
Joseph Huber	6e8d93e5c2	[Libomptarget] Implement OpenMP 5.2 semantics for device pointers In OpenMP 5.2, §5.8.6, page 160 line 32-33, when a device pointer allocated by omp_target_alloc has implicitly been included on a target construct as a zero-length array, the pointer initialisation should not find a matching mapped list item, and so should retain its value as a firstprivate variable. Previously, we would return a null pointer if the list item was not found. This patch updates the map handling to the OpenMP 5.2 semantics. Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D133447	2022-09-07 17:01:14 -05:00
Joseph Huber	8d2a447bf9	[Libomptarget] Remove leftover ELF header from x86 plugin Summary: We removed the linking support for `gelf.h` in a previous patch. This header was incorrectly leftover causing build problems on some systems.	2022-09-07 13:41:40 -05:00
Joseph Huber	300155911a	[Libomptarget] Replace libelf with LLVM's Elf libraries This patch replaces the dependency on `libelf` with LLVM's ELF support. With this patch the user no-longer needs to have `libelf` on their system to build and configure OpenMP offloading. The replacement is mostly mechanical, with the exception of the hash table support which was added in D131309. Depends on D131309 Reviewed By: JonChesterfield, saiislam Differential Revision: https://reviews.llvm.org/D131401	2022-09-07 12:38:51 -05:00
Joseph Huber	894531f59b	[Libomptarget] Add utility functions for loading an ELF symbol by name The `SHT_HASH` sections in an ELF are used to look up a symbol in the symbol table using a symbol's name. This is done by obtaining the `SHT_HASH` section and using its `sh_link` attribute to access the associated symbol table, from which we can access the string table containing the associated name. We can then search for the symbol using the hash of the name and the buckets and chains in the hash table itself. This patch adds utility functions that allow us to look up a symbol in an ELF file by name. It will first attempt to look through the hash tables, and then search the section tables manually if that fails. This allows us to pull out constants necessary for setting up offloading without first loading the object. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131309	2022-09-07 12:38:50 -05:00
Joseph Huber	31f434ee3b	[Libomptarget][NFC] Clean up CUDA plugin and address warnings	2022-09-06 15:28:57 -05:00
Vignesh Balasubramanian	d2a6e165e8	[OpenMP][OMPD] GDB plugin code to leverage libompd to provide debugging support for OpenMP programs. This is 5th of 6 patches started from https://reviews.llvm.org/D100181 This plugin code, when loaded in gdb, adds a few commands like ompd icv, ompd bt, ompd parallel. These commands create an interface for GDB to read the OpenMP runtime through libompd. Reviewed By: @dreachem Differential Revision: https://reviews.llvm.org/D100185	2022-09-06 11:28:55 +05:30
Ye Luo	0e68f483d4	[OpenMP] add a offload test involving std::complex Taken from the https://github.com/llvm/llvm-project/issues/57064 reproducer. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133258	2022-09-03 13:28:11 -05:00
Joseph Huber	f8b1f93f26	[libomptarget] Enable the device allocator for AMDGPU This patch adds support for the device memory type, this is currently equivalent to the default type so it should be treated as the same. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D133128	2022-09-01 12:40:59 -05:00
Joseph Huber	56cf3d626f	[Libomptarget] Remove old workaround for GCC 5,6 from libomptarget Some code previously needed the `used` attribute to prevent the GCC compiler versions 5 and 6 from removing it. This is no longer required as the minimum supported GCC version for LLVM 16 is >=7.1.0. Reviewed By: JonChesterfield, vzakhari Differential Revision: https://reviews.llvm.org/D132976	2022-08-30 19:13:48 -05:00
Joseph Huber	52556c3c0f	[Libomptarget] Make unified shared memory test unsupported on AMDGPU This test is an expected failure on AMDGPU. The expected failure is a GPU memory failure, which will typically result in the device totally failing. This isn't an issue for some GPU configurations that do not use the offloading device to also drive the display server. However, if the main GPU is used for testing it will reliably result in the user's display becoming unresponsive. This makes it difficult to run the GPU offloading tests on many systems. This patch simply makes this test unsupported so it no longer runs and freezes my computer when using `ninja check-openmp`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D132891	2022-08-30 12:14:25 -05:00
Joseph Huber	dc400f8612	[libomptarget] Deprecate old method for setting the tripcount Previously, the tripcount was set by a push call. We moved away from this with the new interface that added the tripcount to the kernel arguments struct, but kept around the old interface for legacy purposes for the LLVM 15 release. This patch removes the support for the legacy method. This removes the support for the old method, but does not break backwards compatibility. This will result in applications using the old interface being slower when run on the device. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132885	2022-08-29 20:08:26 -05:00
Joseph Huber	04ae35e592	[libomptarget] Always enable time tracing in libomptarget Previously time tracing features were hidden behind an optional CMake option. This was because `libomptarget` was not based on the LLVM libraries at that time. Now that `libomptarget` is an LLVM library we should be able to freely use the `LLVMSupport` library whenever we want and do not need to guard it in this way. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132852	2022-08-29 14:49:03 -05:00
Joseph Huber	22d71e72c9	[Libomptarget] Do not check for valid binaries twice. The only RTLs that get added to the `UsedRTLs` list have already been checked if they were valid binaries. We shouldn't need to do this again when we unregister all the used binaries as they wouldn't have been used if they were invalid anyway. Let me know if I'm incorrect in this assumption. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D131443	2022-08-29 08:36:50 -05:00
Joseph Huber	47166968db	[OpenMP] Deprecate the old driver for OpenMP offloading Recently OpenMP has transitioned to using the "new" driver which primarily merges the device and host linking phases into a single wrapper that handles both at the same time. This replaced a few tools that were only used for OpenMP offloading, such as the `clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver carries some marked benefits compared to the old driver that is now being deprecated. Things like device-side LTO, static library support, and more compatible tooling. As such, we should be able to completely deprecate the old driver, at least for OpenMP. The old driver support will still exist for CUDA and HIP, although both of these can currently be compiled on Linux with `--offload-new-driver` to use the new method. Note that this does not deprecate the `clang-offload-bundler`, although it is unused by OpenMP now, it is still used by the HIP toolchain both as their device binary format and object format. When I proposed deprecating this code I heard some vendors voice concerns about needing to update their code in their fork. They should be able to just revert this commit if it lands. Reviewed By: jdoerfert, MaskRay, ye-luo Differential Revision: https://reviews.llvm.org/D130020	2022-08-26 13:47:09 -05:00
Jon Chesterfield	ffabe997a5	[openmp][amdgpu] Implement target_alloc_host as fine grain HSA memory The cuda plugin maps TARGET_ALLOC_HOST onto cuMemAllocHost which is page locked host memory. Fine grain HSA memory is not necessarily page locked but has the same read/write from host or device semantics. The cuda plugin does this per-gpu and this patch makes it accessible from any gpu, but it can be locked down to match the cuda behaviour if preferred. Enabling tests requires an equivalent to // RUN: %libomptarget-compile-run-and-check-nvptx64-nvidia-cuda for amdgpu which doesn't seem to be in use yet. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D132660	2022-08-25 16:27:52 +01:00
Ye Luo	322ea53144	[libomptarget][amdgpu] enable tests whenever possible. if(TARGET amdgpu-arch) doesn't work when ENABLE_LLVM_PROJECTS=openmp because openmp subdirectory is processed before clang subdirectory. Adopt the same logic of enabling tests like the CUDA plugin. Differential Revision: https://reviews.llvm.org/D132579	2022-08-24 14:33:28 -05:00
Joseph Huber	540a13652f	[Libomptarget] Replace use of `dlopen` with LLVM's dynamic library support This patch replaces uses of `dlopen` and `dlsym` with LLVM's support with `loadPermanentLibrary` and `getSymbolAddress`. This allows us to remove the explicit dependency on the `dl` libraries in the CMake. This removes another explicit dependency and solves an issue encountered while building on Windows platforms. The one downside to this is that the LLVM library does not currently support `dlclose` functionality, but this could be added in the future. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131507	2022-08-24 10:46:21 -05:00
Joseph Huber	30efb459e0	[Libomptarget] Remove use of ELF link_address in x86_64 plugin We use the offloading entries array to determine the relative names and addresses of device-side kernel functions. The x86_64 plugin previously derived the device-side entry table by first identifying the `omp_offloading_entries` section offset in the loaded ELF. Then we would use the base offset of the loaded dynamic library to identify the entries array within the loaded image. This relied on some more unconventional methods which prevented us from using the LLVM dynamic library loader for this plugin. This patch simplifies this by instead copying the host-side entry and replacing its address with the device-side address looked up through `dlsym`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131516	2022-08-24 10:46:20 -05:00
Vitaly Buka	3195449f2b	[test][openmp] Relax condition in test It runs 8 threads. Sometimes tsan is able to detect more than one of the same race.	2022-08-23 14:29:06 -07:00
Joseph Huber	2b8f722e63	[OpenMP] Add option to assert no nested OpenMP parallelism on the GPU The OpenMP device runtime needs to support the OpenMP standard. However constructs like nested parallelism are very uncommon in real applications yet lead to complexity in the runtime that is sometimes difficult to optimize out. As a stop-gap for performance we should supply an argument that selectively disables this feature. This patch adds the `-fopenmp-assume-no-nested-parallelism` argument which explicitly disables the use of nested parallelism in OpenMP. Reviewed By: carlo.bertolli Differential Revision: https://reviews.llvm.org/D132074	2022-08-23 14:09:51 -05:00
utsumi	2e2caea37f	[Clang][OpenMP] Make copyin clause on combined and composite construct work (patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com)) Make copyin clause on the following constructs work. - parallel for - parallel for simd - parallel sections Fixes https://github.com/llvm/llvm-project/issues/55547 Patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com) Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D132209	2022-08-23 07:58:35 -07:00
John Ericson	e941b031d3	Revert "[cmake] Use `CMAKE_INSTALL_LIBDIR` too" This reverts commit `f7a33090a9`. Unfortunately this causes a number of failures that didn't show up in my local build.	2022-08-18 22:46:32 -04:00
John Ericson	f7a33090a9	[cmake] Use `CMAKE_INSTALL_LIBDIR` too We held off on this before as `LLVM_LIBDIR_SUFFIX` conflicted with it. Now we return this. `LLVM_LIBDIR_SUFFIX` is kept as a deprecated way to set `CMAKE_INSTALL_LIBDIR`. The other `*_LIBDIR_SUFFIX` are just removed entirely. I imagine this is too potentially-breaking to make LLVM 15. That's fine. I have a more minimal version of this in the distro (NixOS) patches for LLVM 15 (like previous versions). This more expansive version I will test harder after the release is cut. Reviewed By: sebastian-ne, ldionne, #libc, #libc_abi Differential Revision: https://reviews.llvm.org/D130586	2022-08-18 15:33:35 -04:00
Kevin Sala Penads	1081bb08cc	[OpenMP][libomptarget] Fix run region async condition This patch fixes a condition in the openmp/libomptarget/src/device.cpp file. The code was checking if the run_region plugin API function was implemented, but it should actually check the run_region_async function instead. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D131782	2022-08-15 13:08:45 -04:00
Fangrui Song	acfe0d3b15	[openmp] Remove __ANDROID_API__ < 19 workaround https://github.com/android/ndk/wiki/Changelog-r24 shows that the NDK has moved forward to at least a minimum target API of 19. Remove old workaround.	2022-08-12 22:15:38 -07:00
Jennifer Yu	2ca27206f9	[OpenMP] Fix segmentation fault when data field is used in is_device_ptr Currently, for a field, codegen just emits map info for the this pointer variable, which fails at run time. For fields, the PartialStruct is created and it needs a call to emitCombinedEntry, which creates the base that covers all the pieces. The change is to generate map info as for regular fields. Differential Revision: https://reviews.llvm.org/D129608	2022-08-12 17:10:26 -07:00
Jonathan Peyton	56f36f85e0	[OpenMP][OMPT] Fix memory leak when using GCC compatibility code Serialized parallels allocate lightweight task teams on the heap but never free them in the corresponding join. This patch adds a wrapper around the allocation (if ompt enabled) and also adds the corresponding free in the join call. Differential Revision: https://reviews.llvm.org/D131690	2022-08-11 15:26:09 -05:00
Johannes Doerfert	a8cda32909	[OpenMP][FIX] Ensure __kmpc_kernel_parallel is reachable The problem is we create the call to __kmpc_kernel_parallel in the openmp-opt pass but while we optimize the code, the call is not there yet. Thus, we assume we never reach it from __kmpc_target_deinit. That allows us to remove the store in there (`ParallelRegionFn = nullptr`), which leads to bad results later on. This is a shortstop solution until we come up with something better. Fixes https://github.com/llvm/llvm-project/issues/57064	2022-08-11 09:55:56 -05:00
Joseph Huber	fdbb15355e	[Libomptarget][CUDA] Check CUDA compatibility correctly We recently added support for multi-architecture binaries in libomptarget. This is done by extracting the architecture from the embedded image and comparing it with the major and minor version supported by the current CUDA installation. Previously we just compared these directly, which was not correct for binary compatibility. The CUDA documentation states that an image is compatible with a device that has the same major version and a greater or equal minor version. Change the check to use this new logic in the CUDA plugin. Fixes #57049 Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D131567	2022-08-10 11:15:27 -04:00
Ron Lieberman	9ff0cc7e0f	[openmp] Fix enumeration build issue for openmp library `integer value 40962 is outside the valid range of values [0, 31] for this enumeration type [-Wenum-constexpr-conversion]` (Issue #57022) Turn on -Wno-enum-constexpr-conversion to buy some time to fix the more egregious issue in the hsa_agent_info_t and hsa_amd_agent_info_t interfaces. Relates to https://reviews.llvm.org/D131307/new/ Differential Revision: https://reviews.llvm.org/D131477	2022-08-09 10:25:03 +00:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Jon Chesterfield	521a5c11ac	Rename OPENMP_HAVE_STD_CPP14_FLAG to match c++17	2022-08-08 17:07:45 +01:00
Ron Lieberman	af28b27d31	Move openmp from -std=c++14 to -std=c++17	2022-08-08 16:04:57 +00:00
Jon Chesterfield	104f11630a	[nfc][openmp] clang-format system.cpp prior to D131401	2022-08-08 16:24:34 +01:00
Shilei Tian	294bbdc0b8	[NFC] Fix wrong header in `LibC.cpp`	2022-08-04 23:54:07 -04:00
Shilei Tian	459e3c5184	[OpenMP] Fix the test case issue that printf cannot be used in target region for AMDGPU	2022-08-04 14:48:48 -04:00
Shilei Tian	db5a2afa62	[OpenMP][DeviceRTL] Implement libc function `memcmp` We will add some simple implementation of libc functions starting from this patch, and the first one is `memcmp`, which is reported in #56929. Note that `malloc` and `free` are not included in this patch because of the use of `declare variant`. In the near future we will implement the two functions w/o using any vendor provided function. This fixes #56929. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D131182	2022-08-04 14:37:54 -04:00
Joseph Huber	b3335e8ed7	[Libomptarget][NFC] Clang format the AMDGPU plugin Summary: A previous patch did not format the plugin again after making changes. Ensure that libomptarget stays formatted.	2022-08-03 15:18:16 -04:00
Joseph Huber	2b7203a359	[Libomptarget] Deinitialize AMDGPU global state more intentionally A previous patch made the destruction of the HSA plugin more deterministic. However, there were still other global values that are not handled this way. When attempting to call a destructor kernel, the device would have already been uninitialized and we could not find the appropriate kernel to call. This is because they were stored in global containers that had their destructors called already. Merges this global state into the rest of the info state by putting those global values inside of the global pointer already allocated and deallocated by the constructor and destructor. This should allow the AMDGPU plugin to correctly identify the destructors if we were to run them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131011	2022-08-02 18:24:39 -04:00
Jonathan Peyton	9cf6511bff	[OpenMP][libomp] Detect if test compiler has omp.h omp50_taskdep_depobj.c relies on the test compiler's omp.h file. If the test compiler does not have an omp.h file, then use the one within the build tree. Fixes: https://github.com/llvm/llvm-project/issues/56820 Differential Revision: https://reviews.llvm.org/D131000	2022-08-02 17:05:56 -05:00
Martin Storsjö	3f25ad335b	[OpenMP] Fix warnings about unused expressions when OMPT_LOOP_DISPATCH is a no-op. NFC. This fixes warnings like these, all reported for the statement `OMPT_LOOP_DISPATCH(p_lb, p_ub, pr->u.p.st, status);`: ../runtime/src/kmp_dispatch.cpp:2159:24: warning: left operand of comma operator has no effect [-Wunused-value]; ../runtime/src/kmp_dispatch.cpp:2159:31: warning: left operand of comma operator has no effect [-Wunused-value]; ../runtime/src/kmp_dispatch.cpp:2159:46: warning: left operand of comma operator has no effect [-Wunused-value]; ../runtime/src/kmp_dispatch.cpp:2159:50: warning: expression result unused [-Wunused-value]	2022-08-02 11:16:23 +03:00
Martin Storsjö	7f24fd26a8	[OpenMP] Only include CMAKE_DL_LIBS on unix platforms CMAKE_DL_LIBS is documented as "Name of library containing dlopen and dlclose". On Windows platforms, there's no system provided dlopen/dlclose, but it can be argued that if you really intend to call dlopen/dlclose, you're going to be using a third party compat library like https://github.com/dlfcn-win32/dlfcn-win32, and CMAKE_DL_LIBS should expand to its name. This has been argued upstream in CMake in https://gitlab.kitware.com/cmake/cmake/-/issues/17600 and https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1642, that CMAKE_DL_LIBS should expand to "dl" on mingw platforms. The merge request wasn't merged though, as it caused some amount of breakage, but in practice, Fedora still carries a custom CMake patch with the same effect. Thus, this patch fixes cross compiling OpenMP for mingw targets on Fedora with their custom-patched CMake. Differential Revision: https://reviews.llvm.org/D130892	2022-08-02 10:56:30 +03:00
Joseph Huber	5afb5312a0	[Libomptarget][NFC] Remove unused CMake file Summary: This file is no longer used, get rid of it.	2022-08-01 16:21:53 -04:00
Joseph Huber	51bda3a0e7	[Libomptarget] Replace std::vector with llvm::SmallVector The runtime makes some use of `std::vector` data structures. We should be able to replace these trivially with `llvm::SmallVector` instead. This should allow us to avoid heap allocations in the majority of cases now. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130927	2022-08-01 15:59:15 -04:00
Michał Górny	eb4612ca23	[openmp] [test] Fix prepending config.library_dir to LD_LIBRARY_PATH Fix the LD_LIBRARY_PATH prepending order to make sure that config.library_path ends up before any potentially-system directories (e.g. config.hwloc_library_dir). This makes sure that we are testing against the just-built openmp libraries rather than the version that is already installed. Also rename the function to `prepend_*` to make it clearer what it actually does. https://github.com/llvm/llvm-project/issues/56821 Differential Revision: https://reviews.llvm.org/D130825	2022-08-01 18:54:06 +02:00
Joseph Huber	1d03b2efcd	[Libomptarget] Disable testing map_back_race.cpp This test hasn't been fixed and causes spurious failures when testing. This patch sets it as unsupported until we have a reliable fix. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D130789	2022-07-30 15:01:47 -04:00
tlattner	a140f43431	Update references to mailing lists that have moved to Discourse.	2022-07-29 15:55:38 -07:00
tlattner	520d29f381	Update references to mailing lists that have moved to Discourse.	2022-07-28 16:54:58 -07:00
Jon Chesterfield	ed0f218115	[openmp][amdgpu] Tear down amdgpu plugin accurately Moves DeviceInfo global to heap to accurately control lifetime. Moves calls from libomptarget to deinit_plugin later, plugins need to stay alive until very shortly before libomptarget is destructed. Leaving the deinit_plugin calls where initially inserted hits use after free from the dynamic_module.c offloading test (verified with valgrind that the new location is sound with respect to this) Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130714	2022-07-28 20:00:03 +01:00
Jon Chesterfield	c214cb6a68	[amdgpu][openmp][nfc] Restore stb_local on DeviceInfo symbol	2022-07-28 16:50:46 +01:00
Jon Chesterfield	75aa521064	[openmp][amdgpu] Move global DeviceInfo behind call syntax prior to using D130712	2022-07-28 16:40:42 +01:00
Jon Chesterfield	1f9d3974e4	[openmp] Introduce optional plugin init/deinit functions Will allow plugins to migrate away from using global variables to manage lifetime, which will fix a segfault discovered in relation to D127432 Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D130712	2022-07-28 16:21:38 +01:00
Sebastian Neubauer	50716ba2b3	[CMake][OpenMP] Remove wrong backslash outdir is defined in the line above, it will not exist in the install command, so it should not be escaped.	2022-07-28 14:35:04 +02:00
Joseph Huber	b08369f7f2	Revert "[OpenMP] Remove noinline attributes in the device runtime" The behaviour of this patch is not great, but it has some side-effects that are required for OpenMPOpt to work. The problem is that when we use `-mlink-builtin-bitcode` we only import used symbols from the runtime. Then OpenMPOpt will insert calls to symbols that were not previously included. This patch removed this implicit behaviour as these functions were kept alive by the `noinline` simply because it kept calls to them in the module. This caused regression in some tests that relied on some OpenMPOpt passes without using LTO. Reverting for the LLVM15 release but will try to fix it more correctly on main. This reverts commit `d61d72dae6`. Fixes #56752	2022-07-27 11:09:18 -04:00
Tom Stellard	809855b56f	Bump the trunk major version to 16	2022-07-26 21:34:45 -07:00
John Ericson	28e665fa05	[cmake] Slight fix ups to make robust to the full range of GNUInstallDirs See https://cmake.org/cmake/help/v3.14/module/GNUInstallDirs.html#result-variables for `CMAKE_INSTALL_FULL_*` Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D130545	2022-07-26 14:48:49 +00:00
Saiyedul Islam	4075a811ad	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127769	2022-07-26 02:44:31 -05:00
Joseph Huber	8c626fc0c8	[Libomptarget] Reintroduce host architecture checks for device RTL A previous patch removed the need to set the auxiliary architecture as it was no longer needed for the clang invocation after moving to using the clang frontend. However, this had a second use of preventing unsupported host architectures from building the device runtime. This caused failures when trying to build on 32-bit hosts for example. Fixes #56699 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130509	2022-07-25 17:01:12 -04:00
Joseph Huber	d61d72dae6	[OpenMP] Remove noinline attributes in the device runtime We previously used the `noinline` attributes to specify some definitions which should be kept alive in the runtime. These were then stripped immediately in the OpenMPOpt module pass. However, since the changes in D130298, we now explicitly state which functions will have external visibility in the bitcode library. Additionally the OpenMPOpt module pass should run before the inliner pass, so this shouldn't make a difference in whether or not the functions will be alive for the initial pass of OpenMPOpt. This should simplify the interface, and additionally save time spent on scanning function names for noinline. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130368	2022-07-25 15:44:50 -04:00
Saiyedul Islam	4cf30c5157	Revert "Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info""" This reverts commit `281eb9223c`.	2022-07-25 11:35:37 -05:00
Saiyedul Islam	281eb9223c	Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info"" This reverts commit `8cbf4a386b`.	2022-07-25 08:32:26 -05:00
Saiyedul Islam	8cbf4a386b	Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info" This reverts commit `471f2abc62`.	2022-07-25 05:32:59 -05:00
Saiyedul Islam	471f2abc62	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Differential Revision: https://reviews.llvm.org/D127769	2022-07-25 04:44:36 -05:00
Shilei Tian	b95d31a849	[OpenMP][Offloading] Enlarge the work size of `wtime.c` in case of any noise	2022-07-22 16:03:39 -04:00
Joel E. Denny	cfa6e79df3	[Libomptarget] Don't report lack of CUDA devices Sometimes libomptarget's CUDA plugin produces unhelpful diagnostics about a lack of CUDA devices before an application runs: ``` $ clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa hello-world.c $ ./a.out CUDA error: Error returned from cuInit CUDA error: no CUDA-capable device is detected Hello World: 4 ``` This can happen when the CUDA plugin was built but all CUDA devices are currently disabled in some manner, perhaps because `CUDA_VISIBLE_DEVICES` is set to the empty string. As shown in the above example, it can even happen when we haven't compiled the application for offloading to CUDA. The following code from `openmp/libomptarget/plugins/cuda/src/rtl.cpp` appears to be intended to handle this case, and it chooses not to write a diagnostic to stderr unless debugging is enabled: ``` if (NumberOfDevices == 0) { DP("There are no devices supporting CUDA.\n"); return; } ``` The problem is that the above code is never reached because the earlier `cuInit` returns `CUDA_ERROR_NO_DEVICE`. This patch handles that `cuInit` case in the same manner as the above code handles the `NumberOfDevices == 0` case. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130371	2022-07-22 14:46:45 -04:00
Shilei Tian	0c86c4f50c	[OpenMP] Fix test error introduced in D130179	2022-07-22 14:16:47 -04:00
Shilei Tian	602e0eb9f0	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function is mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 13:46:45 -04:00
Shilei Tian	77cb30e3a6	Revert "[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake" This reverts commit `ad34f1dba8`.	2022-07-22 11:45:13 -04:00
Shilei Tian	ad34f1dba8	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function is mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 11:43:30 -04:00
Joseph Huber	a3804a3145	[Libomptarget] Make the plugins link as LLVM libraries Previously we made `libomptarget` link as an LLVM library so we have access to the LLVM core libraries. After the initial patch stuck we can now apply the same changes to the plugins. This will allow us to use LLVM in all of `libomptarget` when we have uses for them. In the future this should allow us to remove the dependencies on `libelf`, `libffi`, and `dl`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D130262	2022-07-22 09:34:12 -04:00
Joseph Huber	908054df4f	[Libomptarget] Only export needed definitions in the BC library This patch adds the use of the `-internalize-public-api-file` option in the internalization pass to internalize any definition that isn't explicitly needed for the interface. This will allow us to perform more optimizations on the file that normally would not have been possible with functions internal to the library not being internal. Depends on D130293 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130298	2022-07-22 08:24:35 -04:00
Joseph Huber	e82e07d74a	[Libomptarget] Build the DeviceRTL BC using clang directly Currently the bitcode library is built using the clang front-end manually. This was originally done because we did not support device only compilation. Now we support device only compilation, at least for a single offloading toolchain, so we can instead use clang directly rather than using the front-end. This saves us needing to define things like `aux_triple`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130293	2022-07-22 08:24:29 -04:00
Ron Lieberman	45a379ce2f	Revert "[Libomptarget] Stop testing CPU offloading with LTO" This reverts commit `3e8d46921f`.	2022-07-22 12:10:06 +00:00
Ye Luo	4794bbffb2	Revert "[OpenMP][OMPD] GDB plugin code to leverage libompd to provide debugging" This reverts commit `51d3f421f4`.	2022-07-21 22:00:33 -05:00
Ye Luo	ee95be3c46	Revert "Fixing build bot failure due to python-pip unavailability." This reverts commit `9dc0d6aaa1`.	2022-07-21 22:00:32 -05:00
Johannes Doerfert	1da6ae4b54	[OpenMP][FIX] Ensure thread and team state are defined properly The namespaces were missing causing the symbols to have "C" mangling. To avoid this in the future we qualify the names now fully.	2022-07-21 21:57:14 -05:00
Joseph Huber	3e8d46921f	[Libomptarget] Stop testing CPU offloading with LTO Summary: Some of the buildbots don't find the libraries because they don't build for the GPU. Although it should always be there it's unclear why these buildbots are having problems. LTO is only interesting on the GPU and these tests take extra time anyway so I'm just going to disable them for now.	2022-07-21 16:47:41 -04:00
John Ericson	07b749800c	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in `d0e1c2a550` to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-07-21 19:04:00 +00:00
Johannes Doerfert	d150152615	[OpenMP] Introduce more fine-grained control over the thread state use We can help optimizations by making sure we use the team state whenever it is clear there is no thread state. To this end we introduce a new state flag (`state::HasThreadState`) and explicit control for the `state::ValueRAII` helpers, including a dedicated "assert equal". Differential Revision: https://reviews.llvm.org/D130113	2022-07-21 12:30:38 -05:00
Johannes Doerfert	7472b42b78	[OpenMP] Use Undef instead of null as pointer for inactive lanes Our conditional writes in the runtime look like this: ``` if (active) *ptr = value; ``` In the RAII we need to assign `ptr` which comes from a lookup call. If a thread that is not the main thread calls lookup with the intention to write the pointer, we'll create a new thread state. As such, we need to avoid calling lookup for inactive threads. We used to use `nullptr` as their `ptr` value but that can cause pessimistic reasoning. We now use `undef` instead. Differential Revision: https://reviews.llvm.org/D130114	2022-07-21 12:28:45 -05:00
Johannes Doerfert	a42361dc1c	[OpenMP] Expose the state in the header to allow non-lto optimizations We used to inline the `lookup` calls such that the runtime had "known" access offsets when it was shipped. With the new static library build it doesn't as the lookup is an indirection we cannot look through. This should help us optimize the code better until we can do LTO for the runtime again. Differential Revision: https://reviews.llvm.org/D130111	2022-07-21 12:28:44 -05:00
Joseph Huber	e01ce4e88a	[Libomptarget] Add checks for CUDA subarchitecture using new info This patch extends the `is_valid_binary` routine to also check if the binary's architecture string matches the one parsed from the runtime. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for CUDA. Depends on D127432 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D127505	2022-07-21 13:20:06 -04:00
Joseph Huber	fbcb1ee7f3	[Libomptarget] Add support for offloading binaries in libomptarget The previous patch changed the linker wrapper to embed the offloading binary format inside the target image instead. This will allow us to more generically bundle metadata with these images, such as requires clauses or the target architecture it was compiled for. I wasn't sure how to handle this best, so I introduced a new type that replaces the old `__tgt_device_image` struct that we can expand inside the runtime library. I made the new `__tgt_device_binary` struct pretty much the same for now. In the future we could change this struct to pretty much be the `OffloadBinary` class. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127432	2022-07-21 13:20:04 -04:00
Joseph Huber	5d8a76feb0	[Libomptarget] Build the device library even if the sm list is empty We previously had some logic that stopped us from building the device runtime if there were no NVPTX architectures provided. This is incorrect because we could have AMDGPU libraries. Even if the lists are empty we should be able to attempt to build these and get dummy output. This will make it much easier for our tooling which expects certain libraries. If the user wishes to disable the library entirely they should use `-DLIBOMPTARGET_BUILD_DEVICERTL_BCLIB=OFF`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130266	2022-07-21 10:57:47 -04:00
Joseph Huber	dc52712a06	[Libomptarget] Make libomptarget an LLVM library This patch makes libomptarget depend on LLVM libraries to be built. The reason for this is because we already have an implicit dependency on LLVM headers for ELF identification and extraction as well as an optional dependency on the LLVMSupport library for time tracing information. Furthermore, there are changes in the future that require using more LLVM libraries, and will heavily simplify some future code as well as open up the large amount of useful LLVM libraries to libomptarget. This will make "standalone" builds of `libomptarget` more difficult for vendors wishing to ship their own. This will require a sufficiently new version of LLVM to be installed on the system that should be picked up by the existing handling for the implicit headers. The things this patch changes are as follows: - `libomptarget.so` links against LLVMSupport and LLVMObject - `libomptarget.so` is a symbolic link to `libomptarget.so.15` - If using a shared library build, user applications will depend on LLVM libraries as well - We can now use LLVM resources in Libomptarget. Note that this patch only changes this to apply to libomptarget itself, not the plugins. Additional patches will be necessary for that. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D129875	2022-07-20 15:58:06 -04:00
Joseph Huber	b5b20164d2	Revert "[Libomptarget] Make libomptarget an LLVM library" This reverts commit `643dfd97d5`. This patch still makes the AMDGPU buildbots unhappy. Reverting for now until the AMD folks figure it out.	2022-07-20 10:18:55 -04:00
Joseph Huber	6b0db92bbd	[Libomptarget] Fix LTO command line in test Summary: The test passed -offload-lto instead of -foffload-lto.	2022-07-20 10:18:55 -04:00
Joseph Huber	643dfd97d5	[Libomptarget] Make libomptarget an LLVM library This patch makes libomptarget depend on LLVM libraries to be built. The reason for this is because we already have an implicit dependency on LLVM headers for ELF identification and extraction as well as an optional dependency on the LLVMSupport library for time tracing information. Furthermore, there are changes in the future that require using more LLVM libraries, and will heavily simplify some future code as well as open up the large amount of useful LLVM libraries to libomptarget. This will make "standalone" builds of `libomptarget` more difficult for vendors wishing to ship their own. This will require a sufficiently new version of LLVM to be installed on the system that should be picked up by the existing handling for the implicit headers. The things this patch changes are as follows: - `libomptarget.so` links against LLVMSupport and LLVMObject - `libomptarget.so` is a symbolic link to `libomptarget.so.15` - If using a shared library build, user applications will depend on LLVM libraries as well - We can now use LLVM resources in Libomptarget. Note that this patch only changes this to apply to libomptarget itself, not the plugins. Additional patches will be necessary for that. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D129875	2022-07-20 09:52:09 -04:00
Jonathan Peyton	40ce65b5b2	[OpenMP][libomp] Fix affinity warnings and unify under one macro Warnings that occur during affinity initialization are supposed to be guarded by KMP_AFFINITY=nowarnings,noverbose, but some had been missed by this logic. Create one macro for affinity warnings that takes these settings into account. Differential Revision: https://reviews.llvm.org/D125991	2022-07-19 13:10:25 -05:00
AndreyChurbanov	17dcde5f1b	[OpenMP][libomp] Allow reset affinity mask after parallel Added control to reset affinity of primary thread after outermost parallel region to initial affinity encountered before OpenMP runtime was initialized. KMP_AFFINITY environment variable reset/noreset modifier introduced. Default behavior is unchanged. Differential Revision: https://reviews.llvm.org/D125993	2022-07-19 13:05:05 -05:00
Jonathan Peyton	28c8da2965	[OpenMP][libomp] Fix fallthrough attribute detection for Intel compilers icc does not properly detect lack of fallthrough attribute since it defines __GNU__ > 7 and also icc's __has_cpp_attribute/__has_attribute feature detectors do not properly detect the lack of fallthrough attribute. Differential Revision: https://reviews.llvm.org/D126001	2022-07-19 13:04:25 -05:00
AndreyChurbanov	a01d274fbd	[OpenMP][libomp] Fix /dev/shm pollution after forked child process terminates Made library registration conditional and skip it in the __kmp_atfork_child handler, postponed it till middle initialization in the child. This fixes the problem of applications that use e.g. popen/pclose which terminate the forked child process. Differential Revision: https://reviews.llvm.org/D125996	2022-07-19 12:59:58 -05:00
Jon Chesterfield	e46f727b38	Revert "[Libomptarget] Make libomptarget an LLVM library" This reverts commit `70039be627`.	2022-07-19 17:59:45 +01:00
Joseph Huber	70039be627	[Libomptarget] Make libomptarget an LLVM library This patch makes libomptarget depend on LLVM libraries to be built. The reason for this is because we already have an implicit dependency on LLVM headers for ELF identification and extraction as well as an optional dependency on the LLVMSupport library for time tracing information. Furthermore, there are changes in the future that require using more LLVM libraries, and will heavily simplify some future code as well as open up the large amount of useful LLVM libraries to libomptarget. This will make "standalone" builds of `libomptarget` more difficult for vendors wishing to ship their own. This will require a sufficiently new version of LLVM to be installed on the system that should be picked up by the existing handling for the implicit headers. The things this patch changes are as follows: - `libomptarget.so` links against LLVMSupport and LLVMObject - `libomptarget.so` is a symbolic link to `libomptarget.so.15` - If using a shared library build, user applications will depend on LLVM libraries as well - We can now use LLVM resources in Libomptarget. Note that this patch only changes this to apply to libomptarget itself, not the plugins. Additional patches will be necessary for that. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D129875	2022-07-19 12:33:31 -04:00
Joseph Huber	cdea437057	[Libomptarget] Fix warnings on address space attributes The device runtime uses the address space attribute to control the placement of important constants on the GPU. The changes made in D126061 caused these to start emitting errors as they were not applied to the type. This patch fixes the issues to make the warnings go away. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D129896	2022-07-15 17:21:30 -04:00
Joseph Huber	1f940b69c3	[Libomptarget][NFC] Fix signed comparison warnings Summary: Non-functional change, just fixing some sign comparison warnings by making both match.	2022-07-15 13:22:55 -04:00
Shilei Tian	65ebcee197	[OpenMP] Ignore .eggs file in OpenMP The OMPD patches introduce a GDB plugin. When it is built, it will create a couple of temp files in `.eggs`. This patch adds it to `.gitignore` in case it messes up the git tracking. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D129711	2022-07-14 12:06:50 -04:00
Joseph Huber	b1d574867d	[Libomptarget] Allow static assert to work on 32-bit systems Summary: We use a static assert to make sure that someone doesn't change the size of an argument struct without properly updating all the other logic. This originally only checked the size on a 64-bit system with 8-byte pointers, causing builds on 32-bit systems to fail. This patch allows either pointer size to work. Fixes #56486	2022-07-12 08:05:01 -04:00
Vignesh Balasubramanian	9dc0d6aaa1	Fixing build bot failure due to python-pip unavailability. commit: `51d3f421f4` failed due to missing python-pip on the machine. Now the ompd gdb-plugin code will be skipped with a warning if pip is not available on the machine.	2022-07-12 16:01:59 +05:30
Vignesh Balasubramanian	51d3f421f4	[OpenMP][OMPD] GDB plugin code to leverage libompd to provide debugging support for OpenMP programs. This is 5th of 6 patches started from https://reviews.llvm.org/D100181 This plugin code, when loaded in gdb, adds a few commands like ompd icv, ompd bt, ompd parallel. These commands create an interface for GDB to read the OpenMP runtime through libompd. Reviewed By: @dreachem Differential Revision: https://reviews.llvm.org/D100185	2022-07-12 14:38:41 +05:30
Shilei Tian	e7d998e51e	[NFC][OpenMP][Offloading] Fix compilation warning caused by misuse of `static_cast`	2022-07-08 20:59:37 -04:00
Joseph Huber	269d5c16bc	[Libomptarget][NFC] Move legacy functions to a separate file This patch moves the old legacy interfaces into `libomptarget` to a separate file. These do not need to be included anywhere and are simply provided for backwards compatibility with the ABI. This cleans up the interface greatly. Depends on D128817 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D128818	2022-07-08 14:44:21 -04:00
Joseph Huber	c9353eb4bc	[Libomptarget] Use new tripcount argument in the runtime. The previous patch added an argument to the `__tgt_target_kernel` runtime function which includes the tripcount used for the loop clause. This was originally passed in via the `__kmpc_push_target_tripcount` function. Now we move this logic to the kernel launch itself and remove the need for the push function. Depends on D128816 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128817	2022-07-08 14:44:19 -04:00
Joseph Huber	ad23e4d85f	[Libomptarget] Implement a unified kernel entry function This patch implements a unified kernel entry function that will be targeted from both teams and non-teams clauses. We introduce a new interface and make the old functions call in using the new one. A following patch will include the necessary changes to Clang to call these new functions instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128549	2022-07-08 14:44:06 -04:00
Ye Luo	fca79b78c4	[libomptarget] compile DeviceRTL bc files with -O3 bc files of DeviceRTL are compiled with -O3, the same as the static library. Differential Revision: https://reviews.llvm.org/D129344	2022-07-08 10:00:26 -05:00
Vadim Paretsky	43d5c4d539	[OpenMP] add 4 custom APIs supporting MSVC OMP codegen This check-in adds 4 APIs to support MSVC, specifically: * 3 APIs (__kmpc_sections_init, __kmpc_next_section, __kmpc_end_sections) to support the dynamic scheduling of OMP sections. * 1 API (__kmpc_copyprivate_light, a light-weight version of __kmpc_copyprivate) to support the OMP single copyprivate clause. Differential Revision: https://reviews.llvm.org/D128403	2022-07-05 17:26:18 -05:00
Joseph Huber	d27d0a673c	[Libomptarget][NFC] Make Libomptarget use the LLVM naming convention Libomptarget grew out of a project that was originally not in LLVM. As we develop libomptarget this has led to an increasingly large clash between the naming conventions used. This patch fixes most of the variable names that did not conform to the LLVM standard, that is `VariableName` for variables and `functionName` for functions. This patch was primarily done using my editor's linting messages, if there are any issues I missed arising from the automation let me know. Reviewed By: saiislam Differential Revision: https://reviews.llvm.org/D128997	2022-07-05 14:53:38 -04:00
Shilei Tian	696bca9bb2	[NFC][OpenMP][CUDA] Remove unnecessary default label	2022-07-01 09:50:29 -04:00
Jose M Monsalve Diaz	616dd9ae14	[OpenMP] Implementing omp_get_device_num() This patch implements omp_get_device_num() in the host and the device. It uses the already existing getDeviceNum in the device config for the device. And in the host it uses the omp_get_num_devices(). Two simple tests added Differential Revision: https://reviews.llvm.org/D128347	2022-06-29 02:18:21 -05:00
Shilei Tian	2695e23ad9	[OpenMP][CUDA] Fix the issue that P2P memcpy doesn't work This patch fixes the issue that P2P memcpy doesn't work. The root cause is we didn't set current context when calling the API function. In addition, a matrix to track the states of each pair of devices is also added such that we only need to query and configure the device once. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D122764	2022-06-28 15:32:03 -04:00
Daniel Douglas	d4a7b8de52	[OpenMP][libomp] avoid spin wait and yield on arm64 macOS This patch changes the default behavior to avoid spin waiting and yielding. (See “Don’t Keep Threads Active And Idle” section here: https://developer.apple.com/documentation/apple-silicon/tuning-your-code-s-performance-for-apple-silicon) We verified using instruments traces that the changes improve scheduling behavior on macOS. We also collected results using EPCC schedbench (https://github.com/LangdalP/EPCC-OpenMP-micro-benchmarks) that are attached here that show a reduction in standard deviation and max test run time across all scheduling types. Static scheduling sees dramatic improvements with these changes, we see a 2-4x average runtime improvement in the benchmark. Differential Revision: https://reviews.llvm.org/D126510	2022-06-24 12:02:16 -05:00
Jonathan Peyton	b7b4986576	[OpenMP][libomp] Hold old __kmp_threads arrays until library shutdown When many nested teams are formed, __kmp_threads may be reallocated to accommodate new threads. This reallocation causes a data race when another existing team's thread simultaneously references __kmp_threads. This patch keeps the old thread arrays around until library shutdown so these lingering references can complete without issue and access to __kmp_threads remains a simple array reference. Fixes: https://github.com/llvm/llvm-project/issues/54708 Differential Revision: https://reviews.llvm.org/D125013	2022-06-22 10:30:35 -05:00
Joseph Huber	3351ae61d9	[Libomptarget] Remove duplicate data environment exit Summary: This patch removes a duplicated exit from the OpenMP data environment. We already have an RAII method that guards this environment so it is unnecessary.	2022-06-21 22:35:32 -04:00
Ye Luo	4d9499e8cc	[libomptarget] Make libomptarget.devicertl.a built in all cases. Make libomptarget.device.a built when using -DLLVM_ENABLE_PROJECTS=openmp Use add_custom_command. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D128130	2022-06-20 08:29:16 -05:00
Ye Luo	54b45afb59	[libomptarget]Add a trap for external omptarget from LLVM Old LLVM installation may expose its internal omptarget CMake target when being used by find_package(LLVM) and caused issues in the CMake of libomptarget that is being built. Trap the issue early. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D128129	2022-06-18 21:08:53 -05:00
Joseph Huber	d87ca519c9	[Libomptarget] Use binutils archive executable to address failing tests Summary: The static linking test ensures that we can statically link offloading programs. To create the test we used `llvm-ar`. However, this may not exist in the user's environment. This patch changes it to use the binutils `ar` which should exist on every system running these tests currently. In the future we should set up the dependencies properly.	2022-06-14 22:14:17 -04:00
Joseph Huber	d5d836635c	[Libomptarget] Add test config for compiling in LTO-mode We are planning on making LTO the default compilation mode for offloading. In order to make sure it works we should run these tests on the test suite. AMDGPU already uses the LTO compilation path for its linking, but in LTO mode it also links the static library late. Performing LTO requires the static library to be built, if we make the change this will be a hard requirement and the old bitcode library will go away. This means users will need to use either a two-step build or a runtimes build for libomptarget. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127512	2022-06-14 10:16:03 -04:00
John Ericson	0bb317b7bf	Revert "[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore" This reverts commit `d5daa5c5b0`.	2022-06-10 19:26:12 +00:00
John Ericson	d5daa5c5b0	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in `d0e1c2a550` to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-06-10 14:35:18 +00:00
Yuki Okushi	074f12e467	[OpenMP] Fix the build on Windows The code expanded from kmp_barrier.h uses some `KMP_INTERNAL_*`s, so the definitions have to be placed before it. Fixes #55815 Differential Revision: https://reviews.llvm.org/D126873	2022-06-09 22:12:42 +09:00
Jose Manuel Monsalve Diaz	15ed5c0a07	[LIBOMPTARGET] Adding AMD to llvm-omp-device-info Adding device information print for AMD devices on the `llvm-omp-device-info` command line tool. The output is inspired by the rocminfo command line tool. This commit adds missing HSA functions, enums and structs needed to query additional information from the HSA agents. A generic message for the `generic-elf-64bit` plugin is also added Example of an output: ``` llvm-omp-device-info Device (0): This is a generic-elf-64bit device Device (1): This is a generic-elf-64bit device Device (2): This is a generic-elf-64bit device Device (3): This is a generic-elf-64bit device Device (4): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 0 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (5): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 1 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max 
Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (6): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 2 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (7): HSA Runtime Version: 1.1 HSA OpenMP 
Device Number: 3 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE ``` Differential Revision: https://reviews.llvm.org/D126836	2022-06-09 11:58:39 +00:00
Jose Manuel Monsalve Diaz	84e020a061	Revert "[LIBOMPTARGET] Adding AMD to llvm-omp-device-info" This reverts commit `d16a0877d8`.	2022-06-09 10:46:03 +00:00
Jose Manuel Monsalve Diaz	d16a0877d8	[LIBOMPTARGET] Adding AMD to llvm-omp-device-info Adding device information print for AMD devices on the `llvm-omp-device-info` command line tool. The output is inspired by the rocminfo command line tool. This commit adds missing HSA functions, enums and structs needed to query additional information from the HSA agents. A generic message for the `generic-elf-64bit` plugin is also added Example of an output: ``` llvm-omp-device-info Device (0): This is a generic-elf-64bit device Device (1): This is a generic-elf-64bit device Device (2): This is a generic-elf-64bit device Device (3): This is a generic-elf-64bit device Device (4): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 0 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (5): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 1 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max 
Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (6): HSA Runtime Version: 1.1 HSA OpenMP Device Number: 2 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE Device (7): HSA Runtime Version: 1.1 HSA OpenMP 
Device Number: 3 Device Name: gfx906 Vendor Name: AMD Device Type: GPU Max Queues: 128 Queue Min Size: 64 Queue Max Size: 131072 Cache: L0: 16384 bytes L1: 8388608 bytes Cacheline Size: 64 Max Clock Freq(MHz): 1725 Compute Units: 60 SIMD per CU: 4 Fast F16 Operation: TRUE Wavefront Size: 64 Workgroup Max Size: 1024 Workgroup Max Size per Dimension: x: 1024 y: 1024 z: 1024 Max Waves Per CU: 40 Max Work-item Per CU: 2560 Grid Max Size: 4294967295 Grid Max Size per Dimension: x: 4294967295 y: 4294967295 z: 4294967295 Max fbarriers/Workgrp: 32 Memory Pools: Pool GLOBAL; FLAGS: COARSE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GLOBAL; FLAGS: FINE GRAINED, : Size: 34342961152 bytes Allocatable: TRUE Runtime Alloc Granule: 4096 bytes Runtime Alloc alignment: 4096 bytes Accessable by all: FALSE Pool GROUP: Size: 65536 bytes Allocatable: FALSE Runtime Alloc Granule: 0 bytes Runtime Alloc alignment: 0 bytes Accessable by all: FALSE ``` Differential Revision: https://reviews.llvm.org/D126836	2022-06-08 16:31:12 +00:00
Joseph Huber	86a4c78047	[Libomptarget] Add missing include to define `printf` Summary: This test was failing because of an implicit declaration of `printf` which isn't legal with newer C, causing it to fail. This patch just adds the necessary header.	2022-06-08 09:56:51 -04:00
Joseph Huber	421b1f55c6	[Libomptarget] Do not use retaining attributes for the static library When we build the libomptarget device runtime library targeting bitcode, we need special care to make sure that certain functions are not optimized out. This is because we manually internalize and optimize these definitions, ignoring their standard linkage semantics. When we build with the static library, we can maintain these semantics and we do not need these to be kept-alive. Furthermore, if they are kept-alive it prevents them from being removed during LTO. This prevents us from completely internalizing `IsSPMDMode` and removing several other functions. This patch removes these for the static library target by using a macro definition to enable them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D126701	2022-06-07 12:16:34 -04:00
Vadim Paretsky	f58fe2e186	[OpenMP] allow loc to be NULL in __kmp_determine_reduction_method for MSVC MSVC may not supply source location information to kmpc_reduce passing NULL for the value. The patch adds a check for the loc value being NULL in kmp_determine_reduction_method. Differential Revision: https://reviews.llvm.org/D126564	2022-06-03 14:11:39 -05:00
Daniel Douglas	5d25dbff67	[OpenMP][libomp] do not try to dlopen libmemkind on macOS The memkind library is only available for linux. Calling dlopen here can also be problematic in a client app that fork'ed. Differential Revision: https://reviews.llvm.org/D126579	2022-06-02 14:28:09 -05:00
David CARLIER	2ba5d820e2	[OpenMP] omp_get_proc_id uses sched_getcpu fallback on FreeBSD 13.1 and above. Reviewers: jlpeyton, jdoerfert Reviewed-By: jlpeyton Differential-Revision: https://reviews.llvm.org/D126408	2022-06-02 17:10:29 +01:00

2498 Commits