llvm-project

Commit Graph

Author	SHA1	Message	Date
Alex Richardson	a602f76a24	[clang][TargetInfo] Use LangAS for getPointer{Width,Align}() Mixing LLVM and Clang address spaces can result in subtle bugs, and there is no need for this hook to use the LLVM IR level address spaces. Most of this change is just replacing zero with LangAS::Default, but it also allows us to remove a few calls to getTargetAddressSpace(). This also removes a stale comment+workaround in CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does return the expected size for ReferenceType (and handles address spaces). Differential Revision: https://reviews.llvm.org/D138295	2022-11-30 20:24:01 +00:00
Artem Belevich	0e8a414ab3	[CUDA, NVPTX] Added basic __bf16 support for NVPTX. Recent Clang changes expose _bf16 types for SSE2-enabled host compilations and that makes those types visible furing GPU-side compilation, where it currently fails with Sema complaining that __bf16 is not supported. Considering that __bf16 is a storage-only type, enabling it for NVPTX if it's enabled on the host should pose no issues, correctness-wise. Recent NVIDIA GPUs have introduced bf16 support, so we'll likely grow better support for __bf16 on NVPTX going forward. Differential Revision: https://reviews.llvm.org/D136311	2022-10-25 11:08:06 -07:00
Artem Belevich	9a01cca660	Add support for CUDA-11.8 and sm_{87,89,90} GPUs. Differential Revision: https://reviews.llvm.org/D135306	2022-10-07 13:59:28 -07:00
Artem Belevich	f3a2cbcf97	Refactored CUDA version housekeeping to use less boilerplate. Differential Revision: https://reviews.llvm.org/D135328	2022-10-07 13:59:23 -07:00
Joseph Huber	002a63f937	[OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP Currently we define the `__CUDA_ARCH__` macro only in CUDA mode. This patch allows us to use this macro in OpenMP-offloading mode when targeting NVPTX. Reviewed By: tra, tianshilei1992 Differential Revision: https://reviews.llvm.org/D125256	2022-05-13 14:38:35 -04:00
Joe Nash	8bdfc73f63	[AMDGPU][clang] Definition of gfx11 subtarget Contributors: Jay Foad <jay.foad@amd.com> Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> Patch 2/N for upstreaming of AMDGPU gfx11 architecture Depends on D124536 Reviewed By: foad, kzhuravl, #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D124537	2022-04-29 13:55:56 -04:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin	2e2e64df4a	[AMDGPU] Add gfx940 target This is target definition only. Differential Revision: https://reviews.llvm.org/D120688	2022-03-02 13:54:48 -08:00
Yaxun (Sam) Liu	a6786cdd57	[HIPSPV][3/4] Enable SPIR-V emission for HIP This patch enables SPIR-V binary emission for HIP device code via the HIPSPV tool chain. ‘--offload’ option, which is envisioned in [1], is added for specifying offload targets. This option is used to override default device target (amdgcn-amd-amdhsa) for HIP compilation for emitting device code as SPIR-V binary. The option is handled in getHIPOffloadTargetTriple(). getOffloadingDeviceToolChain() function (based on the design in the SYCL repository) is added to select HIPSPVToolChain when HIP offload target is ‘spirv64’. The HIPActionBuilder is modified to produce LLVM IR at the backend phase. HIPSPV tool chain expects to receive HIP device code as LLVM IR so it can run external LLVM passes over them. HIPSPV TC is also responsible for emitting the SPIR-V binary. A Cuda GPU architecture ‘generic’ is added. The name is picked from the LLVM SPIR-V Backend. In the HIPSPV code path the architecture name is inserted to the bundle entry ID as target ID. Target ID is expected to be always present so a component in the target triple is not mistaken as target ID. Tests are added for checking the HIPSPV tool chain. [1]: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html Patch by: Henry Linjamäki Reviewed by: Yaxun Liu, Artem Belevich, Alexey Bader Differential Revision: https://reviews.llvm.org/D110622	2021-12-20 10:45:09 -05:00
Carlos Galvez	7ecec3f0f5	[CUDA] Bump supported CUDA version to 11.5 Differential Revision: https://reviews.llvm.org/D113249	2021-11-09 08:20:53 +00:00
Artem Belevich	49d982d8cb	[CUDA] Add support for CUDA-11.4 Differential Revision: https://reviews.llvm.org/D108239	2021-08-23 13:24:46 -07:00
Jon Chesterfield	c2574e63ff	[openmp][nfc] Refactor GridValues Remove redundant fields and replace pointer with virtual function Of fourteen fields, three are dead and four can be computed from the remainder. This leaves a couple of currently dead fields in place as they are expected to be used from the deviceRTL shortly. Two of the fields that can be computed are only used from codegen and require a log2() implementation so are inlined into codegen instead. This change leaves the new methods in the same location in the struct as the previous fields for convenience at review. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108380	2021-08-23 16:19:11 +01:00
Jon Chesterfield	b1efeface7	Revert "[openmp][nfc] Refactor GridValues" Failed a nvptx codegen test This reverts commit `2a47a84b40`.	2021-08-20 18:17:27 +01:00
Jon Chesterfield	2a47a84b40	[openmp][nfc] Refactor GridValues Remove redundant fields and replace pointer with virtual function Of fourteen fields, three are dead and four can be computed from the remainder. This leaves a couple of currently dead fields in place as they are expected to be used from the deviceRTL shortly. Two of the fields that can be computed are only used from codegen and require a log2() implementation so are inlined into codegen instead. This change leaves the new methods in the same location in the struct as the previous fields for convenience at review. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108380	2021-08-20 16:41:26 +01:00
Jon Chesterfield	77579b99e9	[openmp][nfc] Replace OMPGridValues array with struct [nfc] Replaces enum indices into an array with a struct. Named the fields to match the enum, leaves memory layout and initialization unchanged. Motivation is to later safely remove dead fields and replace redundant ones with (compile time) computation. It should also be possible to factor some common fields into a base and introduce a gfx10 amdgpu instance with less duplication than the arrays of integers require. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D108339	2021-08-19 13:25:42 +01:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit `211e584fa2`. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit `ea10a86984`. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Brendon Cahoon	ea10a86984	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
Artem Belevich	2aa01ccec3	[CUDA, NVPTX] Allow targeting sm_86 GPUs. The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o adding any new functionality. Differential Revision: https://reviews.llvm.org/D95974	2021-02-09 11:01:10 -08:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Stanislav Mekhanoshin	d1beb95d12	[AMDGPU] gfx1032 target Differential Revision: https://reviews.llvm.org/D89487	2020-10-15 12:41:18 -07:00
Tim Renouf	666ef0db20	[AMDGPU] Add gfx602, gfx705, gfx805 targets At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets: * The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602. * One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705. * The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target. V2: Add clang changes. Put TargetParser list in order. V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order. Differential Revision: https://reviews.llvm.org/D88916 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-10-10 17:22:22 +01:00
Yaxun (Sam) Liu	cbd420c5ed	[CUDA][HIP] Fix bound arch for offload action for fat binary Currently CUDA/HIP toolchain uses "unknown" as bound arch for offload action for fat binary. This causes -mcpu or -march with "unknown" added in HIPToolChain::TranslateArgs or CUDAToolChain::TranslateArgs. This causes issue for https://reviews.llvm.org/D88377 since HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs. The bound arch of offload action for fat binary is not really used, therefore set it to CudaArch::UNUSED. Differential Revision: https://reviews.llvm.org/D88524	2020-10-02 19:05:51 -04:00
Stanislav Mekhanoshin	ea7d0e2996	[AMDGPU] gfx1031 target Differential Revision: https://reviews.llvm.org/D85337	2020-08-05 12:36:26 -07:00
Stanislav Mekhanoshin	9ee272f13d	[AMDGPU] Add gfx1030 target Differential Revision: https://reviews.llvm.org/D81886	2020-06-15 16:18:05 -07:00
Saiyedul Islam	4022bc2a6c	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 2 Summary: New file include to support platform dependent grid constants. It will be used by clang, libomptarget plugins, and deviceRTLs to access constant values consistently and with fast access in the deviceRTLs. Originally authored by Greg Rodgers (@gregrodgers). Reviewers: arsenm, sameerds, jdoerfert, yaxunl, b-sumner, scchan, JonChesterfield Reviewed By: arsenm Subscribers: llvm-commits, pdhaliwal, jholewinski, jvesely, wdng, nhaehnle, guansong, kerbowa, sstefan1, cfe-commits, ronlieb, gregrodgers Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80917	2020-06-10 18:09:59 +00:00
Artem Belevich	a9627b7ea7	[CUDA] Add partial support for recent CUDA versions. Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11. None of the new features of CUDA-10.2+ have been implemented yet, so using these versions will still produce a warning. Differential Revision: https://reviews.llvm.org/D77670	2020-04-08 11:19:44 -07:00
Yaxun Liu	6add24adaf	[HIP] Add GPU arch gfx1010, gfx1011, and gfx1012 Differential Revision: https://reviews.llvm.org/D64364 llvm-svn: 365799	2019-07-11 17:50:09 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Tom Tan	b7c6d95af5	[COFF, ARM64] Align global symbol by size for ARM64 MSVC ABI According to alignment section in below ARM64 ABI document, MSVC could increase alignment of global data based on its total size. Clang doesn't do this. Compile the same symbol into different alignments by Clang and MSVC could cause link error because some instruction encodings, like 64-bit LDR/STR with immediate, require the target to be 8 bytes aligned, and linker could choose code stream with such LDR/STR instruction from MSVC and 4 bytes aligned data from Clang into final image, which actually cannot be linked together (see https://bugs.llvm.org/show_bug.cgi?id=41506 for more details). https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#alignment Differential Revision: https://reviews.llvm.org/D61225 llvm-svn: 359744	2019-05-02 00:38:14 +00:00
Artem Belevich	5fe85a003f	[CUDA] Implemented _[bi]mma* builtins. These builtins provide access to the new integer and sub-integer variants of MMA (matrix multiply-accumulate) instructions provided by CUDA-10.x on sm_75 (AKA Turing) GPUs. Also added a feature for PTX 6.4. While Clang/LLVM does not generate any PTX instructions that need it, we still need to pass it through to ptxas in order to be able to compile code that uses the new 'mma' instruction as inline assembly (e.g used by NVIDIA's CUTLASS library https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101) Differential Revision: https://reviews.llvm.org/D60279 llvm-svn: 359248	2019-04-25 22:28:09 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Tim Renouf	632f35d495	Add gfx909 to GPU Arch Subscribers: jholewinski, cfe-commits Differential Revision: https://reviews.llvm.org/D53558 llvm-svn: 345198	2018-10-24 21:19:02 +00:00
Yaxun Liu	83b5f35d85	Add gfx904 and gfx906 to GPU Arch Differential Revision: https://reviews.llvm.org/D53472 llvm-svn: 344996	2018-10-23 02:05:31 +00:00
Artem Belevich	44ecb0e3c2	[CUDA] Added basic support for compiling with CUDA-10.0 llvm-svn: 342924	2018-09-24 23:10:44 +00:00
Artem Belevich	679dafe69e	[CUDA] Added -f[no-]cuda-short-ptr option The option enables use of 32-bit pointers for accessing const/local/shared memory. The feature is disabled by default. Differential Revision: https://reviews.llvm.org/D46148 llvm-svn: 331938	2018-05-09 23:10:09 +00:00
Artem Belevich	2f8efcf3ca	[NVPTX] Removed 'satom' feature which is no longer used. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830	2018-04-11 17:51:33 +00:00
Artem Belevich	24e8a680e5	[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829	2018-04-11 17:51:19 +00:00
Yaxun Liu	bec8a66454	[CUDA] Revert defining __CUDA_ARCH__ for amdgcn targets amdgcn targets only support HIP, which does not define __CUDA_ARCH__. this is a partial unroll of r329232 / D45277. Differential Revision: https://reviews.llvm.org/D45387 llvm-svn: 329584	2018-04-09 15:43:01 +00:00
Yaxun Liu	8a5fc15aa4	[CUDA] Add amdgpu sub archs Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45277 llvm-svn: 329232	2018-04-04 21:19:27 +00:00
Erich Keane	d45879d8ad	Add NVPTX Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 This patch adds NVPTX support for enabling the march notes. Differential Revision: https://reviews.llvm.org/D43045 llvm-svn: 324675	2018-02-08 23:16:00 +00:00
Artem Belevich	fbc56a904f	[CUDA] Added partial support for CUDA-9.1 Clang can use CUDA-9.1 now, though new APIs (are not implemented yet. The major change is that headers in CUDA-9.1 went through substantial changes that started in CUDA-9.0 which required substantial changes in the cuda compatibility headers provided by clang. There are two major issues: * CUDA SDK no longer provides declarations for libdevice functions. * A lot of device-side functions have become nvcc's builtins and CUDA headers no longer contain their implementations. This patch changes the way CUDA headers are handled if we compile with CUDA 9.x. Both 9.0 and 9.1 are affected. * Clang provides its own declarations of libdevice functions. * For CUDA-9.x clang now provides implementation of device-side 'standard library' functions using libdevice. This patch should not affect compilation with CUDA-8. There may be some observable differences for CUDA-9.0, though they are not expected to affect functionality. Tested: CUDA test-suite tests for all supported combinations of: CUDA: 7.0,7.5,8.0,9.0,9.1 GPU: sm_20, sm_35, sm_60, sm_70 Differential Revision: https://reviews.llvm.org/D42513 llvm-svn: 323713	2018-01-30 00:00:12 +00:00
Jonas Hahnfeld	87d4426988	[OpenMP] Show error if VLAs are not supported Some target devices (e.g. Nvidia GPUs) don't support dynamic stack allocation and hence no VLAs. Print errors with description instead of failing in the backend or generating code that doesn't work. This patch handles explicit uses of VLAs (local variable in target or declare target region) or implicitly generated (private) VLAs for reductions on VLAs or on array sections with non-constant size. Differential Revision: https://reviews.llvm.org/D39505 llvm-svn: 318601	2017-11-18 21:00:46 +00:00
Artem Belevich	8af4e23d1e	[CUDA] Added rudimentary support for CUDA-9 and sm_70. For now CUDA-9 is not included in the list of CUDA versions clang searches for, so the path to CUDA-9 must be explicitly passed via --cuda-path=. On LLVM side NVPTX added sm_70 GPU type which bumps required PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment. Differential Revision: https://reviews.llvm.org/D37576 llvm-svn: 312734	2017-09-07 18:14:32 +00:00
Erich Keane	ebba592682	Break up Targets.cpp into a header/impl pair per target type[NFCI] Targets.cpp is getting unwieldy, and even minor changes cause the entire thing to cause recompilation for everyone. This patch bites the bullet and breaks it up into a number of files. I tended to keep function definitions in the class declaration unless it caused additional includes to be necessary. In those cases, I pulled it over into the .cpp file. Content is copy/paste for the most part, besides includes/format/etc. Differential Revision: https://reviews.llvm.org/D35701 llvm-svn: 308791	2017-07-21 22:37:03 +00:00

49 Commits