33 Commits

Author SHA1 Message Date
Yaxun Liu c912e867c0 [HIP] Make __hip_gpubin_handle hidden to avoid being merged across different shared libraries
Different shared libraries contain different fat binaries, each stored in a global variable
__hip_gpubin_handle. Since different compilation units share the same fat binary, this
variable has linkonce linkage. However, it should not be merged across different shared
libraries.

This patch sets the visibility of the global variable to hidden, which makes it invisible
outside the shared library and therefore prevents it from being merged.
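
For illustration only, the effect described above can be approximated at source level. This is a sketch, not the code Clang emits: `demo_gpubin_handle` and `demo_claim_handle` are hypothetical names, the `weak` attribute is used as a rough stand-in for linkonce linkage, and the attributes are GCC/Clang extensions.

```cpp
#include <cassert>

// Hypothetical stand-in for __hip_gpubin_handle.
// `weak` lets duplicate definitions from several translation units collapse
// into one (an approximation of linkonce), while hidden visibility keeps the
// symbol out of the library's exported symbols, so each shared library keeps
// its own copy.
__attribute__((weak, visibility("hidden"))) void *demo_gpubin_handle = nullptr;

// Stores `value` in the handle only if the handle is still empty.
// Returns true if this call set the handle.
bool demo_claim_handle(void *value) {
  assert(value != nullptr);
  if (demo_gpubin_handle != nullptr)
    return false;
  demo_gpubin_handle = value;
  return true;
}
```

With default visibility, the dynamic linker could resolve every library's reference to a single definition; with hidden visibility, each library reads and writes only its own handle.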

Differential Revision: https://reviews.llvm.org/D50596


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@340056 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-17 17:47:31 +00:00
Yaxun Liu 67428b72ed [HIP] Register/unregister device fat binary only once
HIP generates one fat binary for all devices after linking. However, for each compilation
unit a ctor function is emitted which registers the same fat binary. Measures need to be
taken to make sure the fat binary is only registered once.

Currently each ctor function calls __hipRegisterFatBinary and stores the returned value
to __hip_gpubin_handle. This patch changes the linkage of __hip_gpubin_handle to linkonce
so that it is shared between LLVM modules. The patch then adds a check of the value of
__hip_gpubin_handle to make sure __hipRegisterFatBinary is only called once. The code
is equivalent to

void *__hip_gpubin_handle;
void ctor() {
  if (__hip_gpubin_handle == 0) {
    __hip_gpubin_handle = __hipRegisterFatBinary(...);
  }
  // register kernels and variables.
}
The patch also makes a similar change to dtors so that __hipUnregisterFatBinary
is called once.
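
The guarded ctor/dtor pattern can be exercised with a small self-contained mock. This is a sketch under assumptions: `mockRegisterFatBinary`, `mockUnregisterFatBinary`, `mock_gpubin_handle` and `simulate_units` are hypothetical stand-ins, and the dtor guard (unregister, then clear the handle) is one plausible reading of "similar change", not necessarily what the patch emits.

```cpp
#include <cassert>

// Hypothetical stand-ins for the HIP runtime entry points.
static int register_calls = 0;
static int unregister_calls = 0;
static int fake_fatbin = 0;

void *mockRegisterFatBinary() {
  ++register_calls;
  return &fake_fatbin;
}

void mockUnregisterFatBinary(void *handle) {
  assert(handle == &fake_fatbin);
  ++unregister_calls;
}

// One handle shared by all compilation units (stand-in for __hip_gpubin_handle).
void *mock_gpubin_handle = nullptr;

// Emitted once per compilation unit: registers the fat binary only if no
// earlier ctor has done so.
void module_ctor() {
  if (mock_gpubin_handle == nullptr)
    mock_gpubin_handle = mockRegisterFatBinary();
  // Per-unit kernel and variable registration would follow here.
}

// Emitted once per compilation unit: unregisters only while the handle is set.
void module_dtor() {
  if (mock_gpubin_handle != nullptr) {
    mockUnregisterFatBinary(mock_gpubin_handle);
    mock_gpubin_handle = nullptr;
  }
}

// Runs the ctor and dtor of `n` simulated compilation units and returns the
// total number of registration calls made so far.
int simulate_units(int n) {
  for (int i = 0; i < n; ++i)
    module_ctor();
  for (int i = 0; i < n; ++i)
    module_dtor();
  return register_calls;
}
```

Calling `simulate_units(3)` on a fresh program performs a single registration and a single unregistration, however many units take part.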

Differential Revision: https://reviews.llvm.org/D49083


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@337631 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 22:45:24 +00:00
Artem Belevich 9c9dd21ad7 [CUDA] Place all CUDA sections in __NV_CUDA segment on Mac.
That's where CUDA binaries appear to put them.

Differential Revision: https://reviews.llvm.org/D48615

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@335880 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 17:15:52 +00:00
Artem Belevich 9185e52f3e [CUDA] Use atexit() to call module destructor.
This matches the way NVCC does it. Doing module cleanup at global
destructor phase used to work, but is, apparently, too late for
the CUDA runtime in CUDA-9.2, which ends up crashing with double-free.
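
A minimal sketch of the registration pattern, assuming hypothetical `module_ctor` and `module_dtor` functions in place of the generated ones: the constructor hands the cleanup function to `std::atexit` instead of relying on a global destructor.

```cpp
#include <cassert>
#include <cstdlib>

static bool module_registered = false;

// Hypothetical module destructor; the generated code would unregister the
// GPU binary with the runtime here.
static void module_dtor() {
  module_registered = false;
}

// Hypothetical module constructor. It schedules the destructor through
// atexit and returns true when the handler was accepted.
bool module_ctor() {
  assert(!module_registered && "constructor is expected to run once");
  module_registered = true;
  return std::atexit(module_dtor) == 0;
}
```

Handlers registered with `std::atexit` run at normal program termination, in reverse order of registration.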

Differential Revision: https://reviews.llvm.org/D48613

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@335763 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 18:32:51 +00:00
Jonas Hahnfeld dcdd53793e [CUDA] Fix emission of constant strings in sections
CGM.GetAddrOfConstantCString() sets the address of the created GlobalValue
to unnamed. When emitting the object file LLVM will mark the surrounding
section as SHF_MERGE iff the string is nul-terminated and contains no
other nuls (see IsNullTerminatedString). This results in problems when
saving temporaries because LLVM doesn't set an EntrySize, so reading in
the serialized assembly file fails.
This never happened for the GPU binaries because they usually contain
a nul-character somewhere. Instead this only affected the module ID
when compiling relocatable device code.

However, this points to a potentially larger problem: If we put a
constant string into a named section, we really want the data to end
up in that section in the object file. To avoid LLVM merging sections
this patch unmarks the GlobalVariable's address as unnamed which also
fixes the problem of invalid serialized assembly files when saving
temporaries.

Differential Revision: https://reviews.llvm.org/D47902

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@334281 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-08 11:17:08 +00:00
Yaxun Liu 69f63a0cc2 [HIP] Support offloading by linker script
To support linking device code in different source files, it is necessary to
embed the fat binary at the host linking stage.

This patch emits an external symbol for the fat binary in host codegen, then
embeds the fat binary with lld through a linker script.

Differential Revision: https://reviews.llvm.org/D46472


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@332724 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-18 15:07:56 +00:00
Yaxun Liu 21ec9544a5 [HIP] Add hip input kind and codegen for kernel launching
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a HIP program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a vendor-neutral host API which can be implemented on different platforms. Currently there is an open-source
implementation of the HIP runtime on the amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).

This patch adds support for the hip input kind and language standard.

When a hip file is compiled, both LangOpts.CUDA and LangOpts.HIP are turned on. This allows a HIP program to be compiled as CUDA
in most cases; LangOpts.HIP is checked only where special handling of HIP programs is needed.

This patch also adds support for kernel launches in HIP programs using the HIP host API.

When -x hip is not specified, there is no behaviour change for CUDA.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Differential Revision: https://reviews.llvm.org/D44984


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@330790 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-25 01:10:37 +00:00
Jonas Hahnfeld d2a426651d [CUDA] Register relocatable GPU binaries
nvcc generates a unique registration function for each object file
that contains relocatable device code. Unique names are achieved
with a module id that is also reflected in the function's name.

Differential Revision: https://reviews.llvm.org/D42922

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@330425 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-20 13:04:45 +00:00
Jonas Hahnfeld f828172bcf [CUDA] Include single GPU binary, NFCI.
Binaries for multiple architectures are combined by fatbinary,
so the current code was effectively not needed.

Differential Revision: https://reviews.llvm.org/D43461

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@326342 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-28 17:53:46 +00:00
Serge Guelton b0c092f298 Suppress all uses of LLVM_END_WITH_NULL. NFC.
Use variadic templates instead of relying on <cstdarg> + sentinel.

This enforces better type checking and makes code more readable.
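
The difference between the two styles can be sketched as follows. The functions `collect_sentinel`, `collect_variadic` and `demo_collect` are hypothetical illustrations and are not taken from the LLVM code base.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <string>
#include <vector>

// Old style: C varargs terminated by a null sentinel. The compiler cannot
// check argument types, and forgetting the sentinel is undefined behaviour.
std::vector<std::string> collect_sentinel(const char *first, ...) {
  std::vector<std::string> out;
  va_list ap;
  va_start(ap, first);
  for (const char *s = first; s != nullptr; s = va_arg(ap, const char *))
    out.push_back(s);
  va_end(ap);
  return out;
}

// New style: a variadic template. Every argument must be convertible to
// std::string, which is checked at compile time, and no sentinel is needed.
template <typename... Ts>
std::vector<std::string> collect_variadic(const Ts &... names) {
  return std::vector<std::string>{std::string(names)...};
}

// Builds the same list both ways and returns its length.
std::size_t demo_collect() {
  std::vector<std::string> old_style =
      collect_sentinel("kernel", "stream", static_cast<const char *>(nullptr));
  std::vector<std::string> new_style = collect_variadic("kernel", "stream");
  assert(old_style == new_style);
  return new_style.size();
}
```

Passing a non-string argument such as `collect_variadic("kernel", 42)` is rejected at compile time, whereas the sentinel version would accept it and misbehave at run time.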

Differential revision: https://reviews.llvm.org/D32550


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@302572 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-09 19:31:30 +00:00
John McCall cf2a38c652 Promote ConstantInitBuilder to be a public CodeGen API; it's
a generally useful utility for other frontends.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@296806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 20:04:19 +00:00
John McCall d01a625cd8 ConstantBuilder -> ConstantInitBuilder for clarity, and
move the member classes up to top level to allow forward
declarations to name them.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@288079 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-28 22:18:27 +00:00
John McCall 8a6ea81342 Introduce a helper class for building complex constant initializers. NFC.
I've adopted this in most of the places it makes sense, but v-tables
and CGObjCMac will need a second pass.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@287437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 08:17:24 +00:00
Justin Lebar 852cb69b52 [CUDA] Use the right section and constant names for fatbins when compiling for macos.
Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D26777

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@287287 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 00:41:31 +00:00
Artem Belevich 647e473998 [CUDA] Place GPU binary into .nv_fatbin section and align it by 8.
This matches the way nvcc encapsulates GPU binaries into host object file.
Now cuobjdump can deal with clang-compiled object files.

Differential Revision: https://reviews.llvm.org/D23429

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@278549 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-12 18:44:01 +00:00
Justin Lebar 721ebe8d5e [CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment.
Summary:
Before this patch, we computed the offsets in memory of args passed to
GPU kernel functions by throwing all of the args into an LLVM struct.

clang emits packed llvm structs basically whenever it feels like it, and
packed structs have alignment 1.  So we cannot rely on the llvm type's
alignment matching the C++ type's alignment.

This patch fixes our codegen so we always respect the clang types'
alignments.
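
The offset computation can be sketched independently of Clang. In this illustration `ArgLayout`, `align_to` and `arg_offsets` are hypothetical names: each argument's offset is rounded up to the alignment of its source-level type rather than read from a (possibly packed) struct layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size and alignment of one kernel argument, as given by the source-level type.
struct ArgLayout {
  std::size_t size;
  std::size_t align;
};

// Rounds `offset` up to the next multiple of `align` (a power of two).
std::size_t align_to(std::size_t offset, std::size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

// Returns the byte offset of every argument in the launch buffer, honouring
// each argument's own alignment.
std::vector<std::size_t> arg_offsets(const std::vector<ArgLayout> &args) {
  std::vector<std::size_t> offsets;
  std::size_t offset = 0;
  for (const ArgLayout &a : args) {
    offset = align_to(offset, a.align);
    offsets.push_back(offset);
    offset += a.size;
  }
  return offsets;
}
```

For arguments of sizes 1, 8 and 4 with alignments 1, 8 and 4, this gives offsets 0, 8 and 16; a packed layout with alignment 1 would instead place them at 0, 1 and 9.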

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D22879

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@276927 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-27 22:36:21 +00:00
Benjamin Kramer c8a4b126e8 [CUDA] Move argument type lists to the stack. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@274433 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-02 12:03:57 +00:00
Benjamin Kramer 259294aa92 Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@274432 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-02 11:41:41 +00:00
Artem Belevich 0e064caaff [CUDA] Do not generate unnecessary runtime init code.
Differential Revision: http://reviews.llvm.org/D17780

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@262499 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 18:28:53 +00:00
Artem Belevich 3ee234f49d [CUDA] Emit host-side 'shadows' for device-side global variables
... and register them with CUDA runtime.

This is needed for commonly used cudaMemcpy*() APIs that use the address of
a host-side shadow to access its counterpart on the device side.

Fixes PR26340

Differential Revision: http://reviews.llvm.org/D17779

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@262498 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 18:28:50 +00:00
Justin Lebar 5001ed56f2 [CUDA] Invoke ptxas and fatbinary during compilation.
Summary:
Previously we compiled CUDA device code to PTX assembly and embedded
that asm as text in our host binary.  Now we compile to PTX assembly and
then invoke ptxas to assemble the PTX into a cubin file.  We gather the
ptx and cubin files for each of our --cuda-gpu-archs and combine them
using fatbinary, and then embed that into the host binary.

Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary,
which pass args down to the external tools.

Reviewers: tra, echristo

Subscribers: cfe-commits, jhen

Differential Revision: http://reviews.llvm.org/D16082

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@257809 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-14 21:41:27 +00:00
John McCall f4ddf94ecb Compute and preserve alignment more faithfully in IR-generation.
Introduce an Address type to bundle a pointer value with an
alignment.  Introduce APIs on CGBuilderTy to work with Address
values.  Change core APIs on CGF/CGM to traffic in Address where
appropriate.  Require alignments to be non-zero.  Update a ton
of code to compute and propagate alignment information.

As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.

The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned.  Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay.  I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.

Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do an alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.

We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment.  In particular,
field access now uses alignmentAtOffset instead of min.
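
The arithmetic behind an alignment-at-offset computation can be sketched as below. This is an illustration, not Clang's implementation: `alignment_at_offset` is a hypothetical free function that returns the largest power of two dividing both the base alignment and the offset.

```cpp
#include <cassert>
#include <cstdint>

// Alignment that is guaranteed at `offset` bytes past an address aligned to
// `base_align`. `base_align` must be a non-zero power of two.
std::uint64_t alignment_at_offset(std::uint64_t base_align,
                                  std::uint64_t offset) {
  assert(base_align != 0 && (base_align & (base_align - 1)) == 0);
  std::uint64_t bits = base_align | offset;
  // Isolate the lowest set bit: the largest power of two dividing both inputs.
  return bits & (~bits + 1);
}
```

For a 16-byte-aligned base, an offset of 4 yields 4 and an offset of 32 yields 16. Without the assertion, a base alignment of 0 would fall through to the largest power-of-two factor of the offset alone, which is the bug the message describes.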

Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs.  For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint.  That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.

ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments.  In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments.  That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.

I partially punted on applying this work to CGBuiltin.  Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@246985 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 08:05:57 +00:00
Alexander Kornienko 8ca7705aa3 Revert r240270 ("Fixed/added namespace ending comments using clang-tidy").
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@240353 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 23:07:51 +00:00
Alexander Kornienko ac58acc7f2 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

  $ tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
      -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
      work/llvm/tools/clang

To reduce churn, not touching namespaces spanning less than 10 lines.



git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@240270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 09:47:44 +00:00
Artem Belevich 1508f392a4 [cuda] Include GPU binary into host object file and generate init/deinit code.
- added -fcuda-include-gpubinary option to incorporate results of
  device-side compilation into host-side one.
- generate code to register GPU binaries and associated kernels
  with CUDA runtime and clean-up on exit.
- added test case for init/deinit code generation.

Differential Revision: http://reviews.llvm.org/D9507

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@236765 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 19:34:16 +00:00
Craig Topper f7bc497ad1 [C++11] Add 'override' keyword to virtual methods that override their base class.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@203643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-12 06:41:41 +00:00
Chandler Carruth 235e24a90e [Modules] Update to reflect the move of CallSite into the IR library in
LLVM r202816.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@202817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-04 11:02:08 +00:00
John McCall bd7370a786 Use the actual ABI-determined C calling convention for runtime
calls and declarations.

LLVM has a default CC determined by the target triple.  This is
not always the actual default CC for the ABI we've been asked to
target, and so we sometimes find ourselves annotating all user
functions with an explicit calling convention.  Since these
calling conventions usually agree for the simple set of argument
types passed to most runtime functions, using the LLVM-default CC
in principle has no effect.  However, the LLVM optimizer goes
into histrionics if it sees this kind of formal CC mismatch,
since it has no concept of CC compatibility.  Therefore, if this
module happens to define the "runtime" function, or got LTO'ed
with such a definition, we can miscompile;  so it's quite
important to get this right.

Defining runtime functions locally is quite common in embedded
applications.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@176286 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-28 19:01:20 +00:00
Dmitri Gribenko cfa88f8939 Remove useless 'llvm::' qualifier from names like StringRef and others that are
brought into 'clang' namespace by clang/Basic/LLVM.h


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@172323 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-12 19:30:44 +00:00
Chandler Carruth 3b844ba7d5 Rewrite #includes for llvm/Foo.h to llvm/IR/Foo.h as appropriate to
reflect the migration in r171366.

Re-sort the #include lines to reflect the new paths.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@171369 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 11:45:17 +00:00
Chandler Carruth 55fc873017 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@169237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 09:13:33 +00:00
Peter Collingbourne a4ae2294b6 CUDA: IR generation support for device stubs
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@141304 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-06 18:51:56 +00:00
Peter Collingbourne 6c0aa5ff6e CUDA: IR generation support for kernel call expressions
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@141300 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-06 18:29:37 +00:00