Commit Graph

25 Commits

Author SHA1 Message Date
Artem Belevich 75198e124f [CUDA] Pre-include sm_60 and sm_61 headers.
CUDA-8.0 comes with new headers which nvcc pre-includes via cuda_runtime.h
Clang now makes them available as well.

Differential Revision: https://reviews.llvm.org/D28301

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@290982 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 18:39:29 +00:00
Justin Lebar b42f77fee6 [CUDA] Wrapper header changes necessary to support MacOS.
Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D26780

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@287288 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 00:41:35 +00:00
Justin Lebar 06f550b3c3 [CUDA] Move device placement new definitions into a wrapper header.
Previously, these were always included -- after this change, you have to
 #include <new>, which is consistent with how things ought to work.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285251 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 22:13:26 +00:00
Justin Lebar 029a5ad288 [CUDA] Re-land support for <complex> (r283683 and r283680).
These were reverted in r283753 and r283747.

The first patch added a header to the root 'Headers' install directory,
instead of into 'Headers/cuda_wrappers'.  This was fixed in the second
patch, but by then the damage was done: The bad header stayed in the
'Headers' directory, continuing to break the build.

We reverted both patches in an attempt to fix things, but that still
didn't get rid of the header, so the Windows boostrap build remained
broken.

It's probably worth fixing up our cmake logic to remove things from the
install dirs, but in the meantime, re-land these patches, since we
believe they no longer have this bug.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283907 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 17:36:03 +00:00
Nico Weber d1ea6adb36 Revert r283680.
Breaks bootstrap builds on (at least) Windows:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19:
D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) :
    error: unknown type name '__device__'
    inline __device__ const __T &


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-10 14:10:00 +00:00
Justin Lebar 8f211f3799 [CUDA] Support <complex> and std::min/max on the device.
Summary:
We do this by wrapping <complex> and <algorithm>.

Tests are in the test-suite.

Reviewers: tra

Subscribers: jhen, beanz, cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D24979

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283680 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-08 22:16:12 +00:00
Justin Lebar 204eb55522 [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.
Summary: This matches the idiom we use for our other CUDA wrapper headers.

Reviewers: tra

Subscribers: beanz, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D24978

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283679 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-08 22:16:08 +00:00
Artem Belevich 6a96fd6a52 [CUDA] Added support for CUDA-8
Differential Revision: https://reviews.llvm.org/D24946

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@282610 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 17:47:40 +00:00
Justin Lebar 984f42a083 [CUDA] Add __device__ overloads for placement new and delete.
Summary:
Previously these sort of worked because they didn't end up resulting in
calls at the ptx layer.  But I'm adding stricter checks that break
placement new without these changes.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23239

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@278194 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-10 01:09:14 +00:00
Justin Lebar cfd0eb5a97 [CUDA] Implement __shfl* intrinsics in clang headers.
Summary: Clang changes to make use of the LLVM intrinsics added in D21160.

Reviewers: tra

Subscribers: jholewinski, cfe-commits

Differential Revision: http://reviews.llvm.org/D21162

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@272299 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 20:04:57 +00:00
Justin Lebar 445e59e90c [CUDA] Add -fcuda-approx-transcendentals flag.
Summary:
This lets us emit e.g. sin.approx.f32.  See
http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-sin

Reviewers: rnk

Subscribers: tra, cfe-commits

Differential Revision: http://reviews.llvm.org/D20493

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@270484 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-23 20:19:56 +00:00
Justin Lebar 58d65b28b4 [CUDA] Implement __ldg using intrinsics.
Summary:
Previously it was implemented as inline asm in the CUDA headers.

This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions.  This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@270150 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-19 22:49:13 +00:00
Artem Belevich 7189032a3b [CUDA] removed unneeded __nvvm_reflect_anchor()
Since r265060 LLVM infers correct __nvvm_reflect attributes, so
explicit declaration of __nvvm_reflect() is no longer needed.

Differential Revision: http://reviews.llvm.org/D19074

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@267062 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-21 21:40:27 +00:00
Justin Lebar fa9ac5e37e [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.
Summary:
See comments in patch; we were assuming that some stdlib math functions
would be defined in namespace std, when in fact the spec says they
should be defined in the global namespace.  libstdc++4.9 became more
conforming and broke us.

This new implementation seems to cover the known knowns.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: http://reviews.llvm.org/D18882

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@265751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-07 23:55:53 +00:00
Justin Lebar 0e99eeaa87 [CUDA] Fix typo in __clang_cuda_runtime_wrapper.h.
We're #including the wrong file!

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@265083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-01 00:25:42 +00:00
Justin Lebar c50aacffb4 [CUDA] Add math forward declares to CUDA header wrapper.
Summary:
This is necessary for a future patch which will make all constexpr
functions implicitly host+device.  cmath may declare constexpr
functions, but these we do *not* want to be host+device.  The forward
declares added in this patch prevent this (because the rule will be,
constexpr functions become implicitly host+device unless they're
preceeded by a decl with __device__).

Reviewers: tra

Subscribers: cfe-commits, rnk, rsmith

Differential Revision: http://reviews.llvm.org/D18539

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@264963 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 23:30:14 +00:00
Justin Lebar 7b9fff9af1 [CUDA] Don't define __NVCC__.
Summary:
We decided this makes life too difficult for code authors.  For example,
people may want to detect NVCC and disable variadic templates, which
NVCC does not support, but which we do.

Since people are going to have to change compiler flags *anyway* in
order to compile with clang, if they really want the old behavior, they
can pass -D__NVCC__.

Tested with tensorflow and thrust, no apparent problems.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18417

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@264205 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-23 22:42:27 +00:00
Justin Lebar 93b27f5e1a [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.
Summary:
This lets you write, e.g.

  uint3 a = threadIdx;
  uint3 b = blockIdx;
  dim3 c = gridDim;
  dim3 d = blockDim;

which is legal in nvcc, but was not legal in clang.

The fact that e.g. the type of threadIdx is not actually uint3 is still
observable, but now you have to try to observe it.

Reviewers: tra

Subscribers: echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D17561

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@261777 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-24 21:49:33 +00:00
Justin Lebar 5a68fbd86b [CUDA] Add hack so code which includes "curand.h" doesn't break.
Summary:
curand.h includes curand_mtgp32_kernel.h.  In host mode, this header
redefines threadIdx and blockDim, giving them their "proper" types of
uint3 and dim3, respectively.

clang has its own plan for these variables -- their types are magic
builtin classes.  So these redefinitions are incompatible.

As a hack, we force-include the offending CUDA header and use #defines
to get the right types for threadIdx and blockDim.

Reviewers: tra

Subscribers: echristo, cfe-commits

Differential Revision: http://reviews.llvm.org/D17562

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@261776 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-24 21:49:31 +00:00
Eric Christopher 58e445a784 Update functions in clang supplied headers to use the compiler reserved
namespace for arguments.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@260647 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:22:53 +00:00
Artem Belevich 2b7153031c [CUDA] added declarations for device-side system calls
...and std:: wrappers for free/malloc.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@259690 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-03 20:53:58 +00:00
Artem Belevich 8379dbf80c [CUDA] Implemented device-side support functions in <cmath>.
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In few
cases we use clang builtins.

Tested out-of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing

Differential Revision: http://reviews.llvm.org/D16593

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@258880 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 23:37:29 +00:00
Justin Lebar e2636ac0ba [CUDA] Make printf work.
Summary:
The code in CGCUDACall is largely based on a patch written by Eli
Bendersky:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html

That patch implemented an LLVM pass lowering printf to vprintf; this
one does something similar, but in Clang codegen.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16372

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@258642 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 21:28:14 +00:00
Artem Belevich 4c27e4665a [CUDA] runtime wrapper header tweaks
* Pull in host-only implementations of few CUDA-specific math functions.
* #nclude <cmath> early to prevent its inclusion from CUDA headers after
  they've messed with __THROW macro.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@255933 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 22:25:22 +00:00
Artem Belevich b208e87658 [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h
Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to compiler which leads to
compiler including real cuda_runtime.h from there instead
of the wrapper we need.

Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.

Differential Revision: http://reviews.llvm.org/D15534

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@255802 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 18:51:59 +00:00