Previously, these were always included -- after this change, you have to
#include <new>, which is consistent with how things ought to work.
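
A minimal host-side sketch of the include discipline described above,
assuming "these" refers to declarations supplied by <new>, such as
placement operator new (an assumption based on the placement-new commit
elsewhere in this log). Point and construct_in_place are hypothetical
names used only for illustration.

```cpp
#include <cassert>
#include <new>  // declares the placement forms of operator new

// Hypothetical type used only for this illustration.
struct Point {
  int x;
  int y;
  Point(int x_, int y_) : x(x_), y(y_) {}
};

// Constructs a Point inside caller-provided storage and returns x + y.
inline int construct_in_place(int x, int y) {
  alignas(Point) unsigned char storage[sizeof(Point)];
  Point *p = new (storage) Point(x, y);  // placement new; relies on <new>
  int sum = p->x + p->y;
  p->~Point();  // objects created by placement new are destroyed explicitly
  return sum;
}
```

Calling construct_in_place(2, 3) builds the object in the local buffer
without any heap allocation.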
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285251 91177308-0d34-0410-b5e6-96231b3b80d8
These were reverted in r283753 and r283747.
The first patch added a header to the root 'Headers' install directory,
instead of into 'Headers/cuda_wrappers'. This was fixed in the second
patch, but by then the damage was done: The bad header stayed in the
'Headers' directory, continuing to break the build.
We reverted both patches in an attempt to fix things, but that still
didn't get rid of the header, so the Windows bootstrap build remained
broken.
It's probably worth fixing up our cmake logic to remove things from the
install dirs, but in the meantime, re-land these patches, since we
believe they no longer have this bug.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283907 91177308-0d34-0410-b5e6-96231b3b80d8
Breaks bootstrap builds on (at least) Windows:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\lib\Support\Allocator.cpp:14:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/Allocator.h:24:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/ADT/SmallVector.h:20:
In file included from D:\buildslave\clang-x64-ninja-win7\llvm\include\llvm/Support/MathExtras.h:19:
D:\buildslave\clang-x64-ninja-win7\stage1.install\bin\..\lib\clang\4.0.0\include\algorithm(63,8) :
error: unknown type name '__device__'
inline __device__ const __T &
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283747 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously these sort of worked because they didn't end up resulting in
calls at the ptx layer. But I'm adding stricter checks that break
placement new without these changes.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23239
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@278194 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@270150 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
See comments in patch; we were assuming that some stdlib math functions
would be defined in namespace std, when in fact the spec says they
should be defined in the global namespace. libstdc++4.9 became more
conforming and broke us.
This new implementation seems to cover the known knowns.
Reviewers: rsmith
Subscribers: cfe-commits, tra
Differential Revision: http://reviews.llvm.org/D18882
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@265751 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is necessary for a future patch which will make all constexpr
functions implicitly host+device. cmath may declare constexpr
functions, but we do *not* want these to be host+device. The forward
declarations added in this patch prevent this (because the rule will be
that constexpr functions become implicitly host+device unless they're
preceded by a decl with __device__).
Reviewers: tra
Subscribers: cfe-commits, rnk, rsmith
Differential Revision: http://reviews.llvm.org/D18539
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@264963 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We decided this makes life too difficult for code authors. For example,
people may want to detect NVCC and disable variadic templates, which
NVCC does not support, but which we do.
Since people are going to have to change compiler flags *anyway* in
order to compile with clang, if they really want the old behavior, they
can pass -D__NVCC__.
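
A sketch of the kind of user code this affects, assuming a project that
keys off __NVCC__ to avoid variadic templates. The function name sum_all
is hypothetical. With __NVCC__ undefined the variadic branch is
selected; passing -D__NVCC__ selects the fixed-arity fallback.

```cpp
#include <cassert>

#if defined(__NVCC__)
// Fallback branch: fixed-arity overloads, no variadic templates.
inline int sum_all(int a) { return a; }
inline int sum_all(int a, int b) { return a + b; }
inline int sum_all(int a, int b, int c) { return a + b + c; }
#else
// Variadic branch: the base case ends the recursion on an empty pack.
inline int sum_all() { return 0; }
template <typename... Rest>
int sum_all(int first, Rest... rest) {
  return first + sum_all(rest...);
}
#endif
```

Both branches accept calls such as sum_all(1, 2, 3), so callers do not
need to know which one was compiled.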
Tested with tensorflow and thrust, no apparent problems.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18417
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@264205 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This lets you write, e.g.
uint3 a = threadIdx;
uint3 b = blockIdx;
dim3 c = gridDim;
dim3 d = blockDim;
which is legal in nvcc, but was not legal in clang.
The fact that e.g. the type of threadIdx is not actually uint3 is still
observable, but now you have to try to observe it.
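
A host-side sketch of one way such an initialization can be made legal,
assuming a conversion operator on the builtin type. uint3_sketch and
thread_idx_sketch are hypothetical stand-ins, not clang's actual
declarations, and plain data members replace whatever mechanism the
real builtin class uses.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for CUDA's uint3.
struct uint3_sketch {
  unsigned x, y, z;
};

// Hypothetical stand-in for the builtin class behind threadIdx.
struct thread_idx_sketch {
  unsigned x, y, z;
  // Implicit conversion that makes `uint3_sketch a = idx;` well-formed.
  operator uint3_sketch() const { return uint3_sketch{x, y, z}; }
};

// The two types remain distinct, which is the observable difference.
static_assert(!std::is_same<thread_idx_sketch, uint3_sketch>::value,
              "the builtin stand-in is not the same type as uint3_sketch");

inline unsigned sum_components() {
  thread_idx_sketch idx{1, 2, 3};
  uint3_sketch a = idx;  // copy-initialization via the conversion operator
  return a.x + a.y + a.z;
}
```

The static_assert shows how the type difference stays observable to code
that inspects it, while ordinary initialization works unchanged.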
Reviewers: tra
Subscribers: echristo, cfe-commits
Differential Revision: http://reviews.llvm.org/D17561
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@261777 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
curand.h includes curand_mtgp32_kernel.h. In host mode, this header
redefines threadIdx and blockDim, giving them their "proper" types of
uint3 and dim3, respectively.
clang has its own plan for these variables -- their types are magic
builtin classes. So these redefinitions are incompatible.
As a hack, we force-include the offending CUDA header and use #defines
to get the right types for threadIdx and blockDim.
Reviewers: tra
Subscribers: echristo, cfe-commits
Differential Revision: http://reviews.llvm.org/D17562
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@261776 91177308-0d34-0410-b5e6-96231b3b80d8
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In a few
cases we use clang builtins.
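
A host-only sketch of the forwarding pattern described above, under
stated assumptions: sketch_std stands in for namespace std, and
SKETCH_DEVICE is a hypothetical placeholder that expands to nothing
here, where real device-side code would carry a __device__ attribute.
The global-scope functions come from <math.h> rather than from the CUDA
headers.

```cpp
#include <cassert>
#include <math.h>  // global-scope ::sqrtf and ::sqrt

// Hypothetical placeholder; device code would use __device__ here.
#define SKETCH_DEVICE

namespace sketch_std {
// Each overload forwards to the matching global-scope function.
SKETCH_DEVICE inline float sqrt(float x) { return ::sqrtf(x); }
SKETCH_DEVICE inline double sqrt(double x) { return ::sqrt(x); }
}  // namespace sketch_std
```

Overload resolution picks the float or double wrapper from the argument
type, so sketch_std::sqrt(4.0f) forwards to ::sqrtf.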
Tested out of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing
Differential Revision: http://reviews.llvm.org/D16593
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@258880 91177308-0d34-0410-b5e6-96231b3b80d8
* Pull in host-only implementations of a few CUDA-specific math functions.
* #include <cmath> early to prevent its inclusion from CUDA headers after
they've messed with the __THROW macro.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@255933 91177308-0d34-0410-b5e6-96231b3b80d8
Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to the compiler, which leads to
the compiler including the real cuda_runtime.h from there instead
of the wrapper we need.
Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.
Differential Revision: http://reviews.llvm.org/D15534
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@255802 91177308-0d34-0410-b5e6-96231b3b80d8