Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@330220 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:
- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fnoxray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation
These can be combined either as comma-separated values, or as
repeated flag values.
Reviewers: echristo, kpw, eizan, pelikan
Reviewed By: pelikan
Subscribers: mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44970
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@329985 91177308-0d34-0410-b5e6-96231b3b80d8
I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the input to 32 bits for pmuludg or shl/ashr for pmuldq and use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts.
Differential Revision: https://reviews.llvm.org/D45421
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@329605 91177308-0d34-0410-b5e6-96231b3b80d8
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@329399 91177308-0d34-0410-b5e6-96231b3b80d8
A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined
coroutine noop_coroutine that does nothing. To implement this feature, we implemented
an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that
does nothing when resumed or destroyed.
This patch adds a builtin __builtin_coro_noop() that maps to llvm.coro.noop intrinsic.
Related llvm change: https://reviews.llvm.org/D45114
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328993 91177308-0d34-0410-b5e6-96231b3b80d8
The conversion of operatios to bitcode helps to eliminate an additional
store in certain cases. We used to lower these load intrinsics in DAG to
DAG conversion by which time, the "Dead Store Elimination" pass is
already run. There is an associated LLVM patch.
Patch by Sumanth Gundapaneni.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328776 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions have been around for a long time, but we
haven't supported intrinsics for them. The "new" vesrions use
the CSx register for the start of the buffer instead of the K
field in the Mx register.
There is a related llvm patch.
Patch by Brendon Cahoon.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328725 91177308-0d34-0410-b5e6-96231b3b80d8
Putting back the code in commit r327189 that was reverted in r322737. The code is being committed in three stages and this one is the last stage: 1) r327455 fp16 feature flags, 2) r327836 pass half type or i16 based on FullFP16, and 3) the code here which the front-end fp16 vector intrinsic for ARM.
Differential revision https://reviews.llvm.org/D43650
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328277 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Libc++'s default allocator uses `__builtin_operator_new` and `__builtin_operator_delete` in order to allow the calls to new/delete to be ellided. However, libc++ now needs to support over-aligned types in the default allocator. In order to support this without disabling the existing optimization Clang needs to support calling the aligned new overloads from the builtins.
See llvm.org/PR22634 for more information about the libc++ bug.
This patch changes `__builtin_operator_new`/`__builtin_operator_delete` to call any usual `operator new`/`operator delete` function. It does this by performing overload resolution with the arguments passed to the builtin to determine which allocation function to call. If the selected function is not a usual allocation function a diagnostic is issued.
One open issue is if the `align_val_t` overloads should be considered "usual" when `LangOpts::AlignedAllocation` is disabled.
In order to allow libc++ to detect this new behavior the value for `__has_builtin(__builtin_operator_new)` has been updated to `201802`.
Reviewers: rsmith, majnemer, aaron.ballman, erik.pilkington, bogner, ahatanak
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43047
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328134 91177308-0d34-0410-b5e6-96231b3b80d8
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.
Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.
Updated tests to use non-default address spaces.
Differential Revision: https://reviews.llvm.org/D43268
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@328006 91177308-0d34-0410-b5e6-96231b3b80d8
For generating NEON intrinsics, this determines the NEON data type, and whether
it should be a half type or an i16 type. I.e., we always pass a half type for
AArch64, this hasn't changed, but now also for ARM but only when FullFP16 is
enabled, and i16 otherwise.
This is intended to be non-functional change, but together with the backend
work in D44538 which adds support for f16 vectors, this enables adding the
AArch32 FP16 (vector) intrinsics.
Differential Revision: https://reviews.llvm.org/D44561
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@327836 91177308-0d34-0410-b5e6-96231b3b80d8
This is causing problems in testing, and PR36683 was raised.
Reverting it until we have sorted out how to pass f16 vectors.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@327437 91177308-0d34-0410-b5e6-96231b3b80d8
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.
Fixes PR36360.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@324954 91177308-0d34-0410-b5e6-96231b3b80d8
This corresponds to r323222 in LLVM. The new names are not yet
finalized, so use them at your own risk.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@323224 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this patter and turn it into a concat_vector operation.
I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations.
Reviewers: RKSimon, spatel, zvi, jina.nahias
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42016
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@322461 91177308-0d34-0410-b5e6-96231b3b80d8
These just overloads for _Float128. They're supported by GCC 7 and used
by glibc. APFloat support is already there so just add the overloads.
__builtin_copysignf128
__builtin_fabsf128
__builtin_huge_valf128
__builtin_inff128
__builtin_nanf128
__builtin_nansf128
This is the same support that GCC has, according to the documentation,
but limited to _Float128.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@321948 91177308-0d34-0410-b5e6-96231b3b80d8
r320902 fixed the IRGen for some types of checked multiplications. It
did not handle unsigned overflow correctly in the case where the signed
operand is negative (PR35750).
Eli pointed out that on overflow, the result must be equal to the unique
value that is equivalent to the mathematically-correct result modulo two
raised to the k power, where k is the number of bits in the result type.
This patch fixes the specialized IRGen from r320902 accordingly.
Testing: Apart from check-clang, I modified the test harness from
r320902 to validate the results of all multiplications -- not just the
ones which don't overflow:
https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081
llvm.org/PR35750, rdar://34963321
Differential Revision: https://reviews.llvm.org/D41717
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@321771 91177308-0d34-0410-b5e6-96231b3b80d8
Diagnose 'unreachable' UB when a noreturn function returns.
1. Insert a check at the end of functions marked noreturn.
2. A decl may be marked noreturn in the caller TU, but not marked in
the TU where it's defined. To diagnose this scenario, strip away the
noreturn attribute on the callee and insert check after calls to it.
Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700
rdar://33660464
Differential Revision: https://reviews.llvm.org/D40698
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@321231 91177308-0d34-0410-b5e6-96231b3b80d8
This patch introduces a specialized way to lower overflow-checked
multiplications with mixed-sign operands. This fixes link failures and
ICEs on code like this:
void mul(int64_t a, uint64_t b) {
int64_t res;
__builtin_mul_overflow(a, b, &res);
}
The generic checked-binop irgen would use a 65-bit multiplication
intrinsic here, which requires runtime support for _muloti4 (128-bit
multiplication), and therefore fails to link on i386. To get an ICE
on x86_64, change the example to use __int128_t / __uint128_t.
Adding runtime and backend support for 65-bit or 129-bit checked
multiplication on all of our supported targets is infeasible.
This patch solves the problem by using simpler, specialized irgen for
the mixed-sign case.
llvm.org/PR34920, rdar://34963321
Testing: Apart from check-clang, I compared the output from this fairly
comprehensive test driver using unpatched & patched clangs:
https://gist.github.com/vedantk/3eb9c88f82e5c32f2e590555b4af5081
Differential Revision: https://reviews.llvm.org/D41149
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@320902 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
InterlockedCompareExchange128 is a bit more complicated than the other
InterlockedCompareExchange functions, so it requires a bit more work. It
doesn't directly refer to 128bit ints, instead it takes pointers to
64bit ints for Destination and ComparandResult, and exchange is taken as
two 64bit ints (high & low). The previous value is written to
ComparandResult, and success is returned. This implementation does the
following in order to produce a cmpxchg instruction:
1. Cast everything to 128bit ints or int pointers, and glues together
the Exchange values
2. Reads from CompareandResult to get the comparand
3. Calls cmpxchg volatile (on X86 this will produce a lock cmpxchg16b
instruction)
1. Result 0 (previous value) is written back to ComparandResult
2. Result 1 (success bool) is zext'ed to a uchar and returned
Resolves bug https://llvm.org/PR35251
Patch by Colden Cullen!
Reviewers: rnk, agutowski
Reviewed By: rnk
Subscribers: majnemer, cfe-commits
Differential Revision: https://reviews.llvm.org/D41032
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@320730 91177308-0d34-0410-b5e6-96231b3b80d8
There are 20 LLVM math intrinsics that correspond to mathlib calls according to the LangRef:
http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics
We were only converting 3 mathlib calls (sqrt, fma, pow) and 12 builtin calls (ceil, copysign,
fabs, floor, fma, fmax, fmin, nearbyint, pow, rint, round, trunc) to their intrinsic-equivalents.
This patch pulls the transforms together and handles all 20 cases. The switch is guarded by a
check for const-ness to make sure we're not doing the transform if errno could possibly be set by
the libcall or builtin.
Differential Revision: https://reviews.llvm.org/D40044
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@319593 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@319388 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM exposes a file in the backend (X86TargetParser.def) that
contains information about the correct list of CpuIs values.
This patch removes 2 of the copied and pasted versions of this
list from clang and instead includes the data from the .def file.
Differential Revision: https://reviews.llvm.org/D40054
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@318234 91177308-0d34-0410-b5e6-96231b3b80d8
cbrt() is always constant because it can't overflow or underflow. Therefore, it can't set errno.
fma() is not always constant because it can overflow or underflow. Therefore, it can set errno.
But we know that it never sets errno on GNU / MSVC, so make it constant in those environments.
Differential Revision: https://reviews.llvm.org/D39641
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@318093 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This just seems to have been an oversight. We already supported the f64
atomic add with an explicit scope (e.g. "cta"), but not the scopeless
version.
Reviewers: tra
Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D39638
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@317623 91177308-0d34-0410-b5e6-96231b3b80d8
This shortens the intrinsic headers a little and allows us to get rid of the cmpeq and cmpgt handling from CGBuiltin.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@317506 91177308-0d34-0410-b5e6-96231b3b80d8
The LLVM sqrt intrinsic definition changed with:
D28797
...so we don't have to use any relaxed FP settings other than errno handling.
This patch sidesteps a question raised in PR27435:
https://bugs.llvm.org/show_bug.cgi?id=27435
Is a programmer using __builtin_sqrt() invoking the compiler's intrinsic definition of sqrt or the mathlib definition of sqrt?
But we have an answer now: the builtin should match the behavior of the libm function including errno handling.
Differential Revision: https://reviews.llvm.org/D39204
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@317031 91177308-0d34-0410-b5e6-96231b3b80d8
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@315804 91177308-0d34-0410-b5e6-96231b3b80d8
The Cpu Init functionality is required for the target
attribute, so this patch simply splits it out into its own
function, exactly like CpuIs and CpuSupports.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@315075 91177308-0d34-0410-b5e6-96231b3b80d8
code size.
Currently clang expands a call to __builtin_os_log_format into a long
sequence of instructions at the call site, causing code size to
increase in some cases.
This commit attempts to reduce code size by emitting a helper function
that can be shared by calls to __builtin_os_log_format with similar
formats and arguments. The helper function has linkonce_odr linkage to
enable the linker to merge identical functions across translation units.
Attribute 'noinline' is attached to the helper function at -Oz so that
the inliner doesn't inline functions that can potentially be merged.
This commit also fixes a bug where the generated IR writes past the end
of the buffer when "%m" is the last specifier appearing in the format
string passed to __builtin_os_log_format.
Original patch by Duncan Exon Smith.
rdar://problem/34065973
rdar://problem/34196543
Differential Revision: https://reviews.llvm.org/D38606
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@315045 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Restore the `__builtin_wasm_rethrow` builtin deleted in D37931. On second
thought, it appears it can be used to implement `__cxa_rethrow`.
Reviewers: dschuff, sunfish
Reviewed By: dschuff
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37942
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@313430 91177308-0d34-0410-b5e6-96231b3b80d8
This patch replaces the perm2f128 intrinsics with native shuffle vectors.
This uses a pretty simple approach to allocate source 0 to the lower half input and source 1 to the upper half input. Then its just a matter of filling in the indices to use either the lower or upper half of that specific source. This can result in the same source being used by both operands. InstCombine or SelectionDAGBuilder should be able to clean that up.
Differential Revision: https://reviews.llvm.org/D37892
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@313418 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Remove `__builtin_wasm_rethrow` builtin. I thought it was required to implement
`__cxa_rethrow` function in libcxxabi, but it turned out it will be using
`__builtin_wasm_throw` instead.
Reviewers: dschuff, jgravelle-google
Reviewed By: jgravelle-google
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37931
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@313402 91177308-0d34-0410-b5e6-96231b3b80d8
Not all targets support vararg (e.g. amdgpu). Instead of using vararg in the emitted functions for enqueue_kernel,
this patch creates a temporary array of size_t, stores the size arguments in the temporary array
and passes it to the emitted functions for enqueue_kernel.
Differential Revision: https://reviews.llvm.org/D36678
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@312441 91177308-0d34-0410-b5e6-96231b3b80d8
"target" implementation
A small set of refactors that'll make it easier for me to implement 'target'
support.
First, extract the CPUSupports functionality into its own function.
THis has the advantage of not wasting time in this builtin to deal with
arguments.
Second, pulls both CPUSupports and CPUIs implementation into a member-function,
so that it can be called from the resolver generation that I'm working on.
Third, creates an overload that takes simply the feature/cpu name (rather than
extracting it from a callexpr), since that info isn't available later.
Note that despite how the 'diff' looks, the EmitX86CPUSupports function simply
takes the implementation out of the 'switch'.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@312355 91177308-0d34-0410-b5e6-96231b3b80d8
This adds builtin_cpu_init which will emit a call to cpu_indicator_init in libgcc or compiler-rt.
This is needed to support builtin_cpu_supports/builtin_cpu_is in an ifunc resolver.
Differential Revision: https://reviews.llvm.org/D36336
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@311874 91177308-0d34-0410-b5e6-96231b3b80d8
the interface.
The ultimate goal here is to make it easier to do some more interesting
things in constant emission, like emit constant initializers that have
ignorable side-effects, or doing the majority of an initialization
in-place and then patching up the last few things with calls. But for
now this is mostly just a refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@310964 91177308-0d34-0410-b5e6-96231b3b80d8
They still need to be implemented in the intrinsics, the command line, and the backend. But this change isn't dependent on any of that and resolves a TODO.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@310386 91177308-0d34-0410-b5e6-96231b3b80d8
On some targets, passing zero to the clz() or ctz() builtins has undefined
behavior. I ran into this issue while debugging UB in __hash_table from libcxx:
the bug I was seeing manifested itself differently under -O0 vs -Os, due to a
UB call to clz() (see: libcxx/r304617).
This patch introduces a check which can detect UB calls to builtins.
llvm.org/PR26979
Differential Revision: https://reviews.llvm.org/D34590
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@309459 91177308-0d34-0410-b5e6-96231b3b80d8
Move builtins from the x86 specific scope into the global
scope. Their use is still limited to x86_64 and aarch64 though.
This allows wine on aarch64 to properly handle variadic functions.
Differential Revision: https://reviews.llvm.org/D34475
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@308218 91177308-0d34-0410-b5e6-96231b3b80d8
This patch series adds support for the IBM z14 processor. This part includes:
- Basic support for the new processor and its features.
- Support for low-level builtins mapped to new LLVM intrinsics.
Support for the -fzvector extension to vector float and the new
high-level vector intrinsics is provided by separate patches.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@308197 91177308-0d34-0410-b5e6-96231b3b80d8
There are two other features before it that we don't currently support in the the frontend or backend so I left placeholders to keep the encoding correct.
I think the compiler-rt implementation of this feature is even further out of date.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@307456 91177308-0d34-0410-b5e6-96231b3b80d8
problems in testing, see comments in D34161 for some more details.
A fix is in progres in D35011, but a revert seems better now as the fix will
probably take some more time to land.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@307277 91177308-0d34-0410-b5e6-96231b3b80d8
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the Clang side of the addition of six intrinsics for two new machine instructions (vpopcntd and vpopcntq).
It also includes the addition of the new feature set.
Differential Revision: https://reviews.llvm.org/D33170
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@303857 91177308-0d34-0410-b5e6-96231b3b80d8
The functions creating LValues propagated information about alignment
source. Extend the propagated data to also include information about
possible unrestricted aliasing. A new class LValueBaseInfo will
contain both AlignmentSource and MayAlias info.
This patch should not introduce any functional changes.
Differential Revision: https://reviews.llvm.org/D33284
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@303358 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We define the `__xray_customeevent` builtin that gets translated to
IR calls to the correct intrinsic. The default implementation of this is
a no-op function. The codegen side of this follows the following logic:
- When `-fxray-instrument` is not provided in the driver, we elide all
calls to `__xray_customevent`.
- When `-fxray-instrument` is enabled and a function is marked as "never
instrumented", we elide all calls to `__xray_customevent` in that
function; if either marked as "always instrumented" or subject to
threshold-based instrumentation, we emit a call to the
`llvm.xray.customevent` intrinsic from LLVM for each
`__xray_customevent` occurrence in the function.
This change depends on D27503 (to land in LLVM first).
Reviewers: echristo, rsmith
Subscribers: mehdi_amini, pelikan, lrl, cfe-commits
Differential Revision: https://reviews.llvm.org/D30018
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@302492 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a part two of two reviews, one for the clang and the other for LLVM.
In this patch, I covered the clang side, by introducing the intrinsic to the front end.
This is done by creating a generic replacement.
Differential Revision: https://reviews.llvm.org/D31394a
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@299431 91177308-0d34-0410-b5e6-96231b3b80d8
It seems MS headers have started using __readgsqword, and since it's
used in a header that doesn't include intrin.h, we can't implement it as
an inline function anymore.
That was already the case for __readfsdword, which Saleem added support
for in r220859. This patch reuses that codegen to implement all of
__read[fg]s{byte,word,dword,qword}.
Differential Revision: https://reviews.llvm.org/D31248
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@298538 91177308-0d34-0410-b5e6-96231b3b80d8
D28494 adds another parameter to @llvm.objectsize. Clang needs to be
sure to pass that third arg whenever applicable.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@298431 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes an assertion failure in cases where we had expression
statements that declared variables nested inside of pass_object_size
args. Since we were emitting the same ExprStmt twice (once for the arg,
once for the @llvm.objectsize call), we were getting issues with
redefining locals.
This also means that we can be more lax about when we emit
@llvm.objectsize for pass_object_size args: since we're reusing the
arg's value itself, we don't have to care so much about side-effects.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@295935 91177308-0d34-0410-b5e6-96231b3b80d8
The following code would crash clang:
void foo(unsigned *const __attribute__((pass_object_size(0))));
void bar(unsigned *i) { foo(i); }
This is because we were always selecting the version of
`@llvm.objectsize` that takes an i8* in CodeGen. Passing an i32* as an
i8* makes LLVM very unhappy.
(Yes, I'm surprised that this remained uncaught for so long, too. :) )
As an added bonus, we'll now also use the appropriate address space when
emitting @llvm.objectsize calls.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@295805 91177308-0d34-0410-b5e6-96231b3b80d8
Removed ndrange_t as Clang builtin type and added
as a struct type in the OpenCL header.
Use type name to do the Sema checking in enqueue_kernel
and modify IR generation accordingly.
Review: D28058
Patch by Dmitry Borisenkov!
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@295311 91177308-0d34-0410-b5e6-96231b3b80d8
__fastfail terminates the process immediately with a special system
call. It does not run any process shutdown code or exception recovery
logic.
Fixes PR31854
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@294606 91177308-0d34-0410-b5e6-96231b3b80d8
Support for CUDA printf is exploited to support printf for
an NVPTX OpenMP device.
To reflect the support of both programming models, the file
CGCUDABuiltin.cpp has been renamed to CGGPUBuiltin.cpp, and
the call EmitCUDADevicePrintfCallExpr has been renamed to
EmitGPUDevicePrintfCallExpr.
Reviewers: jlebar
Differential Revision: https://reviews.llvm.org/D17890
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@293444 91177308-0d34-0410-b5e6-96231b3b80d8
Modify ObjC blocks impl wrt address spaces as follows:
- keep default private address space for blocks generated
as local variables (with captures);
- add global address space for global block literals (no captures);
- make the block invoke function and enqueue_kernel prototype with
the generic AS block pointer parameter to accommodate both
private and global AS cases from above;
- add block handling into default AS because it's implemented as
a special pointer type (BlockPointer) in the frontend and therefore
it is used as a pointer everywhere. This is also needed to accommodate
both private and global AS blocks for the two cases above.
- removes ObjC RT specific symbols (NSConcreteStackBlock and
NSConcreteGlobalBlock) in the OpenCL mode.
Review: https://reviews.llvm.org/D28814
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@293286 91177308-0d34-0410-b5e6-96231b3b80d8
by providing a memchr builtin that returns char* instead of void*.
Also add a __has_feature flag to indicate the presence of constexpr forms of
the relevant <string> functions.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@292555 91177308-0d34-0410-b5e6-96231b3b80d8
Unfortunately _setjmp3 can be both import or local. The ASAN tests try to
emulate the flags which makes this harder to detect. Rely on the linker
creating or using thunks here instead. Should repair the ASAN windows bots.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@289783 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a way for us to version any UBSan handler by itself.
The patch overrides D21289 for a better implementation (we're able to
rev up a single handler).
After this, then we can land a slight modification of D19667+D19668.
We probably don't want to keep all the versions in compiler-rt (maybe we
want to deprecate on one release and remove the old handler on the next
one?), but with this patch we will loudly fail to compile when mixing
incompatible handler calls, instead of silently compiling and then
providing bad error messages.
Reviewers: kcc, samsonov, rsmith, vsk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D21695
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@289444 91177308-0d34-0410-b5e6-96231b3b80d8
have the same size.
This fixes an asset that is triggered when an address of a boolean
variable is passed to __builtin_arm_ldrex or __builtin_arm_strex.
rdar://problem/29269006
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@288404 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behaves exactly the same with vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after call the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@286971 91177308-0d34-0410-b5e6-96231b3b80d8
Make handling integer parameters more flexible:
- For the number of events argument allow to pass larger
integers than 32 bits as soon as compiler can prove that
the range fits in 32 bits. If not, the diagnostic will be given.
- Change type of the arguments specifying the sizes of
the corresponding block arguments to be size_t.
Review: https://reviews.llvm.org/D26509
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@286849 91177308-0d34-0410-b5e6-96231b3b80d8
__builtin_alloca always uses __BIGGEST_ALIGNMENT__ for the alignment of
the allocation. __builtin_alloca_with_align allows the programmer to
specify the alignment of the allocation.
This fixes PR30658.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285544 91177308-0d34-0410-b5e6-96231b3b80d8
abstract information about the callee. NFC.
The goal here is to make it easier to recognize indirect calls and
trigger additional logic in certain cases. That logic will come in
a later patch; in the meantime, I felt that this was a significant
improvement to the code.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285258 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r285007 and reapply r284990, with a fix for the
opencl test that I broke. Original commit message follows:
These new builtins support a mechanism for logging OS events, using a
printf-like format string to specify the layout of data in a buffer.
The _buffer_size version of the builtin can be used to determine the size
of the buffer to allocate to hold the data, and then __builtin_os_log_format
can write data into that buffer. This implements format checking to report
mismatches between the format string and the data arguments. Most of this
code was written by Chris Willmore.
Differential Revision: https://reviews.llvm.org/D25888
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@285019 91177308-0d34-0410-b5e6-96231b3b80d8
These new builtins support a mechanism for logging OS events, using a
printf-like format string to specify the layout of data in a buffer.
The _buffer_size version of the builtin can be used to determine the size
of the buffer to allocate to hold the data, and then __builtin_os_log_format
can write data into that buffer. This implements format checking to report
mismatches between the format string and the data arguments. Most of this
code was written by Chris Willmore.
Differential Revision: https://reviews.llvm.org/D25888
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284990 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: We need `__stosb` to be an intrinsic, because SecureZeroMemory function uses it without including intrin.h. Implementing it as a volatile memset is not consistent with MSDN specification, but it gives us target-independent IR while keeping the most important properties of `__stosb`.
Reviewers: rnk, hans, thakis, majnemer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D25334
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284253 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Previously global 64-bit versions of _Interlocked functions broke buildbots on i386, so now I'm adding them as builtins for x86-64 and ARM only (should they be also on AArch64? I had problems with testing it for AArch64, so I left it)
Reviewers: hans, majnemer, mstorsjo, rnk
Subscribers: cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25576
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284172 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin.
Reviewers: hans, thakis, rnk, majnemer
Subscribers: RKSimon, cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25264
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@284060 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r283802. It introduces temporarily static
initializers, because StringRef ctor isn't (yet) constexpr for
string literals.
I plan to get there this week, but apparently GCC is so terrible
with these static initializer right now (10 min+ extra codegen
time was reported) that I'll hold on to this patch till the
constexpr one is ready, and land these at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283920 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics - winnt.h contains definitions of some functions for i386, but not for x86-64 (for example _InterlockedOr64), which means that we cannot treat them as builtins for both i386 and x86-64, because then we have definitions of builtin functions in winnt.h on i386.
Reviewers: thakis, majnemer, hans, rnk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24598
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283264 91177308-0d34-0410-b5e6-96231b3b80d8