If the memory object is scalable type, we do not know the exact size of
it at compile time. Set the size of lifetime marker to unknown if the
object is scalable one.
Differential Revision: https://reviews.llvm.org/D102822
I wouldn't recommend writing code like the testcase; a function
parameter isn't atomic, so using an atomic type doesn't really make
sense. But it's valid, so clang shouldn't crash on it.
The code was assuming hasAggregateEvaluationKind(Ty) implies Ty is a
RecordType, which isn't true. Just use isRecordType() instead.
Differential Revision: https://reviews.llvm.org/D102015
The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.
While i have tried to fix this in https://reviews.llvm.org/D100388
that patch has been up and stuck for a month now,
with little signs of progress.
So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.
This reverts commit 6270b3a1ea,
relanding 0aa0458f14.
As it was discovered in post-commit feedback
for 0aa0458f14,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.
While it would be good to fix this properly,
and keep annotating them on thunks,
i've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.
Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D100879
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.
The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.
Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.
Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D99517
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example
struct S {
__attribute__ ((__aligned__(16))) double v[4];
};
Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.
This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.
The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.
On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
As it is being noted in D99249, lack of alignment information on `this`
has been preventing LICM from happening.
For some time now, lack of alignment attribute does *not* imply
natural alignment, but an alignment of `1`.
Also, we used to treat dereferenceable as implying alignment,
but we no longer do, so it's a bugfix.
Differential Revision: https://reviews.llvm.org/D99790
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
These are incompatible with opaque pointers. This is in preparation
of dropping this API on the IRBuilder side as well.
Instead explicitly pass the loaded type.
It's available both in CodeGenOptions and in LangOptions, and LangOptions
implementation is slightly better as it uses a StringRef instead of a char
pointer, so use it.
Differential Revision: https://reviews.llvm.org/D98175
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.
This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.
Differential Revision: https://reviews.llvm.org/D96109
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up
unreachable blocks. This invalidates the reference to the deferred
replacement placeholder. Cope with it.
Fixes PR25102 (from 2015!)
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.
In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.
Differential Revision: https://reviews.llvm.org/D81678
This takes advantage of the implicit default behavior to reduce the number of
attributes, which in turns reduces compilation time. I've observed -3% in
instruction count when compiling sqlite3 amalgamation with -O0
Differential Revision: https://reviews.llvm.org/D96400
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls attach the attribute.
Differential Revision: https://reviews.llvm.org/D94537
VLST return values are coerced to VLATs in the function epilog for
consistency with the VLAT ABI. Previously, this coercion was done
through memory. It is preferable to use the
llvm.experimental.vector.insert intrinsic to avoid going through memory
here.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D94290
VLST arguments are coerced to VLATs at the function boundary for
consistency with the VLAT ABI. They are then bitcast back to VLSTs in
the function prolog. Previously, this conversion is done through memory.
With the introduction of the llvm.vector.{insert,extract} intrinsic, we
can avoid going through memory here.
Depends on D92761
Differential Revision: https://reviews.llvm.org/D92762
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.
This patch adds support of hot function attribute to LLVM IR. This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).
This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
isFunctionColdInCallGraph is true, this function will be marked as
cold.
The changes are:
(1) user annotated function attribute will used in setting function
section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.
The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.
Differential Revision: https://reviews.llvm.org/D92493
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.
A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.
The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D91979
This patch adds the functionality to compare BFI counts with real
profile
counts right after reading the profile. It will print remarks under
-Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo.
Differential Revision: https://reviews.llvm.org/D91813
Swiftcall does it's own target-independent argument type classification,
since it is not designed to be ABI compatible with anything local on the
target that isn't LLVM-based. This means it never uses inalloca.
However, we have duplicate logic for checking for inalloca parameters
that runs before call argument setup. This logic needs to know ahead of
time if inalloca will be used later, and we can't move the
CGFunctionInfo calculation earlier.
This change gets the calling convention from either the
FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so
skips the stackbase setup.
Depends on D92883.
Differential Revision: https://reviews.llvm.org/D92944
This template exists to abstract over FunctionPrototype and
ObjCMethodDecl, which have similar APIs for storing parameter types. In
place of a template, use a PointerUnion with two cases to handle this.
Hopefully this improves readability, since the type of the prototype is
easier to discover. This allows me to sink this code, which is mostly
assertions, out of the header file and into the cpp file. I can also
simplify the overloaded methods for computing isGenericMethod, and get
rid of the second EmitCallArgs overload.
Differential Revision: https://reviews.llvm.org/D92883
After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer.
We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks.
Switching to dereferenceable_or_null fixes the problem.
dereferenceable currently does not always respect NullPointerIsValid and may
imply nonnull and lead to aggressive optimization. The optimization may be
related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or
`Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions.
Reviewed By: bkramer, rsmith
Differential Revision: https://reviews.llvm.org/D92297
arguments.
* Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments
* Gates 'nonnull' on -f(no-)delete-null-pointer-checks
* Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to
explicitly test the behavior of this change
* Refactors hundreds of over-constrained clang tests to permit these
attributes, where needed
* Updates Clang12 patch notes mentioning this change
Reviewed-by: rsmith, jdoerfert
Differential Revision: https://reviews.llvm.org/D17993
Followup to D85191.
This changes getTypeInfoInChars to return a TypeInfoChars
struct instead of a std::pair of CharUnits. This lets the
interface match getTypeInfo more closely.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86447
- `-cl-fp32-correctly-rounded-divide-sqrt` is already handled in a
per-instruction manner by annotating the accuracy required. There's no
need to add that fn-attr. So far, there's no in-tree backend handling
that attr and that OpenCL specific option.
- In case that out-of-tree backends are broken, this change could be
reverted if those backends could not be fixed.
Differential Revision: https://reviews.llvm.org/D88424
This reverts commit 55c4ff91bd.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.
rdar://62476022
Differential Revision: https://reviews.llvm.org/D88336
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option
and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL
at most.
Differential revision: https://reviews.llvm.org/D88303
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
After the recent discussion on cfe-dev 'Can indirect class parameters be
noalias?' [1], it seems like using using noalias is problematic for
current C++, but should be allowed for C-only code.
This patch introduces a new option to let the user indicate that it is
safe to mark indirect class parameters as noalias. Note that this also
applies to external callers, e.g. it might not be safe to use this flag
for C functions that are called by C++ functions.
In targets that allocate indirect arguments in the called function, this
enables more agressive optimizations with respect to memory operations
and brings a ~1% - 2% codesize reduction for some programs.
[1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D85473
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. Test was failing due
to IR differences with the new pass manager which broke the Fuchsia
builder [1]. Reverted in 2e7041f.
[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
Original summary:
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.
VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.
The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:
__SVE_VLS<typename, unsigned>
where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:
#if __ARM_FEATURE_SVE_BITS==512
// Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
// Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
#endif
The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85743
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.
VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.
The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:
__SVE_VLS<typename, unsigned>
where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:
#if __ARM_FEATURE_SVE_BITS==512
// Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
// Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
#endif
The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85743
Add address space to indirect abi info and use it for kernels.
Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.
Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).
This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.
This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
llvm function is marked nounwind
This fixes cases where an invoke is emitted, despite the called llvm
function being marked nounwind, because ConstructAttributeList failed to
add the attribute to the attribute list. llvm optimization passes turn
invokes into calls and optimize away the exception handling code, but
it's better to avoid emitting the code in the front-end if the called
function is known not to raise an exception.
Differential Revision: https://reviews.llvm.org/D83906
Currently, Clang previously diagnosed this code by default:
void f(int a[static 0]);
saying that "static has no effect on zero-length arrays", which was
accurate.
However, static array extents require that the caller of the function
pass a nonnull pointer to an array of *at least* that number of
elements, but it can pass more (see C17 6.7.6.3p6). Given that we allow
zero-sized arrays as a GNU extension and that it's valid to pass more
elements than specified by the static array extent, we now support
zero-sized static array extents with the usual semantics because it can
be useful in cases like:
void my_bzero(char p[static 0], int n);
my_bzero(&c+1, 0); //ok
my_bzero(t+k,n-k); //ok, pattern from actual code
The x86-64 "avx" feature changes how >128 bit vector types are passed,
instead of being passed in separate 128 bit registers, they can be
passed in 256 bit registers.
"avx512f" does the same thing, except it switches from 256 bit registers
to 512 bit registers.
The result of both of these is an ABI incompatibility between functions
compiled with and without these features.
This patch implements a warning/error pair upon an attempt to call a
function that would run afoul of this. First, if a function is called
that would have its ABI changed, we issue a warning.
Second, if said call is made in a situation where the caller and callee
are known to have different calling conventions (such as the case of
'target'), we instead issue an error.
Differential Revision: https://reviews.llvm.org/D82562
Summary:
On the process of moving the argument lowering handling for
half-precision floating point arguments and returns to the backend, this
patch removes the code that was responsible for handling the coercion of
those arguments in Clang's Codegen.
Reviewers: rjmccall, chill, ostannard, dnsampaio
Reviewed By: ostannard
Subscribers: stuij, kristof.beyls, dmgreen, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81451
Prevent IR-gen from emitting consteval declarations
Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.
Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76420
When checking for an enum function attribute, use hasFnAttribute()
rather than hasAttribute() at FunctionIndex, because it is
significantly faster (and more concise to boot).
Check that getDebugInfo() is not null, as in the first revision, before
calling getDebugInfo()->addHeapAllocSiteMetadata().
Else would cause a crash with a new expression in a default arg.
---
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Differential Revision: https://reviews.llvm.org/D80966
With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo`
instead of `getDebugInfo()`,
to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test`
See CodeGenModule::CodeGenModule, `EmitGcovArcs || EmitGcovNotes` can
set `clang::CodeGen::CodeGenModule::DebugInfo`.
---
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Differential Revision: https://reviews.llvm.org/D80966
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80966
Canonicalize on storing FP options in LangOptions instead of
redundantly in CodeGenOptions. Incorporate -ffast-math directly
into the values of those LangOptions rather than considering it
separately when building FPOptions. Build IR attributes from
those options rather than a mix of sources.
We should really simplify the driver/cc1 interaction here and have
the driver pass down options that cc1 directly honors. That can
happen in a follow-up, though.
Patch by Michele Scandale!
https://reviews.llvm.org/D80315
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
Summary:
This is needed in Swift for C++ interop -- see here for the corresponding Swift change:
https://github.com/apple/swift/pull/30630
As part of this change, I've had to make some changes to the interface of CGCXXABI to return the additional parameters separately rather than adding them directly to a `CallArgList`.
Reviewers: rjmccall
Reviewed By: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79942
I've also made a stab at imposing some more order on where and how we add
attributes; this part should be NFC. I wasn't sure whether the CUDA use
case for libdevice should propagate CPU/features attributes, so there's a
bit of unnecessary duplication.
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
Summary:
- If the coerced type is still a pointer, it should be set with proper
parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist
that (pointer) parameter attribute setting so that the coerced pointer
parameter could be marked properly.
Depends on D79394
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79395
Summary:
- Skip copying function arguments and unnecessary casting by using them
directly.
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79394
When passing a value of a struct/union type from secure to non-secure
state (that is returning from a CMSE entry function or passing an
argument to CMSE-non-secure call), there is a potential sensitive
information leak via the padding bits in the structure. It is not
possible in the general case to ensure those bits are cleared by using
Standard C/C++.
This patch makes the compiler emit code to clear such padding
bits. Since type information is lost in LLVM IR, the code generation
is done by Clang.
For each interesting record type, we build a bitmask, in which all the
bits, corresponding to user declared members, are set. Values of
record types are returned by coercing them to an integer. After the
coercion, the coerced value is masked (with bitwise AND) and then
returned by the function. In a similar manner, values of record types
are passed as arguments by coercing them to an array of integers, and
the coerced values themselves are masked.
For union types, we effectively clear only bits, which aren't part of
any member, since we don't know which is the currently active one.
The compiler will issue a warning, whenever a union is passed to
non-secure state.
Values of half-precision floating-point types are passed in the least
significant bits of a 32-bit register (GPR or FPR) with the most
significant bits unspecified. Since this is also a potential leak of
sensitive information, this patch also clears those unspecified bits.
Differential Revision: https://reviews.llvm.org/D76369
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().
I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.
Differential Revision: https://reviews.llvm.org/D78882
This reverts commit 60c642e74b.
This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.
Differential Revision: https://reviews.llvm.org/D77925
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.
Subscribers: hiraditya, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77632
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
- y: SVE registers Z0-Z7
- Upl: One of the low eight SVE predicate registers (P0-P7)
- Upa: Full range of SVE predicate registers (P0-P15)
Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin
Reviewed By: efriedma
Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75690
Summary:
Right now we annotate C++'s `operator new` with `noalias` attribute,
which very much is healthy for optimizations.
However as per [[ http://eel.is/c++draft/basic.stc.dynamic.allocation | `[basic.stc.dynamic.allocation]` ]],
there are more promises on global `operator new`, namely:
* non-`std::nothrow_t` `operator new` *never* returns `nullptr`
* If `std::align_val_t align` parameter is taken, the pointer will also be `align`-aligned
* ~~global `operator new`-returned pointer is `__STDCPP_DEFAULT_NEW_ALIGNMENT__`-aligned ~~ It's more caveated than that.
Supplying this information may not cause immediate landslide effects
on any specific benchmarks, but it for sure will be healthy for optimizer
in the sense that the IR will better reflect the guarantees provided in the source code.
The caveat is `-fno-assume-sane-operator-new`, which currently prevents emitting `noalias`
attribute, and is automatically passed by Sanitizers ([[ https://bugs.llvm.org/show_bug.cgi?id=16386 | PR16386 ]]) - should it also cover these attributes?
The problem is that the flag is back-end-specific, as seen in `test/Modules/explicit-build-flags.cpp`.
But while it is okay to add `noalias` metadata in backend, we really should be adding at least
the alignment metadata to the AST, since that allows us to perform sema checks on it.
Reviewers: erichkeane, rjmccall, jdoerfert, eugenis, rsmith
Reviewed By: rsmith
Subscribers: xbolva00, jrtc27, atanasyan, nlopes, cfe-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D73380
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
if ((N & (N-1)) == 0)
return operator new(N, std::align_val_t(N));
else
return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.
That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?
Reviewers: rsmith, erichkeane
Reviewed By: erichkeane
Subscribers: cfe-commits, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73996
This brings back 2af74e27ed and reverts
eaabaf7e04.
The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
struct alignas(uint64_t) Wrapper { uint64_t P };
void f(uint64_t p);
void f(Wrapper p);
MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
Null-check and adjut a TypeLoc before casting it to a FunctionTypeLoc.
This fixes a crash in -fsanitize=nullability-return, and also makes the
location of the nonnull type available when the return type is adjusted.
rdar://59263039
Differential Revision: https://reviews.llvm.org/D74355
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.
rdar://58552124
Differential revision: https://reviews.llvm.org/D74094
If the return block is unreachable, clang removes it in
CodeGenFunction::FinishFunction(). This removal can leave dangling
references to values defined in the return block if the return block has
successors, which it /would/ if UBSan's return value check is emitted.
In this case, as the UBSan check wouldn't be reachable, it's better to
simply not emit it.
rdar://59196131
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.
This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.
The fix changes the behavior to always add the attribute if LangOptions
requests it.
Differential Revision: https://reviews.llvm.org/D73495
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned
However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.
The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
passed indirectly
Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.
There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors
For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.
For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.
Related to D71915 and D72110
Fixes most of PR44395
Reviewed By: rjmccall, craig.topper, erichkeane
Differential Revision: https://reviews.llvm.org/D72114
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.
Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73006
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.
Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment
Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.
Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73005
Summary: Just an NFC code cleanup i stumbled upon when stumbling through clang alignment attribute handling.
Reviewers: erichkeane, gchatelet, courbet, jdoerfert
Reviewed By: gchatelet
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72993
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.
As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.
Reviewers: erichkeane, jdoerfert, aaron.ballman
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72979
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.
Reviewers: rnk, dmajor, pcc, hans, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72167
Summary:
This is a follow up on https://reviews.llvm.org/D61634#1742154 to turn the clang driver -fno-builtin flag into an IR attribute.
I also investigated pushing the attribute earlier on (in Sema) but it looks like this patch is simple and will cover all function calls.
Reviewers: aaron.ballman, courbet
Subscribers: cfe-commits, tejohnson
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71193
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
ExpandTypeFromArgs
This fixes a bug in IRGen where a call to `llvm.objc.storeStrong` was
being emitted to initialize a __strong field of an uninitialized
temporary struct, which caused crashes at runtime.
rdar://problem/51807365
AggValueSlot
This reapplies 8a5b7c3570 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.
Original commit message:
This is needed for the pointer authentication work we plan to do in the
near future.
a63a81bd99/clang/docs/PointerAuthentication.rst
Cleanup handling of the denormal-fp-math attribute. Consolidate places
checking the allowed names in one place.
This is in preparation for introducing FP type specific variants of
the denormal-fp-mode attribute. AMDGPU will switch to using this in
place of the current hacky use of subtarget features for the denormal
mode.
Introduce a new header for dealing with FP modes. The constrained
intrinsic classes define related enums that should also be moved into
this header for uses in other contexts.
The verifier could use a check to make sure the denorm-fp-mode
attribute is sane, but there currently isn't one.
Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by
default in the one current user. Clang must be taught to start
emitting this attribute by default to avoid regressions when this is
switched to assume ieee behavior if the attribute isn't present.
Summary:
This is a follow up on https://reviews.llvm.org/D61634
This patch is simpler and only adds the no_builtin attribute.
Reviewers: tejohnson, courbet, theraven, t.p.northover, jdoerfert
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68028
This is a re-submit after it got reverted in https://reviews.llvm.org/rGbd8791610948 since the breakage doesn't seem to come from this patch.
Summary:
This is a follow up on https://reviews.llvm.org/D61634
This patch is simpler and only adds the no_builtin attribute.
Reviewers: tejohnson, courbet, theraven, t.p.northover, jdoerfert
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68028
The behavior from the original patch has changed, since we're no longer
allowing LLVM to just ignore the alignment. Instead, we're just
assuming the maximum possible alignment.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374562
The test fails on Windows, with
error: 'warning' diagnostics expected but not seen:
File builtin-assume-aligned.c Line 62: requested alignment
must be 268435456 bytes or smaller; assumption ignored
error: 'warning' diagnostics seen but not expected:
File builtin-assume-aligned.c Line 62: requested alignment
must be 8192 bytes or smaller; assumption ignored
llvm-svn: 374456
Code to handle __builtin_assume_aligned was allowing larger values, but
would convert this to unsigned along the way. This patch removes the
EmitAssumeAligned overloads that take unsigned to do away with this
problem.
Additionally, it adds a warning that values greater than 1 <<29 are
ignored by LLVM.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374450
When passing arguments using temporary allocas, we need to add the
appropriate lifetime markers so that the stack coloring passes can
re-use the stack space.
This patch keeps track of all the lifetime.start calls emited before the
codegened call, and adds the corresponding lifetime.end calls after the
call.
Differential Revision: https://reviews.llvm.org/D68611
llvm-svn: 374126
* Adds a TypeSize struct to represent the known minimum size of a type
along with a flag to indicate that the runtime size is a integer multiple
of that size
* Converts existing size query functions from Type.h and DataLayout.h to
return a TypeSize result
* Adds convenience methods (including a transparent conversion operator
to uint64_t) so that most existing code 'just works' as if the return
values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all
supported instructions used with scalable vectors can be constructed
in IR.
Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen
Reviewed By: rovka, sdesmalen
Differential Revision: https://reviews.llvm.org/D53137
llvm-svn: 374042
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<RecordType> directly and if not assert will fire for us.
llvm-svn: 373584
has a constexpr destructor.
For constexpr variables, reject if the variable does not have constant
destruction. In all cases, do not emit runtime calls to the destructor
for variables with constant destruction.
llvm-svn: 373159
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
with '-mframe-pointer'
After D56351 and D64294, frame pointer handling is migrated to tri-state
(all, non-leaf, none) in clang driver and on the function attribute.
This patch makes the frame pointer handling cc1 option tri-state.
Reviewers: chandlerc, rnk, t.p.northover, MaskRay
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D56353
llvm-svn: 366645
Reason: this commit causes crashes in the clang compiler when building
LLVM Support with libc++, see https://bugs.llvm.org/show_bug.cgi?id=42665
for details.
llvm-svn: 366429
Summary:
This patch does mainly three things:
1. It fixes a false positive error detection in Sema that is similar to
D62156. The error happens when explicitly calling an overloaded
destructor for different address spaces.
2. It selects the correct destructor when multiple overloads for
address spaces are available.
3. It inserts the expected address space cast when invoking a
destructor, if needed, and therefore fixes a crash due to the unmet
assertion in llvm::CastInst::Create.
The following is a reproducer of the three issues:
struct MyType {
~MyType() {}
~MyType() __constant {}
};
__constant MyType myGlobal{};
kernel void foo() {
myGlobal.~MyType(); // 1 and 2.
// 1. error: cannot initialize object parameter of type
// '__generic MyType' with an expression of type '__constant MyType'
// 2. error: no matching member function for call to '~MyType'
}
kernel void bar() {
// 3. The implicit call to the destructor crashes due to:
// Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
// in llvm::CastInst::Create.
MyType myLocal;
}
The added test depends on D62413 and covers a few more things than the
above reproducer.
Subscribers: yaxunl, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64569
llvm-svn: 366422
A handful of C++ cases as reported in PR42352 didn't actually give an
error when always_inlining with a different target feature list. This
resulted in broken IR.
llvm-svn: 364109
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.
For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.
llvm-svn: 362652
[MS] Add metadata for __declspec(allocator)
Original summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.
Differential Revision: https://reviews.llvm.org/D60237
llvm-svn: 358307
Summary:
Emit !heapallocsite in the metadata for calls to functions marked with
__declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug
info in codeview.
Reviewers: rnk
Subscribers: jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60237
llvm-svn: 357928