Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.
Intended result:
For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:
- wraps the contained initializations in a control variable to ensure that
the inits only happen once, even if a module is imported many times by
imports of the main unit.
- calls module initializers for imported modules first. Note that the
order of module import is not significant, and therefore neither is the
order of imported module initializers.
- We then call initializers for the Global Module Fragment (if present)
- We then call initializers for the current module.
- We then call initializers for the Private Module Fragment (if present)
For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:
- Calls the initializers for any imported modules first.
- Then proceeds as normal with remaining inits.
For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.
Implementation:
- We provide the module pointer in the AST Context so that CodeGen can act
on it and its sub-modules.
- We need to account for module build lines like this:
` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`
- in order to do this, we add to ParseAST to set the module pointer in
the ASTContext, once we establish that this is a module build and we
know the module pointer. To be able to do this, we make the query for
current module public in Sema.
- In CodeGen, we determine if the current build requires a CXX20-style module
init and, if so, we defer any module initializers during the "Eagerly
Emitted" phase.
- We then walk the module initializers at the end of the TU but before
emitting deferred inits (which adds any hidden and static ones, fixing
https://github.com/llvm/llvm-project/issues/51873 ).
- We then proceed to emit the deferred inits and continue to emit the CXX
init function.
Differential Revision: https://reviews.llvm.org/D126189
Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated. The self (global
value) references inside `Prologue Data` is still pointing to the
original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.
This patch detaches the information from function `Prologue Data`
and attaches it to a function metadata node.
This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D115844
Add option -fhip-kernel-arg-name to emit kernel argument
name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Brian Sumner
Differential Revision: https://reviews.llvm.org/D128022
This reverts commits:
d3ddc251acd90eecff5c
It turned out there're some options turned on that leaks the memory
intentionally, which fires the asan builds after the patch being
applied. The issue has been fixed in
7bc00ce5cd, so reland it.
Below is the original commit message:
The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.
This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:
clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();
JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }
Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
This reverts commits:
d3ddc251acd90eecff5c
This relands below commit with asan fix:
The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.
This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:
clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();
JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }
Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D127730
RE-LAND (reverts a revert):
This reverts commit 8e1f47b596.
This patch adds generation of sanitizer metadata attributes (which were
added in D126100) to the clang frontend.
We still currently generate the llvm.asan.globals that's consumed by
the IR pass, but the plan is to eventually migrate off of that onto
purely debuginfo and these IR attributes.
Reviewed By: vitalybuka, kstoimenov
Differential Revision: https://reviews.llvm.org/D126929
This patch adds generation of sanitizer metadata attributes (which were
added in D126100) to the clang frontend.
We still currently generate the `llvm.asan.globals` that's consumed by
the IR pass, but the plan is to eventually migrate off of that onto
purely debuginfo and these IR attributes.
Reviewed By: vitalybuka, kstoimenov
Differential Revision: https://reviews.llvm.org/D126929
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).
Three values are provided for the option:
* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility
This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.
This relands commit: 8c8a2679a2
with fixes for the compile time and assert problems that were reported
by:
* making shouldMapVisibilityToDLLExport inline and provide an early return
in the case where no mapping is in effect (aka non-AIX platforms)
* don't try to export RTTI types which we will give internal linkage to
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126340
The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.
This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:
clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();
JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }
Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D126781
This caused assertions, see comment on the code review:
llvm/clang/lib/AST/Decl.cpp:1510:
clang::LinkageInfo clang::LinkageComputer::getLVForDecl(const clang::NamedDecl *, clang::LVComputationKind):
Assertion `D->getCachedLinkage() == LV.getLinkage()' failed.
> The option mdefault-visibility-export-mapping is created to allow
> mapping default visibility to an explicit shared library export
> (e.g. dllexport). Exactly how and if this is manifested is target
> dependent (since it depends on how they map dllexport in the IR).
>
> Three values are provided for the option:
>
> * none: the default and behavior without the option, no additional export linkage information is created.
> * explicit: add the export for entities with explict default visibility from the source, including RTTI
> * all: add the export for all entities with default visibility
>
> This option is useful for targets which do not export symbols as part of
> their usual default linkage behaviour (e.g. AIX), such targets
> traditionally specified such information in external files (e.g. export
> lists), but this mapping allows them to use the visibility information
> typically used for this purpose on other (e.g. ELF) platforms.
>
> Reviewed By: MaskRay
>
> Differential Revision: https://reviews.llvm.org/D126340
This reverts commit 8c8a2679a2.
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).
Three values are provided for the option:
* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility
This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126340
Refactor the code that handles the align clause of 'omp allocate' so
it can be used with globals as well as local variables.
Differential Revision: https://reviews.llvm.org/D126426
CUDA requires that static variables be visible to the host when
offloading. However, The standard semantics of a stiatc variable dictate
that it should not be visible outside of the current file. In order to
access it from the host we need to perform "externalization" on the
static variable on the device. This requires generating a semi-unique
name that can be affixed to the variable as to not cause linker errors.
This is currently done using the CUID functionality, an MD5 hash value
set up by the clang driver. This allows us to achieve is mostly unique
ID that is unique even between multiple compilations of the same file.
However, this is not always availible. Instead, this patch uses the
unique ID from the file to generate a unique symbol name. This will
create a unique name that is consistent between the host and device side
compilations without requiring the CUID to be entered by the driver. The
one downside to this is that we are no longer stable under multiple
compilations of the same file. However, this is a very niche use-case
and is not supported by Nvidia's CUDA compiler so it likely to be good
enough.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125904
The DXIL validator version option(/validator-version) decide the validator version when compile hlsl.
The format is major.minor like 1.0.
In normal case, the value of validator version should be got from DXIL validator. Before we got DXIL validator ready for llvm/main, DXIL validator version option is added first to set validator version.
It will affect code generation for DXIL, so it is treated as a code gen option.
A new member std::string DxilValidatorVersion is added to clang::CodeGenOptions.
Then CGHLSLRuntime is added to clang::CodeGenModule.
It is used to translate clang::CodeGenOptions::DxilValidatorVersion into a ModuleFlag under key "dx.valver" at end of clang code generation.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D123884
clang to emit DWARF information for global alias variable as
DW_TAG_imported_declaration. This change also handles nested
(recursive) imported declarations.
Reviewed by: dblaikie, aprantl
Differential Revision: https://reviews.llvm.org/D120989
This change merges code for emit of target and target_clones multiversion
resolver functions and, in doing so, corrects handling of target_clones
functions that are declared but not defined. Previously, a use of such
a target_clones function would result in an attempted emit of an ifunc
that referenced an undefined resolver function. Ifunc references to
undefined resolver functions are not allowed and, when the LLVM verifier
is not disabled (via '-disable-llvm-verifier'), resulted in the verifier
issuing a "IFunc resolver must be a definition" error and aborting the
compilation. With this change, ifuncs and resolver function definitions
are always emitted for used target_clones functions regardless of whether
the target_clones function is defined (if the function is defined, then
the ifunc and resolver are emitted regardless of whether the function is
used).
This change has the side effect of causing target_clones variants and
resolver functions to be emitted in a different order than they were
previously. This is harmless and is reflected in the updated tests.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122958
Previously, GetOrCreateMultiVersionResolver() required the caller to provide
a GlobalDecl along with an llvm::type and FunctionDecl. The latter two can be
cheaply obtained from the first, and the llvm::type parameter is not always
used, so requiring the caller to provide them was unnecessary and created the
possibility that callers would pass an inconsistent set. This change simplifies
the interface to only require the GlobalDecl value.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D122956
We expect that `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs:
See the bug report here: https://github.com/llvm/llvm-project/issues/54549
However, we were diagnosing this as 'not defined', because the
ifunc's attempt to look up its resolver would generate a declared IR
function.
Additionally, as background, the way we allow these static extern "C"
functions to work in inline assembly is by making an alias with the C
mangling in MOST situations to the version we emit with
internal-linkage/mangling.
The problem here was multi-fold: First- We generated the alias after the
ifunc was checked, so the function by that name didn't exist yet.
Second, the ifunc's generation caused a symbol to exist under the name
of the alias already (the declared function above), which suppressed the
alias generation.
This patch fixes all of this by moving the checking of ifuncs/CFE aliases
until AFTER we have generated the extern-C alias. Then, it does a
'fixup' around the GlobalIFunc to make sure we correct the reference.
Differential Revision: https://reviews.llvm.org/D122608
This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) before calling the builtin.
It will be used to implement std::source_location in libc++ in a
future change. The builtin is compatible with GCC's implementation,
and libstdc++'s usage. An intentional divergence is that GCC declares
the builtin's return type to be `const void*` (for
ease-of-implementation reasons), while Clang uses the actual type,
`const std::source_location::__impl*`.
In order to support this new functionality, I've also added a new
'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after
MSGuidDecl, and is used to represent a generic concept of an lvalue
constant with global scope, deduplicated by its value. It's possible
that MSGuidDecl itself, or some of the other similar sorts of things
in Clang might be able to be refactored onto this more-generic
concept, but there's enough special-case weirdness in MSGuidDecl that
I gave up attempting to share code there, at least for now.
Finally, for compatibility with libstdc++'s <source_location> header,
I've added a second exception to the "cannot cast from void* to T* in
constant evaluation" rule. This seems a bit distasteful, but feels
like the best available option.
Reviewers: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D120159
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D122504
CodeGenModule::DeferredDecls std::map::operator[] seem to be hot especially while code generating huge compilation units.
In such cases using DenseMap instead gives observable compile time improvement. Patch was tested on Linux build with default config acting as benchmark.
Build was performed on isolated CPU cores in silent x86-64 Linux environment following: https://llvm.org/docs/Benchmarking.html#linux rules.
Compile time statistics diff produced by perf and time before and after change are following:
instructions -0.15%, cycles -0.7%, max-rss +0.65%.
Using StringMap instead DenseMap doesn't bring any visible gains.
Differential Revision: https://reviews.llvm.org/D118169
Cases where there is a mangling of a cpu-dispatch/cpu-specific function
before the function becomes 'multiversion' (such as a member function)
causes the wrong name to be emitted for one of the variants/resolver,
since the name is cached. Make sure we invalidate the cache in
cpu-dispatch/cpu-specific modes, like we previously did for just target
multiversioning.
Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.
This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.
Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Depends on D108478
Reviewed By: pcc, rjmccall
Differential Revision: https://reviews.llvm.org/D108479
See discussion in D51650, this change was a little aggressive in an
error while doing a 'while we were here', so this removes that error
condition, as it is apparently useful.
This reverts commit bb4934601d.
As discussed here: https://lwn.net/Articles/691932/
GCC6.0 adds target_clones multiversioning. This functionality is
an odd cross between the cpu_dispatch and 'target' MV, but is compatible
with neither.
This attribute allows you to list all options, then emits a separately
optimized version of each function per-option (similar to the
cpu_specific attribute). It automatically generates a resolver, just
like the other two.
The mangling however, is... ODD to say the least. The mangling format
is:
<normal_mangling>.<option string>.<option ordinal>.
Differential Revision:https://reviews.llvm.org/D51650
This patch removes the assumption propagation that was added in D110655
primarily to get assumption informatino on opaque call sites for
optimizations. The analysis done in D111445 allows us to do this more
intelligently in the back-end.
Depends on D111445
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111463
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
Pass a LangAS instead of a target address space to
GetOrCreateLLVMGlobal, to remove a place where the frontend assumes that
target address space 0 is special.
Differential Revision: https://reviews.llvm.org/D108445
As it was discovered in post-commit feedback
for 0aa0458f14,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.
While it would be good to fix this properly,
and keep annotating them on thunks,
i've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.
Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.
Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.
Differential Revision: https://reviews.llvm.org/D89909
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.
This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.
Differential Revision: https://reviews.llvm.org/D96109
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.
In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.
Differential Revision: https://reviews.llvm.org/D81678
explicitly emitting retainRV or claimRV calls in the IR
This reapplies ed4718eccb, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5f.
Original commit message:
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This caused miscompiles of Chromium tests for iOS due clobbering of live
registers. See discussion on the code review for details.
> Background:
>
> This fixes a longstanding problem where llvm breaks ARC's autorelease
> optimization (see the link below) by separating calls from the marker
> instructions or retainRV/claimRV calls. The backend changes are in
> https://reviews.llvm.org/D92569.
>
> https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
>
> What this patch does to fix the problem:
>
> - The front-end adds operand bundle "clang.arc.attachedcall" to calls,
> which indicates the call is implicitly followed by a marker
> instruction and an implicit retainRV/claimRV call that consumes the
> call result. In addition, it emits a call to
> @llvm.objc.clang.arc.noop.use, which consumes the call result, to
> prevent the middle-end passes from changing the return type of the
> called function. This is currently done only when the target is arm64
> and the optimization level is higher than -O0.
>
> - ARC optimizer temporarily emits retainRV/claimRV calls after the calls
> with the operand bundle in the IR and removes the inserted calls after
> processing the function.
>
> - ARC contract pass emits retainRV/claimRV calls after the call with the
> operand bundle. It doesn't remove the operand bundle on the call since
> the backend needs it to emit the marker instruction. The retainRV and
> claimRV calls are emitted late in the pipeline to prevent optimization
> passes from transforming the IR in a way that makes it harder for the
> ARC middle-end passes to figure out the def-use relationship between
> the call and the retainRV/claimRV calls (which is the cause of
> PR31925).
>
> - The function inliner removes an autoreleaseRV call in the callee if
> nothing in the callee prevents it from being paired up with the
> retainRV/claimRV call in the caller. It then inserts a release call if
> claimRV is attached to the call since autoreleaseRV+claimRV is
> equivalent to a release. If it cannot find an autoreleaseRV call, it
> tries to transfer the operand bundle to a function call in the callee.
> This is important since the ARC optimizer can remove the autoreleaseRV
> returning the callee result, which makes it impossible to pair it up
> with the retainRV/claimRV call in the caller. If that fails, it simply
> emits a retain call in the IR if retainRV is attached to the call and
> does nothing if claimRV is attached to it.
>
> - SCCP refrains from replacing the return value of a call with a
> constant value if the call has the operand bundle. This ensures the
> call always has at least one user (the call to
> @llvm.objc.clang.arc.noop.use).
>
> - This patch also fixes a bug in replaceUsesOfNonProtoConstant where
> multiple operand bundles of the same kind were being added to a call.
>
> Future work:
>
> - Use the operand bundle on x86-64.
>
> - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
> calls with the operand bundles.
>
> rdar://71443534
>
> Differential Revision: https://reviews.llvm.org/D92808
This reverts commit ed4718eccb.
An global value in the `llvm.used` list does not have GC root semantics on ELF targets.
This will be changed in a subsequent backend patch.
Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to
prevent undesired GC root semantics.
Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets.
GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections",
which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior).
For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that
the future LLD change will not affect them.
rnk kindly categorized the changes:
```
ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac.
MS C++ ABI stuff: wants GC root semantics, no change
OpenMP: unsure, but GC root semantics probably don't hurt
CodeGenModule: affected in this patch to *not* use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics
Coverage: Probably want GC root semantics
CGExpr.cpp: refers to LTO, wants GC root
CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run)
CGDecl.cpp: Changed in this patch for __attribute__((used))
```
Differential Revision: https://reviews.llvm.org/D97446