Slow definition generators may suspend lookups to temporarily release the
session lock, allowing unrelated lookups to proceed.
Using this functionality is discouraged: it is best to make definition
generation fast, rather than suspending the lookup. As a last resort where
this is not possible, suspension may be used.
An API to wrap ExecutionSession::lookup, this allows C API clients to use async
lookup.
The immediate motivation for adding this is to simplify upcoming
definition-generator unit tests.
As we're adding more tests that need to convert between C and C++ flag values
this commit adds helper functions to support this. This patch also updates the
CAPIDefinitionGenerator to use these new utilities.
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
ELF-based platforms currently support defining multiple static
initializer table sections with differing priorities, for example
.init_array.0 or .init_array.100; the default .init_array corresponds
to a priority of 65535. When building a shared library or executable,
the system linker normally sorts these sections and combines them into
a single .init_array section. This change adds the capability to
recognize ELF static initializers with priorities other than the
default, and to properly sort them by priority, to Orc and the Orc
runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127056
This change enables integrating orc::LLJIT with the ORCv2
platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc
runtime. Changes include:
- Adding SPS wrapper functions for the orc runtime's dlfcn emulation
functions, allowing initialization and deinitialization to be invoked
by LLJIT.
- Changing the LLJIT code generation default to add UseInitArray so
that .init_array constructors are generated for ELF platforms.
- Integrating the ORCv2 Platforms into lli, and adding a
PlatformSupport implementation to the LLJIT instance used by lli which
implements initialization and deinitialization by calling the new
wrapper functions in the runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D126492
Adds the aarch64 support in ELFNixPlatform. These are few simple changes, but it allows us to use the orc runtime in ELF/AARCH64 backend. It succesfully run the static initializers of stdlibc++ iostream so that "cout << Hello world" testcase starts to work.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127060
This code previously used cantFail, but both steps (resolution and emission)
can fail if the resource tracker associated with the
AbsoluteSymbolsMaterializationUnit is removed. Checking these errors is
necessary for correct error propagation.
Idiomatic llvm::Error usage can result in a FailedToMaterialize error tearing
down an ExecutionSession instance. Since the FailedToMaterialize error holds
SymbolStringPtrs and JITDylib references this leads to crashes when accessing
or logging the error.
This patch modifies FailedToMaterialize to retain the SymbolStringPool and
JITDylibs involved in the failure so that we can safely report an error message
to the client, even if the error tears down the session.
The contract for JITDylibs allows the getName method to be used even after the
session has been torn down, but no other JITDylib fields should be accessed via
the FailedToMaterialize error if the ssesion has been torn down. Logging the
error is guaranteed to be safe in all cases.
Clients are required to call ExecutionSession::endSession before destroying the
ExecutionSession. Failure to do so can lead to memory leaks and other difficult
to debug issues. Enforcing this requirement by assertion makes it easy to spot
or debug situations where the contract was not followed.
This changes the ELFNix platform Orc runtime to use, when available,
the __unw_add_dynamic_eh_frame_section interface provided by libunwind
for registering .eh_frame sections loaded by JITLink. When libunwind
is not being used for unwinding, the ELFNix platform detects this and
defaults to the __register_frame interface provided by libgcc_s.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114961
This adds resolver, indirection and trampoline stubs for riscv64,
allowing lazy compilation to work.
It assumes hard float extension exists. I don't know the proper way to detect it as Triple doesn't provide the interface to check riscv +f +d abi.
I am also not sure if orclazy tests should be enabled because lli needs an additional -codemodel=melany for tests to pass.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D122543
Removes a bogus dyn_cast_or_null that was breaking cast-expression handling when
parsing llvm.global_ctors.
The intent of this code was to identify Functions nested within cast
expressions, but the offending dyn_cast_or_null was actually blocking that:
Since a function is not a cast expression, we would set FuncC to null and break
the loop without finding the Function. The cast was not necessary either:
Functions are already Constants, and we didn't need to do anything
ConstantExpr-specific with FuncC, so we could just drop the cast.
Thanks to Jonas Hahnfeld for tracking this down.
http://llvm.org/PR54797
This patch removes the unintended resolution of locally scoped absolute symbols
(which was causing unexpected definition errors).
It stops using the JITSymbolFlags::Absolute flag (it isn't set or used elsewhere,
and causes mismatch-flags asserts), and adds JITSymbolFlags::Exported to default
scoped absolute symbols.
Finally, we now set the scope of absolute symbols correctly in
MachOLinkGraphBuilder.
Without this, EPCIndirectionUtils::getResolverBlockAddr (and lazy compilation
via EPC) won't work.
No test case: lli is still using LocalLazyCallThroughManager. I'll revisit this
soon when I look at adding lazy compilation support to the ORC runtime.
This patch updates the MachO platform (both the ORC MachOPlatform class and the
ORC-Runtime macho_platform.* files) to use allocation actions, rather than EPC
calls, to transfer the initializer information scraped from each linked object.
Interactions between the ORC and ORC-Runtime sides of the platform are
substantially redesigned to accomodate the change.
The high-level changes in this patch are:
1. The MachOPlatform::setupJITDylib method now calls into the runtime to set up
a dylib name <-> header mapping, and a dylib state object (JITDylibState).
2. The MachOPlatformPlugin builds an allocation action that calls the
__orc_rt_macho_register_object_platform_sections and
__orc_rt_macho_deregister_object_platform_sections functions in the runtime
to register the address ranges for all "interesting" sections in the object
being allocated (TLS data sections, initializers, language runtime metadata
sections, etc.).
3. The MachOPlatform::rt_getInitializers method (the entry point in the
controller for requests from the runtime for initializer information) is
replaced by MachOPlatform::rt_pushInitializers. The former returned a data
structure containing the "interesting" section address ranges, but these are
now handled by __orc_rt_macho_register_object_platform_sections. The new
rt_pushInitializers method first issues a lookup to trigger materialization
of the "interesting" sections, then returns the dylib dependence tree rooted
at the requested dylib for dlopen to consume. (The dylib dependence tree is
returned by rt_pushInitializers, rather than being handled by some dedicated
call, because rt_pushInitializers can alter the dependence tree).
The advantage of these changes (beyond the performance advantages of using
allocation actions) is that it moves more information about the materialized
portions of the JITDylib into the executor. This tends to make the runtime
easier to reason about, e.g. the implementation of dlopen in the runtime is now
recursive, rather than relying on recursive calls in the controller to build a
linear data structure for consumption by the runtime. This change can also make
some operations more efficient, e.g. JITDylibs can be dlclosed and then
re-dlopened without having to pull all initializers over from the controller
again.
In addition to the high-level changes, there are some low-level changes to ORC
and the runtime:
* In ORC, at ExecutionSession teardown time JITDylibs are now destroyed in
reverse creation order. This is on the assumption that the ORC runtime will be
loaded into an earlier dylib that will be used by later JITDylibs. This is a
short-term solution to crashes that arose during testing when the runtime was
torn down before its users. Longer term we will likely destroy dylibs in
dependence order.
* toSPSSerializable(Expected<T> E) is updated to explicitly initialize the T
value, allowing it to be used by Ts that have explicit constructors.
* The ORC runtime now (1) attempts to track ref-counts, and (2) distinguishes
not-yet-processed "interesting" sections from previously processed ones. (1)
is necessary for standard dlopen/dlclose emulation. (2) is intended as a step
towards better REPL support -- it should enable future runtime calls that
run only newly registered initializers ("dlopen_more", "dlopen_additions",
...?).
Calls to JITDylib's getDFSLinkOrder and getReverseDFSLinkOrder methods (both
static an non-static versions) are now valid to make on defunct JITDylibs, but
will return an error if any JITDylib in the link order is defunct.
This means that platforms can safely lookup link orders by name in response to
jit-dlopen calls from the ORC runtime, even if the call names a defunct
JITDylib -- the call will just fail with an error.
This is a counterpart to Platform::setupJITDylib, and is called when JITDylib
instances are removed (via ExecutionSession::removeJITDylib).
Upcoming MachOPlatform patches will use this to clear per-JITDylib data when
JITDylibs are removed.
runFinalizeActions takes an AllocActions vector and attempts to run its finalize
actions. If any finalize action fails then all paired dealloc actions up to the
failing pair are run, and the error(s) returned. If all finalize actions succeed
then a vector containing the dealloc actions is returned.
runDeallocActions takes a vector<WrapperFunctionCall> containing dealloc action
calls and runs them all, returning any error(s).
These helpers are intended to simplify the implementation of
JITLinkMemoryManager::InFlightAlloc::finalize and
JITLinkMemoryManager::deallocate overrides by taking care of execution (and
potential roll-back) of allocation actions.
This re-applies 133f86e954, which was reverted in
c5965a411c while I investigated bot failures.
The original failure contained an arithmetic conversion think-o (on line 419 of
EHFrameSupport.cpp) that could cause failures on 32-bit platforms. The issue
should be fixed in this patch.
This adds a GetObjectFileInterface callback member to
StaticLibraryDefinitionGenerator, and adds an optional argument for initializing
that member to StaticLibraryDefinitionGenerator's named constructors. If not
supplied, it will default to getObjectFileInterface from ObjectFileInterface.h.
To enable testing a `-hidden-l<x>` option is added to the llvm-jitlink tool.
This allows archives to be loaded with all contained symbol visibilities demoted
to hidden.
The ObjectLinkingLayer::setOverrideObjectFlagsWithResponsibilityFlags method is
(belatedly) hooked up, and enabled in llvm-jitlink when `-hidden-l<x>` is used
so that the demotion is also applied at symbol resolution time (avoiding any
"mismatched symbol flags" crashes).
Also moves object interface building functions out of Mangling.h and in to the
new ObjectFileInterfaces.h header, and updates the llvm-jitlink tool to use
custom object interfaces rather than a custom link layer.
ObjectLayer::add overloads are added to match the old signatures (which
do not take a MaterializationUnit::Interface). These overloads use the
standard getObjectFileInterface function to build an interface.
Passing a MaterializationUnit::Interface explicitly makes it easier to alter
the effective interface of the object file being added, e.g. by changing symbol
visibility/linkage, or renaming symbols (in both cases the changes will need to
be mirrored by a JITLink pass at link time to update the LinkGraph to match the
explicit interface). Altering interfaces in this way can be useful when lazily
compiling (e.g. for renaming function bodies) or emulating linker options (e.g.
demoting all symbols to hidden visibility to emulate -load_hidden).
If all symbols in a lookup match before we reach the end of the search order
then bail out of the search-order loop early.
This should reduce unnecessary contention on the session lock and improve
readability of the debug logs.
Most of `MemoryBuffer` interfaces expose a `RequiresNullTerminator` parameter that's being used to:
* determine how to open a file (`mmap` vs `open`),
* assert newly initialized buffer indeed has an implicit null terminator.
This patch adds the paramater to the `SmallVectorMemoryBuffer` constructors, meaning:
* null terminator can now be added to `SmallVector`s that didn't have one before,
* `SmallVectors` that had a null terminator before keep it even after the move.
In line with existing code, the new parameter is defaulted to `true`. This patch makes sure all calls to the `SmallVectorMemoryBuffer` constructor set it to `false` to preserve the current semantics.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115331