This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
With D135642 ignoring unregistered symbols, isTransitiveUsedByMetadataOnly added
by D101512 is no longer needed (the operation is potentially slow). There is a
`.addrsig_sym` directive for an only-used-by-metadata symbol but it does not
emit an entry.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D138362
This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad70152.
This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:
First, no exception can move in or out of _try region., i.e., no "potential faulty
instruction can be moved across _try boundary.
Second, the order of exceptions for instructions 'directly' under a _try must be preserved
(not applied to those in callees).
Finally, global states (local/global/heap variables) that can be read outside of _try region
must be updated in memory (not just in register) before the subsequent exception occurs.
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others. If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad70152).
Part-2 : LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.
The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.htmlhttps://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D102817/new/
Extends the Asm reader/writer to support reading and writing the
'.memtag' directive (including allowing it on internal global
variables). Also add some extra tooling support, including objdump and
yaml2obj/obj2yaml.
Test that the sanitize_memtag IR attribute produces the expected asm
directive.
Uses the new Aarch64 MemtagABI specification
(https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst)
to identify symbols as tagged in object files. This is done using a
R_AARCH64_NONE relocation that identifies each tagged symbol, and these
relocations are tagged in a special SHT_AARCH64_MEMTAG_GLOBALS_STATIC
section. This signals to the linker that the global variable should be
tagged.
Reviewed By: fmayer, MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D128958
This patch makes code less readable but it will clean itself after all functions are converted.
Differential Revision: https://reviews.llvm.org/D138665
The existing behaviors and callbacks were overlapping and had very
confusing semantics: beginBasicBlock/endBasicBlock were not always
called, beginFragment/endFragment seemed like they were meant to mean
the same thing, but were slightly different, etc. This resulted in
confusing semantics, virtual method overloads, and control flow.
Remove the above, and replace with new beginBasicBlockSection and
endBasicBlockSection callbacks. And document them.
These are always called before the first and after the last blocks in
a function, even when basic-block-sections are disabled.
Previously, we used SHA-1 for hashing the CodeView type records.
SHA-1 in `GloballyHashedType::hashType()` is coming top in the profiles. By simply replacing with BLAKE3, the link time is reduced in our case from 15 sec to 13 sec. I am only using MSVC .OBJs in this case. As a reference, the resulting .PDB is approx 2.1GiB and .EXE is approx 250MiB.
Differential Revision: https://reviews.llvm.org/D137101
basic block section cases
MachineBlockPlacement pass sets an alignment attribute to the loop
header MBB and this attribute will lead to an alignment directive during
emitting asm. In the case of the basic block section, the alignment
directive is put before the section label, and thus the alignment is set
to the predecessor of the loop header, which is not what we expect and
increases the code size (both inserting nop and set section alignment).
Reviewed By: rahmanl
Differential Revision: https://reviews.llvm.org/D137535
As discussed in https://reviews.llvm.org/D136474 -fmessage-length
creates problems with reproduciability in the PDB files.
This patch just drops that argument when writing the PDB file.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D137322
Currently pseudo probe encoding for a function is like:
- For the first probe, a relocation from it to its physical position in the code body
- For subsequent probes, an incremental offset from the current probe to the previous probe
The relocation could potentially cause relocation overflow during link time. I'm now replacing it with an offset from the first probe to the function start address.
A source function could be lowered into multiple binary functions due to outlining (e.g, coro-split). Since those binary function have independent link-time layout, to really avoid relocations from .pseudo_probe sections to .text sections, the offset to replace with should really be the offset from the probe's enclosing binary function, rather than from the entry of the source function. This requires some changes to previous section-based emission scheme which now switches to be function-based. The assembly form of pseudo probe directive is also changed correspondingly, i.e, reflecting the binary function name.
Most of the source functions end up with only one binary function. For those don't, a sentinel probe is emitted for each of the binary functions with a different name from the source. The sentinel probe indicates the binary function name to differentiate subsequent probes from the ones from a different binary function. For examples, given source function
```
Foo() {
…
Probe 1
…
Probe 2
}
```
If it is transformed into two binary functions:
```
Foo:
…
Foo.outlined:
…
```
The encoding for the two binary functions will be separate:
```
GUID of Foo
Probe 1
GUID of Foo
Sentinel probe of Foo.outlined
Probe 2
```
Then probe1 will be decoded against binary `Foo`'s address, and Probe 2 will be decoded against `Foo.outlined`. The sentinel probe of `Foo.outlined` makes sure there's not accidental relocation from `Foo.outlined`'s probes to `Foo`'s entry address.
On the BOLT side, to be minimal intrusive, the pseudo probe re-encoding sticks with the old encoding format. This is fine since unlike linker, Bolt processes the pseudo probe section as a whole and it is free from relocation overflow issues.
The change is downwards compatible as long as there's no mixed use of the old encoding and the new encoding.
Reviewed By: wenlei, maksfb
Differential Revision: https://reviews.llvm.org/D135912
Differential Revision: https://reviews.llvm.org/D135914
Differential Revision: https://reviews.llvm.org/D136394
When a CU attaches some ranges for a subprogram or an inlined code, the CU should be that of the subprogram/inlined code that was emitted.
If not, then these emitted ranges will use the incorrect base of the CU in `emitRangeList`.
A reproducible example is:
When linking these two LLVM IRs, dsymutil will report no mapping for range or inconsistent range data warnings.
`foo.swift`
```swift
import AppKit.NSLayoutConstraint
public class Foo {
public var c: Int {
get {
Int(NSLayoutConstraint().constant)
}
set {
}
}
}
```
`main.swift`
```swift
// no mapping for range
let f: Foo! = nil
// inconsistent range data
//let l: Foo = Foo()
```
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D136039
This was scanning through def operands looking for the
symbol operand. This is pointless because the symbol is always
the first operand as enforced by the verifier, and all operands
are implicit.
Regression from D131400: cross-language LTO causes a crash in the
compiler on the NULL deref of Scope in `isa` call when Rust IR is
involved. Presumably, this might affect other languages too, and
even Rust itself without cross-language LTO when the Rust compiler
switched to LLVM 16.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D134616
Sometimes when a function is inlined into a different CU, `llvm-dwarfdump --verify` would find an inlined subroutine with an invalid abstract origin. This is because `DwarfUnit::addDIEEntry()` will incorrectly assume the inlined subroutine and the abstract origin are from the same CU if it can't find the CU for the inlined subroutine.
In the added test, the inlined subroutine for `bar()` is created before the CU for `B.swift` is created, so it tries to point to `goo()` in the wrong CU. Interestingly, if we swap the order of the two functions then we don't see a crash since the module for `goo()` is created first.
The fix is to give a parent DIE to `ScopeDIE` before calling `addDIEEntry()` so that its CU can be found. Luckily, `constructInlinedScopeDIE()` is only called once so we can pass it the DIE of the scope's parent and give it a child just after it's created.
`constructInlinedScopeDIE()` should always return a DIE, so assert that it is not null.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D135114
The accessibility level of a typedef or using declaration in a
struct or class was being lost when producing debug information.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D134339
__declspec(safebuffers) is equivalent to
__attribute__((no_stack_protector)). This information is recorded in
CodeView.
While we are here, add support for strict_gs_check.
I'm planning to deprecate and eventually remove llvm::empty.
I thought about replacing llvm::empty(x) with std::empty(x), but it
turns out that all uses can be converted to x.empty(). That is, no
use requires the ability of std::empty to accept C arrays and
std::initializer_list.
Differential Revision: https://reviews.llvm.org/D133677
Fixes#50098
LLVM uses X19 as the frame base pointer, if it needs to. Meaning you
can get warnings if you clobber that with inline asm.
However, it doesn't explain why. The frame base register is not part
of the ABI so it's pretty confusing why you get that warning out of the blue.
This adds a method to explain a reserved register with X19 as the first one.
The logic is the same as getReservedRegs.
I could have added a return parameter to isASMClobberable and friends
but found that there's a lot of things that call isReservedReg in various
ways.
So while one more method on the pile isn't great design, it is simpler
right now to do it this way and only pay the cost if you are actually using
a reserved register.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D133213
Interpret MD_pcsections in AsmPrinter emitting the requested metadata to
the associated sections. Functions and normal instructions are handled.
Differential Revision: https://reviews.llvm.org/D130879
This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment.
The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. Problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter so far, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections.
This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionaly not a problem however, as emitting an empty `StackMaps` is a no-op.
Differential Revision: https://reviews.llvm.org/D132708
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
While this does not matter for most targets, when building for Arm Morello,
we have to mark the symbol as a function and add size information, so that
LLD can correctly evaluate relocations against the local symbol.
Since Morello is an out-of-tree target, I tried to reproduce this with
in-tree backends and with the previous reviews applied this results in
a noticeable difference when targeting Thumb.
Background: Morello uses a method similar Thumb where the encoding mode is
specified in the LSB of the symbol. If we don't mark the target as a
function, the relocation will not have the LSB set and calls will end up
using the wrong encoding mode (which will almost certainly crash).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D131429
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Relands 67504c9549 with a fix for
32-bit builds.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code. Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.
Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.
So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.
Discovered while trying to sort out related stuff on D102817.
Differential Revision: https://reviews.llvm.org/D124697
Static variables declared within a routine or lexical block should
be emitted with a non-qualified name. This allows the variables to
be visible to the Visual Studio watch window.
Differential Revision: https://reviews.llvm.org/D131400
When no landing pads exist for a function, `@LPStart` is undefined and must be omitted.
EH table is generally not emitted for functions without landing pads, except when the personality function is uknown (`!isNoOpWithoutInvoke(classifyEHPersonality(Per))`). In that case, we must omit `@LPStart` even when machine function splitting is enabled.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D131626
DebugLocEntry assumes that it either contains 1 item that has no fragment
or many items that all have fragments (see the assert in addValues).
When EXPENSIVE_CHECKS is enabled, _GLIBCXX_DEBUG is defined. On a few machines
I've checked, this causes std::sort to call the comparator even
if there is only 1 item to sort. Perhaps to check that it is implemented
properly ordering wise, I didn't find out exactly why.
operator< for a DbgValueLoc will crash if this happens because the
optional Fragment is empty.
Compiler/linker/optimisation level seems to make this happen
or not. So I've seen this happen on x86 Ubuntu but the buildbot
for release EXPENSIVE_CHECKS did not have this issue.
Add an explicit check whether we have 1 item.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D130156
When using a ptrtoint to a size larger than the pointer width in a
global initializer, we currently create a ptr & low_bit_mask style
MCExpr, which will later result in a relocation error during object
file emission.
This patch rejects the constant expression already during
lowerConstant(), which results in a much clearer error message
that references the constant expression at fault.
This fixes https://github.com/llvm/llvm-project/issues/56400,
for certain definitions of "fix".
Differential Revision: https://reviews.llvm.org/D130366
Move this out of the switch, so that different branches can
indicate an error by breaking out of the switch. This becomes
important if there are more than the two current error cases.