llvm-project

Commit Graph

Author	SHA1	Message	Date
Eli Friedman	cfd2c5ce58	Untangle the mess which is MachineBasicBlock::hasAddressTaken(). There are two different senses in which a block can be "address-taken". There can be a BlockAddress involved, which means we need to map the IR-level value to some specific block of machine code. Or there can be constructs inside a function which involve using the address of a basic block to implement certain kinds of control flow. Mixing these together causes a problem: if target-specific passes are marking random blocks "address-taken", if we have a BlockAddress, we can't actually tell which MachineBasicBlock corresponds to the BlockAddress. So split this into two separate bits: one for BlockAddress, and one for the machine-specific bits. Discovered while trying to sort out related stuff on D102817. Differential Revision: https://reviews.llvm.org/D124697	2022-08-16 16:15:44 -07:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Kazu Hirata	ba0407ba86	[llvm] Use range-based for loops (NFC)	2022-08-07 00:16:21 -07:00
Bill Wendling	8f2ec974d1	[X86] Move target-generic code into CodeGen [NFC] This code is the same for all platforms. Differential Revision: https://reviews.llvm.org/D124566	2022-04-27 15:37:28 -07:00
Matt Arsenault	3659780d58	MachineModuleInfo: Remove UsesMorestackAddr This is x86 specific, and adds statefulness to MachineModuleInfo. Instead of explicitly tracking this, infer if we need to declare the symbol based on the reference previously inserted. This produces a small change in the output due to the move from AsmPrinter::doFinalization to X86's emitEndOfAsmFile. This will now be moved relative to other end of file fields, which I'm assuming doesn't matter (e.g. the __morestack_addr declaration is now after the .note.GNU-split-stack part) This also produces another small change in code if the module happened to define/declare __morestack_addr, but I assume that's invalid and doesn't really matter.	2022-04-20 11:10:20 -04:00
Matt Arsenault	d7938b1a81	MachineModuleInfo: Move HasSplitStack handling to AsmPrinter This is used to emit one field in doFinalization for the module. We can accumulate this when emitting all individual functions directly in the AsmPrinter, rather than accumulating additional state in MachineModuleInfo. Move the special case behavior predicate into MachineFrameInfo to share it. This now promotes it to generic behavior. I'm assuming this is fine because no other target implements adjustForSegmentedStacks, or has tests using the split-stack attribute.	2022-04-20 10:54:29 -04:00
Fangrui Song	ac6878b330	[X86] Set frame-setup/frame-destroy on prologue/epilogue CFI instructions This approach is used by AArch64/RISCV to make frame-setup/frame-destroy instructions contiguous instead of being interleaved by CFI instructions. Code checking `MBBI->getFlag(MachineInstr::FrameSetup) \|\| MBBI->isCFIInstruction()` can be simplified to just check FrameSetup. This helps locate all CFI instructions in the prologue, which can be handy to use .cfi_remember_state/.cfi_restore_state to decrease unwind table size (D114545). Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122541	2022-03-31 23:04:50 -07:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Bill Wendling	74aa44a887	[X86] Zero out the 32-bit GPRs explicitly This should ensure that only the 32-bit xors are emitted, and not the 64-bit xors. Differential Revision: https://reviews.llvm.org/D119523	2022-02-10 23:09:00 -08:00
Reid Kleckner	f3481f43bb	[X86] Only force FP usage in the presence of pushf/popf on Win64 This ensures that the Windows unwinder will work at every instruction boundary, and allows other targets to read and write flags without setting up a frame pointer. Fixes GH-46875 Differential Revision: https://reviews.llvm.org/D119391	2022-02-09 18:23:16 -08:00
Bill Wendling	d295a53a92	[X86] Specify Undef for the registers we xor Fixes expensive check failures from D110869.	2022-02-09 02:06:12 -08:00
Bill Wendling	deaf22bc0e	[X86] Implement -fzero-call-used-regs option The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attribute: zero_call_used_regs. The two upper categories are: - "used": Zero out used registers. - "all": Zero out all registers, whether used or not. The individual options are: - "skip": Don't zero out any registers. This is the default. - "used": Zero out all used registers. - "used-arg": Zero out used registers that are used for arguments. - "used-gpr": Zero out used registers that are GPRs. - "used-gpr-arg": Zero out used GPRs that are used as arguments. - "all": Zero out all registers. - "all-arg": Zero out all registers used for arguments. - "all-gpr": Zero out all GPRs. - "all-gpr-arg": Zero out all GPRs used for arguments. This is used to help mitigate Return-Oriented Programming exploits. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110869	2022-02-08 17:42:54 -08:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
Alexander Shaposhnikov	22430ede7e	[CodeGen] Rename emitCalleeSavedFrameMoves This diff renames emitCalleeSavedFrameMoves to avoid conflicts with non-virtual methods of derived classes having the same name but different semantics. E.g. the class AArch64FrameLowering used to have (non-virtual) "emitCalleeSavedFrameMoves" but it started to override TargetFrameLowering::emitCalleeSavedFrameMoves after https://github.com/llvm/llvm-project/commit/c3e6555616 though its usage and semantics didn't change. P.S. for x86 there was no conflict because the signature of non-virtual X86FrameLowering::emitCalleeSavedFrameMoves is different Test plan: make check-all Differential revision: https://reviews.llvm.org/D114140	2022-01-10 01:33:04 +00:00
Erik Desjardins	a8ac117d98	[X86] add dwarf information for loop stack probe This patch is based on https://reviews.llvm.org/D99585. While inside the stack probing loop, temporarily change the CFA to be based on r11/eax, which are already used to hold the loop bound. The stack pointer cannot be used for CFI here as it changes during the loop, so it does not have a constant offset to the CFA. Co-authored-by: YangKeao <keao.yang@yahoo.com> Reviewed By: nagisa Differential Revision: https://reviews.llvm.org/D116628	2022-01-07 15:02:59 +08:00
Erik Desjardins	a390c9905d	[X86] Improve selection of the mov instruction in FrameLowering MOV64ri results in a significantly longer encoding, and use of this operator is fairly avoidable as we can always check the size of the immediate we're using. This is an updated version of D99045. Co-authored-by: Simonas Kazlauskas <git@kazlauskas.me> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116458	2022-01-03 11:10:16 -08:00
Kazu Hirata	483499670e	[Target] Use llvm::reverse (NFC)	2021-12-12 08:34:24 -08:00
Jeremy Morse	7093c81010	[DebugInfo][InstrRef][X86] Instrument expanded DYN_ALLOCAs If we have a DYN_ALLOCA_* instruction, it will eventually be expanded to a stack probe and subtract-from-SP. Add debug-info instrumentation to X86FrameLowering::emitStackProbe so that it can redirect debug-info for the DYN_ALLOCA to the lowered stack probe. In practice, this means putting an instruction number label either the call instruction to _chkstk for win32, or more commonly on the subtract from SP instruction. The two tests added cover both of these cases. Differential Revision: https://reviews.llvm.org/D114452	2021-11-30 11:50:05 +00:00
Jeremy Morse	bfadc5dcbf	[DebugInfo][InstrRef] Cope with win32 calls changing SP in LiveDebugValues Almost all of the time, call instructions don't actually lead to SP being different after they return. An exception is win32's _chkstk, which which implements stack probes. We need to recognise that as modifying SP, so that copies of the value are tracked as distinct vla pointers. This patch adds a target frame-lowering hook to see whether stack probe functions will modify the stack pointer, store that in an internal flag, and if it's true then scan CALL instructions to see whether they're a stack probe. If they are, recognise them as defining a new stack-pointer value. The added test exercises this behaviour: two calls to _chkstk should be considered as producing two different values. Differential Revision: https://reviews.llvm.org/D114443	2021-11-24 19:56:21 +00:00
Kazu Hirata	ea5421bd0d	[llvm] Use range-based for loops (NFC)	2021-11-21 19:24:15 -08:00
Luo, Yuanke	c4dba47196	[X86][AMX] Don't emit tilerelease for old AMX instrisic. We should avoid mixing old AMX instrinsic with new AMX intrinsic. For old AMX intrinsic, user is responsible for invoking tile release. This patch is to check if there is any tile config generated by compiler. If so it emit tilerelease instruction, otherwise it don't emit the instruction. Differential Revision: https://reviews.llvm.org/D114066	2021-11-18 09:28:32 +08:00
Kazu Hirata	6fe949c4ed	[Target, Transforms] Use StringRef::contains (NFC)	2021-10-22 08:52:33 -07:00
Doug Gregor	a773db7d76	Add a command-line flag to control the Swift extended async frame info. Introduce a new command-line flag `-swift-async-fp={auto\|always\|never}` that controls how code generation sets the Swift extended async frame info bit. There are three possibilities: * `auto`: which determines how to set the bit based on deployment target, either statically or dynamically via `swift_async_extendedFramePointerFlags`. * `always`: the default, always set the bit statically, regardless of deployment target. * `never`: never set the bit, regardless of deployment target. Patch by Doug Gregor <dgregor@apple.com> Reviewed By: doug.gregor Differential Revision: https://reviews.llvm.org/D109392	2021-09-16 06:57:45 -07:00
Tim Northover	5d070c8259	SwiftAsync: use runtime-provided flag for extended frame if back-deploying When back-deploying Swift async code we can't always toggle the flag showing an extended frame is present because it will confuse unwinders on systems released before this feature. So in cases where the code might run there, we `or` in a mask provided by the runtime (as an absolute symbol) telling us whether the unwinders can cope. When deploying only for newer OSs, we can still hard-code the bit-set for greater efficiency.	2021-09-13 13:54:46 +01:00
Elliot Saba	ae8507b0df	[X86] Don't clobber EBX in stackprobes On X86, the stackprobe emission code chooses the `R11D` register, which is illegal on i686. This ends up wrapping around to `EBX`, which does not get properly callee-saved within the stack probing prologue, clobbering the register for the callers. We fix this by explicitly using `EAX` as the stack probe register. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D109203	2021-09-07 15:00:44 -04:00
Fangrui Song	76529b4468	[X86] Simplify condition guarding emitCalleeSavedFrameMoves. NFC	2021-09-06 15:54:02 -07:00
Fangrui Song	4f1e410a1b	[X86] Simplify two hasFP(F). NFC	2021-09-06 15:47:40 -07:00
Tim Northover	d7b0e5525a	X86: fix frame offset calculation with mandatory tail calls If there's a region of the stack reserved for potential tail call arguments (only the case when we guarantee tail calls will be honoured), this is right next to the incoming stored return address, not necessarily next to the callee-saved area, so combining the two into a single figure leads to incorrect offsets in some edge cases.	2021-08-04 10:02:42 +01:00
Saleem Abdulrasool	17bdc0ff6f	X86: balance the frame prologue and epilogue on Win64 This was broken in `ba1509da7b`. The Win64 frame would not perform the setup of the Swift async context parameter but would tear down the setup in the epilogue resulting in crashes. This ensures that we do the full setup when we do the tear down. Although this is non-conforming to the Win64 calling convention, it corrects the setup and exposes the actual issue that the change introduced: incorrect frame setup. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D104246	2021-06-15 20:13:52 -07:00
Tim Northover	ba1509da7b	Recommit X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86). Recommited with a fix for unwind info when i386 pc-rel thunks are adjacent to a prologue.	2021-05-18 15:19:05 +01:00
Mitch Phillips	6791a6b309	Revert "X86: support Swift Async context" This reverts commit `747e5cfb9f`. Reason: New frame layout broke the sanitizer unwinder. Not clear why, but seems like some of the changes aren't always guarded by Swyft checks. See https://reviews.llvm.org/rG747e5cfb9f5d944b47fe014925b0d5dc2fda74d7 for more information.	2021-05-17 12:44:57 -07:00
Tim Northover	747e5cfb9f	X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86).	2021-05-17 11:56:16 +01:00
Simon Pilgrim	9ca2c50b36	[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies (REAPPLIED). NFCI. Reapply rG5ed56a821c06 (after reverted by rG7aa89c4a22fd) - don't take reference from struct that will be erased in X86FrameLowering::eliminateCallFramePseudoInstr	2021-05-15 13:23:28 +01:00
Mitch Phillips	7aa89c4a22	Revert "[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI." This reverts commit `5ed56a821c`. Reason: Broke the MSan buildbots. See Phabricator for more info (https://reviews.llvm.org/rG5ed56a821c0622869739a3ae752eea97a1ee1f48).	2021-05-14 14:30:57 -07:00
Simon Pilgrim	5ed56a821c	[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.	2021-05-14 11:14:18 +01:00
YangKeao	1c268a8ff4	[X86] add dwarf annotation for inline stack probe While probing stack, the stack register is moved without dwarf information, which could cause panic if unwind the backtrace. This commit only add annotation for the inline stack probe case. Dwarf information for the loop case should be done in another patch and need further discussion. Reviewed By: nagisa Differential Revision: https://reviews.llvm.org/D99579	2021-04-01 00:32:50 +03:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to it's name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Wang, Pengfei	a5d9e0c79b	[X86] Fix tile config register spill issue. This is an optimized approach for D94155. Previous code build the model that tile config register is the user of each AMX instruction. There is a problem for the tile config register spill. When across function, the ldtilecfg instruction may be inserted on each AMX instruction which use tile config register. This cause all tile data register clobber. To fix this issue, we remove the model of tile config register. Instead, we analyze the AMX instructions between one call to another. We will insert ldtilecfg after the first call if we find any AMX instructions. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D95136	2021-01-30 12:53:57 +08:00
Reid Kleckner	988a5334ed	[Win64] Ensure all stack frames are 8 byte aligned The unwind info format requires that all adjustments are 8 byte aligned, and the bottom three bits are masked out. Most Win64 calling conventions have 32 bytes of shadow stack space for spilling parameters, and I believe that constructing these fixed stack objects had the side effect of ensuring an alignment of 8. However, the Intel regcall convention does not have this shadow space, so when using that convention, it was possible to make a 4 byte stack frame, which was impossible to describe with unwind info. Fixes pr48867	2021-01-25 10:39:27 -08:00
Luo, Yuanke	64132f541e	Revert "[X86][AMX] Fix tile config register spill issue." This reverts commit `20013d02f3`.	2021-01-21 18:11:43 +08:00
Luo, Yuanke	20013d02f3	[X86][AMX] Fix tile config register spill issue. Previous code build the model that tile config register is the user of each AMX instruction. There is a problem for the tile config register spill. When across function, the ldtilecfg instruction may be inserted on each AMX instruction which use tile config register. This cause all tile data register clobber. To fix this issue, we remove the model of tile config register. We analyze the regmask of call instruction and insert ldtilecfg if there is any tile data register live across the call. Inserting the sttilecfg before the call is unneccessary, because the tile config doesn't change and we can just reload the config. Besides we also need check tile config register interference. Since we don't model the config register we should check interference from the ldtilecfg to each tile data register def. ldtilecfg / \ BB1 BB2 / \ call BB3 / \ %1=tileload %2=tilezero We can start from the instruction of each tile def, and backward to ldtilecfg. If there is any call instruction, and tile data register is not preserved, we should insert ldtilecfg after the call instruction. Differential Revision: https://reviews.llvm.org/D94155	2021-01-21 16:01:50 +08:00
Kazu Hirata	cfeecdf7b6	[llvm] Use llvm::all_of (NFC)	2021-01-06 18:27:36 -08:00
Luo, Yuanke	f80b29878b	[X86] AMX programming model. This patch implements amx programming model that discussed in llvm-dev (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html). Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet. This patch implemeted 7 components. 1. The c interface to end user. 2. The AMX intrinsics in LLVM IR. 3. Transform load/store <256 x i32> to AMX intrinsics or split the type into two <128 x i32>. 4. The Lowering from AMX intrinsics to AMX pseudo instruction. 5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx intruction. 6. The register allocation for tile register. 7. Morph AMX pseudo instruction to AMX real instruction. Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0 Differential Revision: https://reviews.llvm.org/D87981	2020-12-10 17:01:54 +08:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Fangrui Song	96b0b9a5e3	[X86] Enable shrink-wrapping for no-frame-pointer non-nounwind functions on platforms not using compact unwind The current compact unwind scheme does not work when the prologue is not at the start (the instructions before the prologue cannot be described). (Technically this is fixable, but it requires multiple compact unwind descriptors for one function.) rL255175 chose to not perform shrink-wrapping for no-frame-pointer functions not marked as nounwind to work around PR25614. This is overly limited, as platforms not supporting compact unwind (all non-Darwin) does not need the workaround. This patch restricts the limitation to compact unwind platforms. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D89930	2020-11-04 16:51:48 -08:00
Simon Pilgrim	e0cbcf96ce	[X86] Make the X86FrameSortingComparator operator const. NFCI. Fixes a cppcheck remark.	2020-10-31 12:16:49 +00:00
Scott Constable	dc3dba7dbd	[X86] Move findDeadCallerSavedReg() into X86RegisterInfo The findDeadCallerSavedReg() function has utility outside of X86FrameLowering.cpp Differential Revision: https://reviews.llvm.org/D88924	2020-10-07 18:33:48 -07:00
serge-sans-paille	f2c6bfa350	Fix interaction between stack alignment and inline-asm stack clash protection As reported in https://github.com/rust-lang/rust/issues/70143 alignment is not taken into account when doing the probing. Fix that by adjusting the first probe if the stack align is small, or by extending the dynamic probing if the alignment is large. Differential Revision: https://reviews.llvm.org/D84419	2020-10-02 16:51:49 +02:00

1 2 3 4 5 ...

517 Commits