llvm-project

Author	SHA1	Message	Date
zhoujingya	650f1199e9	[VENTUS][fix] Refactor the tp&sp stack size and frame object's offset calculation In the traditional LLVM framework, a function's stack has a single kind of stack pointer; for RISCV, these are the sp and s0 registers. In Ventus, because per-thread private memory exists, we designed a new per-thread stack, which also uses the MachineFrameInfo APIs, so the frame objects' offset calculation is wrong if we follow the official RISCV approach. This patch assigns an ID to each kind of frame object and then calculates stack offsets only among objects with the same ID, ignoring the other stack objects	2023-12-11 17:43:04 +08:00
zhoujingya	f9a20984b5	[VENTUS][fix] Comment out illegal fmv.w.x instruction and change vmv instructions' format https://github.com/THU-DSP-LAB/llvm-project/issues/30	2023-10-09 14:04:55 +08:00
zhoujingya	dc3ffe70cf	[VENTUS][fix] Fix getStackSize calculation bugs	2023-09-14 15:52:08 +08:00
zhoujing	6491bdfb02	Revert "[VENTUS][fix] No need to spill/restore callee saved registers for kernel function" This reverts commit `85df9000bb`.	2023-08-27 15:47:26 +08:00
zhoujing	85df9000bb	[VENTUS][fix] No need to spill/restore callee saved registers for kernel function	2023-08-21 14:44:45 +08:00
zhoujing	826c4cb599	Revert "[VENTUS][fix] Insert barrier instruction for function calling" This reverts commit `7e4b7a6ae1`.	2023-08-16 14:50:42 +08:00
zhoujing	50b23dc21a	[VENTUS][fix] Deprecate vmv.s.x and use vmv.v.x instead As required, the vmv.s.x instruction may be deprecated later	2023-08-01 13:25:24 +08:00
zhoujing	7e4b7a6ae1	[VENTUS][fix] Insert barrier instruction for function calling Stack space is shared between different warps. If two warps are executing different functions, their accesses to the return address will conflict, so the faster warp cannot find its return address. We therefore add a barrier instruction after the lw and before the ret to ensure that the warps have the same scope of the sp pointer	2023-07-31 11:01:14 +08:00
zhoujing	98474922a4	[VENTUS][fix] Add LDS/PDS calculation Later need to fix the local data declaration calculation	2023-07-27 11:58:31 +08:00
zhoujing	623ca8b4ba	[VENTUS][RISCV][fix] Fix stack size calculation bug	2023-07-21 18:02:33 +08:00
zhoujing	24dbcd9b0e	[VENTUS][RISCV][NFC] Define interfaces for VENTUS Our previous design has two stacks, TP&SP, but we only need to store ra to sp and restore it from sp, which makes it inconvenient to calculate stack offsets for the two stack frames. Here we just define the interfaces without really implementing them; if needed, we need to remove the callee saved registers and modify the related overridden functions	2023-06-28 11:19:29 +08:00
zhoujing	7b8402802a	[VENTUS][RISCV][fix] Fix calling convention	2023-06-25 22:03:04 +08:00
zhoujing	f494e20d44	[VENTUS][RISCV][fix] Fix private memory access instructions' codegen errors We changed the private memory access encoding in commit `6da666856b`; this commit fixes the codegen bugs introduced by that commit	2023-06-25 10:59:21 +08:00
Aries	e6b7935c89	[Ventus] ABI and stack adjustment. Remove all SGPRs (except ra) from callee saved register set, as they are mainly used in kernel functions. Unify the stack to use TP only; we will emit customized instructions for SP use which should not be considered as stack according to LLVM codegen infrastructure (only 1 stack is allowed). By unifying the stack to TP based, it is much easier for the backend codegen.	2023-06-21 13:08:02 +08:00
zhoujing	513412bb33	[VENTUS][RISCV][fix] Fix building libclc errors	2023-06-16 17:42:22 +08:00
zhoujing	6636793f64	Merge libclc-vector-support	2023-06-16 09:41:08 +08:00
zhoujing	c30c837caa	[VENTUS][RISCV][fix] Fix SP stack size calculation error	2023-06-15 18:12:34 +08:00
zhoujing	c60810b243	[VENTUS][RISCV][feat] Modify SP stack size calculation Add initial SP stack size calculation support; many issues still remain	2023-06-12 13:27:55 +08:00
zhoujing	faf6a0bcd9	[VENTUS][RISCV][fix] Add initial TP stack size calculation Because there are two stacks in Ventus, we need to separate the TP stack and the SP stack; this commit adds only very initial support for TP stack size calculation	2023-06-11 12:18:39 +08:00
zhoujing	033505de1d	[VENTUS][RISCV][fix] Modify calling convention	2023-06-05 17:11:25 +08:00
zhoujing	967cb725c8	[VENTUS][RISCV][feat] Set ventus kernel for OpenCL kernel functions	2023-06-05 13:10:35 +08:00
zhoujingya	9d9283fa7b	[VENTUS][RISCV][fix] Fix ventus abi and calling convention Kernel functions use sp as GPRs spill stack slots; non-kernel functions use tp as VGPRs spill stack slots	2023-04-20 15:27:52 +08:00
zhoujingya	f28e6c5e38	[VENTUS][RISCV][feat] Add vararg backend support in ventus We changed the stack growth direction for OpenCL some months ago; to stay compatible with the current architecture, vararg support needs some modifications	2023-04-18 10:03:53 +08:00
Aries	438f1c92c4	Fix some build warnings	2023-01-19 09:45:27 +08:00
Aries	a173844ae5	Grow Ventus GPGPU stack upwards instead of downwards	2023-01-04 10:29:53 +08:00
Aries	9925e4e511	Define callee saved registers for Ventus GPGPU. Initially implemented 2 stacks support for sGPR spill/restore stack and per-thread stack, but stack size calculation is computed as a sum of 2 stacks (this works but wastes a lot of space). Now TP register is used as per-thread stack pointer, SP register is used for sGPR spill/restore. Clean up RVV related stack frame code etc.	2022-12-28 16:37:38 +08:00
Aries	424ea45e4f	Update Ventus GPGPU ABI: X4 as stack pointer, V0-V31 as arguments registers etc	2022-12-28 13:11:22 +08:00
Aries	228be521e5	Add initial different stack frame support for sALU and vALU. FIXME: The stack pointer RISCV::X4 for vALU is not yet correctly used, but related infrastructure should work(MFI.isEntryFunction() is used to check RISCV::X2 or RISCV::X4 to be used as stack pointer).	2022-12-27 18:28:51 +08:00
Aries	8c531048c2	Initially add vector load/store instruction and related codegen	2022-12-21 16:27:39 +08:00
Philip Reames	14d993435b	[RISCV] Inline RISCVFrameLowering::adjustReg out of existence [nfc] This was requested by a reviewer in D138926.	2022-11-30 11:07:45 -08:00
Philip Reames	c0692c08ee	[RISCV] Adjust code to fallthrough to a single adjustReg callsite [nfc] Note that we have to now pass alignment to that callsite because the wrapper previously did that for us for fixed offsets.	2022-11-30 10:45:55 -08:00
Philip Reames	1f04ac54f9	[RISCV] Merge two versions of adjustReg on TRI [nfc] After `ac1ec9e`, the version with the StackOffset param has a strict superset of behavior. As a result, we can switch callers to use it, and then inline the other version into the now-single caller.	2022-11-30 10:12:40 -08:00
Philip Reames	80fcf992b7	[RISCV] Reuse and generalize adjustReg from another spot in frame lowering [nfc] Differential Revision: https://reviews.llvm.org/D138926	2022-11-30 09:43:14 -08:00
Philip Reames	ac1ec9e290	[RISCV] Share code for fixed offsets adjustRegs (thus materializing fewer constants) This reuses the existing optimized implementation of adjustReg, and commons up code. This has the effect of enabling two code changes for the new caller. First, we enable the "split andi" lowering (with no alignment requirement), and second we use a sub with smaller constant in register instead of a add with negative constant in register. Differential Revision: https://reviews.llvm.org/D132839	2022-11-30 09:28:29 -08:00
Philip Reames	1a5be5265c	[RISCV] Move implementation of adjustReg from frame lowering to register info [nfc] Putting both variants of this function in the same place, in advance of code reuse. Note that I tweaked the API slightly in advance of additional callers without the alignment requirement. Some of the existing callers may also be okay with weaker alignment requirements, but that should be its own set of changes.	2022-11-28 12:41:00 -08:00
Philip Reames	06e2b44c46	[RISCV] Optimize scalable frame setup when VLEN is precisely known If we know the exact value of VLEN, the frame offset adjustment for scalable stack slots becomes a fixed constant. This avoids the need to read vlenb, and may allow the offset to be folded into the immediate field of an add/sub. We could go further here, and fold the offset into a single larger frame adjustment - instead of having a separate scalable adjustment step - but that requires a bit more code reorganization. I may (or may not) return to that in a future patch. Differential Revision: https://reviews.llvm.org/D137593	2022-11-18 15:30:39 -08:00
Craig Topper	2c82080f09	[MachineFrameInfo][RISCV] Call ensureStackAlignment for objects created with scalable vector stack id. This is an alternative to fix PR57939 for RISC-V. It definitely can be argued that the stack temporaries for RISC-V are being created with an unnecessarily large alignment. But ignoring the alignment in MachineFrameInfo also seems bad. Looking at the test updates that go with the current ID==0 check, it was intending to exclude things like the NoAlloc stackid. So I'm not sure if scalable vectors are intentionally being excluded. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135913	2022-10-20 14:05:46 -07:00
Craig Topper	31bca38ad1	[RISCV] Pass the destination register to getVLENFactoredAmount instead of returning it. NFC This is a refactor for another patch. For now we move the vreg creation to the caller. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D135008	2022-10-03 10:59:35 -07:00
ZHU Zijia	9c85382ade	[RISCV] Handle register spill in branch relaxation In branch relaxation pass, `j`'s with offset over 1MiB will be relaxed to `jump` pseudo-instructions. This patch allocates a stack slot for functions with a size greater than 1MiB. If the register scavenger cannot find a scratch register for `jump`, spill a register to the slot before the jump and restore it after the jump. For example, the code `.mbb: foo; j .dest_bb; bar; bar; bar; .dest_bb: baz` will be relaxed to `.mbb: foo; sd s11, 0(sp); jump .restore_bb, s11; bar; bar; bar; j .dest_bb; .restore_bb: ld s11, 0(sp); .dest_bb: baz`. Depends on D129999. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D130560	2022-08-24 13:27:56 +08:00
Kazu Hirata	f5a68feab3	Use llvm::none_of (NFC)	2022-08-14 16:25:39 -07:00
Alex Bradbury	5ad59c9e59	[RISCV][NFCI] Set TransientStackAlignment and rely on it rather than RVV-specific logic on RVV-less functions * TargetFrameLowering has a TransientStackAlignment field that "returns the number of bytes to which the stack pointer must be aligned at all times, even between calls". * As explained in the [RISC-V calling convention](https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc), the stack pointer must remain fully aligned throughout execution for compliant code. This is important for embedded targets that might avoid realigning the stack pointer for interrupt service routines. Systems running full OSes may always realign the stack anyway. * TransientStackAlignment is used in estimateStackSize in MachineFrameInfo and in PEI::calculateFrameObjectOffsets. * estimateStackSize is only used in the RISC-V backend for scavenging slots. It may be possible to craft a function where the difference is observable, but it wouldn't be a meaningful test. * calculateFrameObjectOffsets makes use of TransientStackAlignment, but then sets the stack alignment to the max of that alignment and MaxAlign, which is unconditionally set to 16 in RISCVFrameLowering::processFunctionBeforeFrameFinalized. * I've changed this logic to only set MaxAlign if there are RVV frame objects. There should be no functional change here for either RVV targets (MaxAlign is set as before) or non-RVV targets (TransientStackAlign is now 16 anyway). Differential Revision: https://reviews.llvm.org/D130068	2022-08-02 09:46:06 +01:00
Fraser Cormack	b336cf856e	[RISCV] Add early-exit to RVV stack computation. NFCI. This patch was split off from D126465, where an early-exit is necessary as it checks the VLEN and that asserts that V instructions are present. Since this makes logical sense on its own, I think it's worth landing regardless of D126465. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D129617	2022-07-13 08:50:08 +01:00
luxufan	0f45eaf0da	[RISCV] Add a scavenge spill slot when using ADDI to compute scalable stack offset Computing a scalable offset needs up to two scratch registers. We add scavenge spill slots according to the result of `RISCV::isRVVSpill` and `RVVStackSize`. Since ADDI is not included in `RISCV::isRVVSpill`, PEI doesn't add scavenge spill slots for scratch registers when using ADDI to get scalable stack offsets. The ADDI instruction has a destination register which can be used as a scratch register. So one scavenge spill slot is sufficient for computing scalable stack offsets. Differential Revision: https://reviews.llvm.org/D128188	2022-07-03 20:18:13 +08:00
Yeting Kuo	5744b9cb79	[RISCV] Restore "Enable shrink wrap by default" This reverts commit `7af3d4ab3d`. RISC-V reverted the shrink wrap patch for bug 53662. Since the bug is fixed by D123679, this commit re-enables it. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D128965	2022-07-02 11:13:13 +08:00
Craig Topper	d63b66840f	[RISCV] Move some methods out of RISCVInstrInfo and into RISCV namespace. These methods don't access any state from RISCVInstrInfo. Make them free functions in the RISCV namespace. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D127583	2022-06-12 10:47:21 -07:00
Kito Cheng	4b11f90903	[RISCV] Fix missing stack pointer recover In order to make sure the stack pointer is correct throughout the EH region, we also need to restore the stack pointer from the frame pointer if we don't reserve stack space within the prologue/epilogue for outgoing variables. Normally, checking whether a variable sized object is present is enough, but we also don't reserve that space in the prologue/epilogue when there are vector objects on the stack. Example to show what happened: inside a `try` block, (1) sp is adjusted for outgoing args, so sp changes; (2) `func_call` raises an exception, so the sp restore after the call never runs; (3) control arrives in the `catch` block; (4) the function prepares to return and restores the return address from the stack, but sp is wrong; (5) the return fails. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D126861	2022-06-09 23:38:50 +08:00
Craig Topper	1b2de79ff4	[RISCV] Use two ADDIs to do some stack pointer adjustments. If the adjustment doesn't fit in 12 bits, try to break it into two 12 bit values before falling back to movImm+add/sub. This is based on a similar idea from isel. Reviewed By: luismarques, reames Differential Revision: https://reviews.llvm.org/D126392	2022-05-31 10:25:28 -07:00
Philip Reames	d58cc0839e	[RISCV] reorganize getFrameIndexReference to reduce code duplication [nfc] This change reorganizes the majority of frame index resolution into a two-step process. Step 1 - Select which base register we're going to use. Step 2 - Compute the offset from that base register. The key point is that this allows us to share the step 2 logic for the SP case. This reduces the code duplication, and (I think) makes the code much easier to follow. I also went ahead and added assertions into step 2 to catch errors where we select an illegal base pointer. In general, we can't index from a base register to a stack location if that requires crossing a variable and unknown region. In practice, we have two such cases: dynamic stack realign and var sized objects. Note that crossing the scalable region is fine since while variable, it's a known variability which can be expressed in the offset. Differential Revision: https://reviews.llvm.org/D126403	2022-05-26 09:44:58 -07:00
Philip Reames	dd336b6891	[RISCV] Restructure comment and add clarifying assert to getFrameIndexReference [NFC] Differential Revision: https://reviews.llvm.org/D126088	2022-05-25 07:59:27 -07:00
Fraser Cormack	fd93736657	[RISCV] Replace untested code with assert We found untested code where negative frame indices were ostensibly handled despite it being in a block guarded by !MFI.isFixedObjectIndex. While the implementation of MachineFrameInfo::isFixedObjectIndex suggests this is possible (i.e., if a frame index was more negative - less than the number of fixed objects), I couldn't find any test in tree -- for any target -- where a negative frame index wasn't also a fixed object offset. I couldn't find a way of creating such an object with the public MachineFrameInfo creation APIs. Even MachineFrameInfo::getObjectIndexBegin starts counting at the negative number of fixed objects, so such frame indices wouldn't be covered by loops using the provided begin/end methods. Given all this, an assert that any object encountered in the block is non-negative seems reasonable. Reviewed By: StephenFan, kito-cheng Differential Revision: https://reviews.llvm.org/D126278	2022-05-25 05:03:53 +01:00
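
Note on commit 1b2de79ff4 ("Use two ADDIs to do some stack pointer adjustments"): the message describes breaking an adjustment that does not fit in 12 bits into two 12-bit values before falling back to movImm+add/sub. The following is only a rough Python sketch of that idea, not the LLVM C++ code. The function name `split_stack_adjustment`, the 16-byte default alignment, and the rule that the first positive step is the largest aligned 12-bit immediate are all assumptions made for illustration.

```python
def split_stack_adjustment(val, align=16):
    """Return the ADDI immediates that sum to `val`, or None for the fallback.

    Hypothetical helper for illustration only; `align` is an assumed
    stack alignment in bytes.
    """
    # A signed 12-bit immediate covers [-2048, 2047]: one ADDI suffices.
    if -2048 <= val <= 2047:
        return [val]

    # Assumption: the intermediate stack pointer should stay aligned, so the
    # largest positive first step is the largest aligned 12-bit immediate.
    max_pos_step = 2048 - align

    # Two ADDIs can reach down to -4095 and up to 2 * max_pos_step.
    if -4096 < val <= 2 * max_pos_step:
        first = -2048 if val < 0 else max_pos_step
        return [first, val - first]

    # Otherwise the constant would be materialized in a register
    # (the movImm+add/sub fallback mentioned in the commit message).
    return None


if __name__ == "__main__":
    for adjustment in (100, 4000, -3000, 5000):
        print(adjustment, split_stack_adjustment(adjustment))
```

Under these assumptions, an adjustment of 4000 bytes is split into steps of 2032 and 1968, both of which fit in a signed 12-bit immediate, while 5000 falls through to the register-based fallback.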

130 Commits