Commit Graph

898 Commits

Author SHA1 Message Date
zhoujing 7b8402802a [VENTUS][RISCV][fix] Fix calling convention 2023-06-25 22:03:04 +08:00
zhoujing a6e8ff959a [VENTUS][RISCV][fix] Add missing flags for building libclc
The flags information are lost by previous merge commit `Merge libclc-vector-support`
2023-06-21 10:59:48 +08:00
zhoujing 615705a6c6 [VENTUS][RISCV][fix] Fix calling convention 2023-06-19 17:21:04 +08:00
zhoujing 513412bb33 [VENTUS][RISCV][fix] Fix building libclc errors 2023-06-16 17:42:22 +08:00
zhoujing 6636793f64 Merge libclc-vector-support 2023-06-16 09:41:08 +08:00
zhoujing e54daab265 [VENTUS][RISCV][fix] Fix function call calling convention 2023-06-15 13:36:12 +08:00
zhoujing 53a932e665 [VENTUS][RISCV][fix] Modify calling convention for non-kernel function arguments based on private memory address
In our previous calling convention design, all non-kernel arguments are passed
by VGPRS or TP stack, but when the arguments point to private memory address
space, the wrong memory access instructions will be generated, because private
memory based address is scalar register
2023-06-14 21:26:53 +08:00
zhoujing e5e7a0047a [VENTUS][RISCV][fix] Fix local memory access error in kernel function 2023-06-12 16:22:45 +08:00
zhoujing 940da111a3 [VENTUS][RISCV][fix] Fix divergent analysis bug for store node 2023-06-12 14:50:55 +08:00
zhoujing faf6a0bcd9 [VENTUS][RISCV][fix] Add initial Tp stack size calculation
Cause there are two stacks in Ventus, we need to seperate TP stack and SP stack,
this commit just add very initial support for TP stack size calculation
2023-06-11 12:18:39 +08:00
zhoujing 033505de1d [VENTUS][RISCV][fix] Modify calling convention 2023-06-05 17:11:25 +08:00
zhoujing 967cb725c8 [VENTUS][RISCV][feat] Set ventus kernel for OpenCL kernel functions 2023-06-05 13:10:35 +08:00
zhoujingya ad23baaa51 [VENTUS][RISCV][feat] Add more floating point instructions pattern
Signed-off-by: zhoujing <jing.zhou@terapines.com>
2023-05-25 14:48:30 +08:00
zhoujingya 9d9283fa7b [VENTUS][RISCV][fix] Fix ventus abi and calling convention
Kernel functions use sp as GPRs spill stack slots
Non-kernel functions use tp as VGPRs spill stack slots
2023-04-20 15:27:52 +08:00
zhoujingya f28e6c5e38 [VENTUS][RISCV][feat] Add vararg backend support in ventus
We adjust the stack growing direction early months for OpenCL, in order to be
compatible with current architecture, we need to do some modification to
support vararg
2023-04-18 10:03:53 +08:00
Aries 438f1c92c4 Fix some build warnings 2023-01-19 09:45:27 +08:00
zhoujing 7e701d4ba1 Add support for float point trunc instruction match 2023-01-09 18:06:39 +08:00
Aries 0b43b70327 Fix bug in addressing space mapping 2023-01-03 10:45:58 +08:00
zhoujing 1fab7b80f3 Legalize operation for SETCC 2022-12-29 17:13:49 +08:00
Aries 17adb707e6 Fix bug in kernel arg memory offset calculation 2022-12-29 11:53:29 +08:00
Aries 424ea45e4f Update Ventus GPGPU ABI: X4 as stack pointer, V0-V31 as arguments registers etc 2022-12-28 13:11:22 +08:00
Aries e8368c07e1 Fix kernel argument lowering alignment bug. 2022-12-27 17:00:46 +08:00
Aries 3a9c32a024 Add initial vector support(calling convention fix). 2022-12-27 16:35:12 +08:00
Aries da5006ca8d Add support to lowering BITCAST and Constant Pool for zfinx etc 2022-12-27 13:39:46 +08:00
Aries 9be2c54215 Add initial vGPR + sGPRF32 (zfinx) support 2022-12-27 12:00:30 +08:00
Aries 7d7ef235fd Support f32 return type in VGPR 2022-12-27 11:21:08 +08:00
Aries 2f946d86ad Fix basicblock insertion ordering for ISD::SELECT lowering. 2022-12-22 17:47:03 +08:00
Aries cb6f30fbd7 Add initial support to lower ISD::SELECT into branch instructions in divergent execution path. 2022-12-22 17:17:02 +08:00
Aries b9da010dd5 [NFC] Refactor messy switch...case 2022-12-22 14:50:13 +08:00
Aries beb878e97c Add OpenCL addressing space mapping to RISCVAS.
Add kernel argument lowering.
Clean up a few unrelated RVV code.
2022-12-20 17:08:08 +08:00
Aries dee3135130 Drafting divergent related code, not working yet. 2022-12-19 18:11:34 +08:00
Aries c6b68cbedb Support move between vGPR and sGPR.
Fix a few bugs in calling convention related lowering functions.
2022-12-19 14:21:26 +08:00
Aries 4e0cd22745 Add vALU conditional branch instructions 2022-12-19 13:09:00 +08:00
Aries 894931f522 More clean up and fix build error. 2022-12-19 10:10:28 +08:00
Aries 521e83631d Roughly cleaned RVV instruction selection. 2022-12-19 09:40:05 +08:00
Aries 35633e31e3 In the middle of removing RVV code. 2022-12-16 18:04:43 +08:00
Aries f1eff7fcfe Very very early step to remove RVV features from code base. 2022-12-16 17:33:54 +08:00
Kazu Hirata 3c09ed006a [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:12:44 -08:00
Fangrui Song b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Kazu Hirata 20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek 864aaa21b4 TargetLowering: convert Optional to std::optional 2022-12-01 16:19:10 -08:00
Philip Reames 7d82c99403 [RISCV][TTI] Account for constant materialization cost when costing arithmetic operations
At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fallback to constant pool loads.

We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant.

We need better modeling of which constants are expensive and not, but the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation which stores can not.

Differential Revision: https://reviews.llvm.org/D138941
2022-11-30 07:20:51 -08:00
Philip Reames b25672ba82 [RISCV] Separate out helper for checking if vector splat supported for operand [nfc] 2022-11-29 11:05:46 -08:00
Kazu Hirata 2f61c6c639 [RISCV] Use std::optional in RISCVISelLowering.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 23:04:58 -08:00
LiaoChunyu aa14f002d5 [RISCV] Branchless lowering for (select (x < 0), TrueConstant, FalseConstant) and (select (x >= 0), TrueConstant, FalseConstant)
This patch reduces the number of unpredictable branches

(select (x < 0), y, z)  -> x >> (XLEN - 1) & (y - z) + z
(select (x >= 0), y, z) -> x >> (XLEN - 1) & (z - y) + y

Reviewed By: craig.topper, reames

Differential Revision: https://reviews.llvm.org/D137949
2022-11-25 20:18:30 +08:00
wangpc 241accea2a [RISCV] Lower unmasked zero-stride vector load to (scalar load + splat)
So we have the opportunity to fold splat into .vx instruction as what
D101138 has done. If failed, we can select zero-stride vector load
again.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138101
2022-11-24 11:09:45 +08:00
WuXinlong 219417b2c6 [RISCV] Add CodeGen support and MC testcase of RISCV Zca Extension
This patch add the support of RISCV Zca ext

`Zca` is a subset of C extension instructions that are compatible with the Zc extension.

So this patch implements Zca code generation with reference to the C extension and sets the 2-byte alignment for the Zca extension, just like C extension does.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130483
2022-11-22 17:22:26 +08:00
Han-Kuan Chen 7e6dbfcd9d [RISCV] Make lowerVECTOR_SHUFFLEAsVSlidedown follow source until not EXTRACT_SUBVECTOR.
Current lowerVECTOR_SHUFFLEAsVSlidedown only seeks whether input are
EXTRACT_SUBVECTOR and their source are same. The commit will make the
function seek input and their source until they are not
EXTRACT_SUBVECTOR.

Differential Revision: https://reviews.llvm.org/D138025
2022-11-17 22:32:53 -08:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Craig Topper 7e15ea102f [RISCV] Add a DAG combine to pre-promote (i1 (truncate (i32 (srl X, Y)))) with Zbs on RV64.
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.

This is similar to what we do for (i32 (and (srl X, Y), 1)).
2022-11-16 19:07:33 -08:00