llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	9adc00a9d0	[RISCV] Add a continue to reduce nesting. NFC	2022-07-23 17:36:12 -07:00
Kazu Hirata	1cc7f5bede	Use static_assert instead of assert (NFC) Identified with misc-static-assert.	2022-07-23 09:22:27 -07:00
Craig Topper	add17fc8e4	[RISCV] Combine (select_cc (srl (and X, 1<<C), C), 0, eq/ne, true, false) (srl (and X, 1<<C), C) is the form we receive for testing bit C. An earlier combine removed the setcc so it wasn't there to match when we created the SELECT_CC. This doesn't happen for BR_CC because generic DAG combine rebuilds the setcc if it is used by BRCOND. We can shift X left by XLen-1-C to put the bit to be tested in the MSB, and use a signed compare with 0 to test the MSB.	2022-07-20 22:32:11 -07:00
Craig Topper	7dda6c71b1	[RISCV] Refactor the common combines for SELECT_CC and BR_CC into a helper function. The only difference between the combines were the calls to getNode that include the true/false values for SELECT_CC or the chain and branch target for BR_CC. Wrap the rest of the code into a helper that reads LHS, RHS, and CC and outputs new values and a bool if a new node needs to be created.	2022-07-20 21:18:07 -07:00
Craig Topper	8983db15a3	[RISCV] Optimize (brcond (seteq (and X, 1 << C), 0)) If C > 10, this will require a constant to be materialized for the And. To avoid this, we can shift X left by XLen-1-C bits to put the tested bit in the MSB, then we can do a signed compare with 0 to determine if the MSB is 0 or 1. Thanks to @reames for the suggestion. I've implemented this inside of translateSetCCForBranch which is called when setcc+brcond or setcc+select is converted to br_cc or select_cc during lowering. It doesn't make sense to do this for general setcc since we lack a sgez instruction. I've tested bit 10, 11, 31, 32, 63 and a couple bits between 11 and 31 and between 32 and 63 for both i32 and i64 where applicable. Select has some deficiencies where we receive (and (srl X, C), 1) instead. This doesn't happen for br_cc due to the call to rebuildSetCC in the generic DAGCombiner for brcond. I'll explore improving select in a future patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D130203	2022-07-20 18:40:49 -07:00
ksyx	3198364e6e	[RISCV][Clang] Add support for Zmmul extension This patch implements the recently ratified Zmmul extension, a subextension of M (Integer Multiplication and Division) consisting of only the multiplication part of it. Differential Revision: https://reviews.llvm.org/D103313 Reviewed By: craig.topper, jrtc27, asb	2022-07-18 20:26:08 -04:00
Craig Topper	0b02752899	[RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1) (and X, 0xffffffff) requires 2 shifts in the base ISA. Since we know the result is being used by a compare, we can use a sext_inreg instead of an AND if we also modify C1 to have 33 sign bits instead of 32 leading zeros. This can also improve the generated code for materializing C1. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D129980	2022-07-18 10:54:45 -07:00
Simon Pilgrim	259c36e7c1	[DAG] Add asserts to isDesirableToCommuteWithShift overrides to ensure it's being called from a shift. NFC.	2022-07-18 13:11:24 +01:00
jacquesguan	2b11174079	[RISCV][NFC] Use more ArrayRef in TargetLowering functions. This patch replaces some foreach with ArrayRef, and abstracts some repeated literal arrays into a variable. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125656	2022-07-18 03:33:45 +00:00
Fangrui Song	d955497112	[RISCV] Simplify lowerGlobalAddress. NFC	2022-07-17 15:42:45 -07:00
Craig Topper	decf385c27	[RISCV] Teach targetShrinkDemandedConstant to handle OR and XOR. We were only handling AND before, but SimplifyDemandedBits can also call it for OR and XOR.	2022-07-17 12:36:33 -07:00
Craig Topper	257755530a	[RISCV] Fold (sra (sext_inreg (shl X, C1), i32), C2) -> (sra (shl X, C1+32), C2+32). The former pattern will select as slliw+sraiw while the latter will select as slli+srai. This can enable the slli+srai to be compressed. Differential Revision: https://reviews.llvm.org/D129688	2022-07-13 14:34:17 -07:00
Philip Reames	dde2a7fb6d	[RISCV] Exploit fact that vscale is always power of two to replace urem sequence When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale. vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.) We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, there must be a power-of-two number of blocks. (For everything other than VLEN<=32, but that's already broken.) It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic. Differential Revision: https://reviews.llvm.org/D129609	2022-07-13 10:54:47 -07:00
Craig Topper	c5be6a8308	[RISCV] Use X0 in place of VLMaxSentinel in lowering. I thought I had already fixed all of these, but I guess I missed one.	2022-07-11 23:29:04 -07:00
Craig Topper	c3c17b1695	[RISCV] Use MVT for the argument to getMaskTypeFor. NFC Only one caller didn't already have an MVT and that was easy to fix. Since the return type is MVT and it uses MVT::getVectorVT, taking an MVT as input makes the most sense.	2022-07-11 15:14:44 -07:00
Craig Topper	1a2bd44b77	[RISCV] Make shouldConvertConstantLoadToIntImm return true unless enableUnalignedScalarMem is true. This restores the old behavior before D129402 when enableUnalignedScalarMem is false. This fixes a regression spotted by @asb. To fix this correctly, we need to consider alignment of the load we'd be replacing, but that's not possible in the current interface.	2022-07-11 09:40:08 -07:00
LiaoChunyu	3f68f0f816	[RISCV] Optimize 2x SELECT for floating-point types Including the following opcode: Select_FPR16_Using_CC_GPR Select_FPR32_Using_CC_GPR Select_FPR64_Using_CC_GPR Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D127871	2022-07-11 14:10:27 +08:00
Craig Topper	35ec8a423d	[RISCV] Teach shouldConvertConstantLoadToIntImm that constant materialization can use constant pools. I think it only makes sense to return true here if we aren't going to turn around and create a constant pool for the immediate. I left out the check for useConstantPoolForLargeInts() thinking that even if you don't want the compiler to create a constant pool you might still want to avoid materializing an integer that is already available in a global variable. Test file was copied from AArch64/ARM and has not been committed yet. Will post separate review for that. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D129402	2022-07-10 14:10:17 -07:00
Lian Wang	9cfb28d672	[RISCV] Change VECTOR_SPLICE mask operation from expand to promote Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D128717	2022-07-08 06:20:22 +00:00
Diego Caballero	bf1758c3dc	Revert "[RISCV] Optimize 2x SELECT for floating-point types" This reverts commit `1178992c72`.	2022-07-07 22:54:00 +00:00
Craig Topper	51d672946e	[RISCV] Fold (sra (add (shl X, 32), C1), 32 - C) -> (shl (sext_inreg (add X, C1), C) Similar for a subtract with a constant left hand side. (sra (add (shl X, 32), C1<<32), 32) is the canonical IR from InstCombine for (sext (add (trunc X to i32), 32) to i32). For RISCV, we should lower this as addiw which means turning it into (sext_inreg (add X, C1)). There is an existing DAG combine to convert back to (sext (add (trunc X to i32), 32) to i32), but it requires isTruncateFree to return true and for i32 to be a legal type as it uses sign_extend and truncate nodes. So that doesn't work for RISCV. If the outer sra happens to be used by a shl by constant, it will be folded and the shift amount of the sra will be changed before we can do our own DAG combine. This requires us to match the more general pattern and restore the shl. I had wanted to do this as a separate (add (shl X, 32), C1<<32) -> (shl (add X, C1), 32) combine, but that hit an infinite loop for some values of C1. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D128869	2022-06-30 09:01:24 -07:00
Craig Topper	9ace5af049	[RISCV] DAG combine (sra (shl X, 32), 32 - C) -> (shl (sext_inreg X, i32), C). The sext_inreg can often be folded into an earlier instruction by using a W instruction. The sext_inreg also works better with our ABI. This is one of the steps to improving the generated code for this https://godbolt.org/z/hssn6sPco Reviewed By: asb Differential Revision: https://reviews.llvm.org/D128843	2022-06-30 09:01:24 -07:00
Philip Reames	860c62f53c	[RISCV] Refine known bits for READ_VLENB This implements known bits for READ_VLENB using any information known about minimum and maximum VLEN. There's an additional assumption that VLEN is a power of two. The motivation here is mostly to remove the last use of getMinVLen, but while I was here, I decided to also fix the bug for VLEN < 128 and handle max from command line generically too. Differential Revision: https://reviews.llvm.org/D128758	2022-06-28 15:42:14 -07:00
Lian Wang	96ab083622	[RISCV] Support VECTOR_REVERSE mask operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D128627	2022-06-28 07:48:51 +00:00
LiaoChunyu	1178992c72	[RISCV] Optimize 2x SELECT for floating-point types Including the following opcode: Select_FPR16_Using_CC_GPR Select_FPR32_Using_CC_GPR Select_FPR64_Using_CC_GPR Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D127871	2022-06-28 12:02:05 +08:00
Craig Topper	ea1b861278	[RISCV] Fix misleading formatting and remove a dead getNode call. NFC	2022-06-27 18:49:57 -07:00
Philip Reames	0533b6e2f6	[RISCV] Remove a use of getMinVLen in favor of getRealMinVLen The latter is possibly greater than the former, and thus the assert was overly strong when a wider VLEN was set at the command line.	2022-06-27 12:52:24 -07:00
Philip Reames	a0443dd47c	[RISCV] Simplify 16 bit index handling in lowerVECTOR_REVERSE [nfc] getRealMaxVLen returns an upper bound on the value of VLEN. We can use this upper bound (which unless explicitly set at command line is going to result in a e8 MaxVLMax of much greater than 256) instead of explicitly handling the unknown case separately from the bounded by number greater than 256 case. Note as well that this code already implicitly depends on a capped value for VLEN. If infinite VLEN were possible, then 16 bit indices wouldn't be enough.	2022-06-24 13:08:39 -07:00
Philip Reames	f1e1c3ce77	[RISCV] Replace two calls to getMinRVVVectorSizeInBits in fixed length lowering [nfc] Both of these are only reached if useRVVForFixedLengthVectors is true. Given that, we know that getRealMinVLen() == getMinRVVVectorSizeInBits().	2022-06-24 13:00:57 -07:00
Craig Topper	c579ab53bd	[RISCV] Move vfma_vl+fneg_vl matching to DAG combine. This patch adds 3 new _VL RISCVISD opcodes to represent VFMA_VL with different portions negated. It also adds a DAG combine to peek through FNEG_VL to create these new opcodes. This is modeled after similar code from X86. This makes the isel patterns more regular and reduces the size of the isel table by ~37K. The test changes look like regressions, but they point to a bug that was already there. We aren't able to commute a masked FMA instruction to improve register allocation because we always use a mask undisturbed policy. Prior to this patch we matched two multiply operands in a different order and hid this issue for these test cases, but a different test still could have encountered it. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D128310	2022-06-24 00:00:37 -07:00
Craig Topper	8b10ffabae	[RISCV] Disable <vscale x 1 x *> types with Zve32x or Zve32f. According to the vector spec, mf8 is not supported for i8 if ELEN is 32. Similarly mf4 is not supported for i16/f16 or mf2 for i32/f32. Since RVVBitsPerBlock is 64 and LMUL is calculated as ((MinNumElements * ElementSize) / RVVBitsPerBlock) this means we need to disable any type with MinNumElements==1. For generic IR, these types will now be widened in type legalization. For RVV intrinsics, we'll probably hit a fatal error somewhere. I plan to work on disabling the intrinsics in the riscv_vector.h header. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D128286	2022-06-23 08:49:18 -07:00
Craig Topper	f912d21e67	[RISCV] Add RISCVISD opcodes for the rest of getAddr. This adds RISCVISD opcodes for LA, LA_TLS_IE, and LA_TLS_GD to remove creation of MachineSDNodes from getAddr. This makes the code consistent with the previous patches that added RISCVISD::HI, ADD_LO, LLA, and TPREL_ADD. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D128325	2022-06-22 09:21:07 -07:00
Craig Topper	0efbf5bfbb	[RISCV] Move the passthru operand for RISCVISD::VRGATHER*_VL nodes. NFC Put it before the VL instead of as the first operand. I want to add passthru to more operands, but the commutable ones like VADD_VL require the commutable operands to be operand 0 and 1. So we can't have the passthru as operand 0 for those.	2022-06-21 14:01:02 -07:00
Craig Topper	e01353f816	[RISCV] Add RISCVISD opcode for PseudoAddTPRel. Use it along with RISCVISD::HI and ADD_LO to avoid emitting MachineSDNodes during lowering.	2022-06-20 20:56:52 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Craig Topper	16d3a82de5	[RISCV] Add merge operand to RISCVISD::VRGATHER_VL nodes. Use it in place of VSELECT_VL+VRGATHER_VL. This simplifies the isel patterns. Overall, I think trying to match select+op to create masked instructions in isel doesn't scale. We either need to do it in DAG combine, pre-isel peephole, or post-isel peephole. I don't yet know which is the right answer, but for this case it seemed best to be able to request the masked form directly from lowering. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D128023	2022-06-20 18:58:24 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Craig Topper	545a71c0d6	[RISCV] Pre-promote v1i1/v2i1/v4i1->i1/i2/i4 bitcasts before type legalization Type legalization will convert the bitcast into a vector store and scalar load. Instead this patch widens the vector to v8i1 with undef, and bitcasts it to i8. v8i1->i8 has custom handling for type legalization already to bitcast to a v1i8 vector and use an extract_element. The code here was lifted from X86's avx512 support. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D128099	2022-06-18 11:06:45 -07:00
Craig Topper	cbf6737cc4	[RISCV] Use RVVBitsPerBlock instead of hardcoding multiples of 64. NFC	2022-06-17 14:10:39 -07:00
Craig Topper	9d7b01dc95	[RISCV] Implement RISCVTargetLowering::getTargetConstantFromLoad. This allows computeKnownBits to see the constant being loaded. This recovers the rv64zbp test case changes from D127520. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127679	2022-06-16 15:11:18 -07:00
Craig Topper	5afdceb82b	[RISCV] Add RISCVISD opcode for PseudoLLA. Rather than emitting a MachineSDNode from lowering. Let isel match it. This is consistent with the RISCVISD::HI and ADD_LO nodes that were also added. Having them both the same will make D127679 consistent. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127714	2022-06-16 15:11:03 -07:00
Craig Topper	4191de262f	[RISCV] Don't emit LUI/ADDI MachineSDNodes from getAddr Instead add RISCVISD opcodes that will be selected to LUI/ADDI during isel. I'm looking into maybe moving doPeepholeLoadStoreADDI into isel. Having the ADDI as a RISCVISD node will make it visible to isel. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127713	2022-06-16 14:56:07 -07:00
Craig Topper	e4062522d3	[RISCV] Disable matchSplatAsGather for i1 vectors to prevent creating illegal nodes. We were incorrectly creating a VRGATHER node with i1 vector type. We could support this by promoting the mask to i8 and truncating it, but for now I want to prevent the crash. Fixes PR56007. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127681	2022-06-13 13:41:39 -07:00
Craig Topper	cef03e3dcd	[RISCV] Move creation of constant pools from isel to lowering. This simplifies the isel code by removing the manual load creation. It also improves our ability to use 0 strided loads for vector splats. There is an assumption here that Mask and ShiftedMask constants are cheap enough that they don't become constant pool loads so that our isel optimizations involving And still work. I believe those constants are 3 instructions in the worst case. The rv64zbp-intrinsic.ll change is a regression caused by intrinsics being expanded to RISCVISD also occurring during lowering. So the optimizations were only happening during the last DAGCombine, which can't see through the load. I believe we can fix this test by implementing TargetLowering::getTargetConstantFromLoad for RISC-V or by adding the intrinsic to computeKnownBitsForTargetNode to enable earlier DAG combine. Since Zbp is not a ratified extension, I don't view these as blocking this patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127520	2022-06-13 09:07:57 -07:00
Craig Topper	e91051184c	[RISCV] Mark FSIN and other math functions as Expand for scalable vectors. This prevents them from being assumed legal by the cost model. This matches what is done for AArch64 SVE. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D123799	2022-06-10 08:40:07 -07:00
Shao-Ce SUN	862f30a428	[RISCV] Add ISD::EH_DWARF_CFA Based on D24038. LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D126181	2022-06-08 22:03:30 +08:00
Craig Topper	aeb27f133a	[RISCV] Fix i64<->f64 and i32<->f32 bitcasts with VLS vectors enabled. We enable a custom handler to optimize conversions between scalars and fixed vectors. Unfortunately, the custom handler picks up scalar to scalar conversions as well. If the scalar types are both legal, we wouldn't match any of the fixed vector cases and would return SDValue() causing the LegalizeDAG to expand the bitcast through memory. This patch fixes this by checking if it's a scalar to scalar conversion and returns `Op` if both types are legal. Differential Revision: https://reviews.llvm.org/D126739	2022-06-01 08:13:49 -07:00
Craig Topper	b09e54541a	[RISCV] Use template version of SignExtend64 for constant extends. NFC We were inconsistent about which one we used.	2022-05-27 13:11:15 -07:00
Craig Topper	d0f65eaa85	[RISCV] Remove unused variables. NFC	2022-05-27 12:13:45 -07:00
Craig Topper	aaad507546	[RISCV] Return false from isOffsetFoldingLegal instead of reversing the fold in lowering. When lowering GlobalAddressNodes, we were removing a non-zero offset and creating a separate ADD. It already comes out of SelectionDAGBuilder with a separate ADD. The ADD was being removed by DAGCombiner. This patch disables the DAG combine so we don't have to reverse it. Test changes all look to be instruction order changes. Probably due to different DAG node ordering. Differential Revision: https://reviews.llvm.org/D126558	2022-05-27 11:05:18 -07:00
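
The bit-test rewrite described in commit 8983db15a3 (and reused for select_cc in add17fc8e4) can be sanity-checked with a small model. The sketch below is not LLVM code. It is a Python model of 64-bit register arithmetic, and the helper names (`to_signed`, `bit_clear_via_and`, `bit_clear_via_shift`) as well as the choice of XLEN = 64 are assumptions made purely for illustration.

```python
# Model of the rewrite: (X & (1 << C)) == 0  <=>  signed(X << (XLEN-1-C)) >= 0
# Assumption: XLEN = 64 (an RV64-style register width); names are hypothetical.
XLEN = 64
MASK = (1 << XLEN) - 1


def to_signed(value):
    """Interpret the low XLEN bits of value as a two's-complement integer."""
    value &= MASK
    return value - (1 << XLEN) if value >> (XLEN - 1) else value


def bit_clear_via_and(x, c):
    """Original form: mask out bit c and compare with zero."""
    return (x & (1 << c)) == 0


def bit_clear_via_shift(x, c):
    """Rewritten form: move bit c into the MSB, then do a signed compare with 0."""
    shifted = (x << (XLEN - 1 - c)) & MASK  # wrap like an XLEN-bit register
    return to_signed(shifted) >= 0
```

In this model the AND form needs the constant `1 << C`, while the shift form only needs a shift amount, which is the motivation the commit message gives for values of C above 10.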


720 Commits
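
Commit dde2a7fb6d relies on the fact that an unsigned remainder by a power of two can be computed with a bitwise AND. A minimal sketch of that arithmetic follows, assuming the usual "trip count minus remainder" form of a vector trip count. The function names and the `shift` parameter (standing in for the "possibly shifted" vscale) are hypothetical and do not come from LLVM.

```python
# Model of: x urem n  ==  x & (n - 1)  when n is a power of two.
# Assumption: the step is vscale << shift, with vscale itself a power of two.


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def urem_pow2(x, n):
    """Unsigned remainder of x by n, valid only when n is a power of two."""
    assert is_power_of_two(n), "identity only holds for power-of-two divisors"
    return x & (n - 1)


def vector_trip_count(trip_count, vscale, shift):
    """Round trip_count down to a multiple of (vscale << shift)."""
    step = vscale << shift
    return trip_count - urem_pow2(trip_count, step)
```

For example, with `vscale = 2` and `shift = 1` the step is 4, so a trip count of 10 rounds down to 8. The identity breaks for a divisor such as 3, which is why the commit message notes that the same reasoning does not carry over to AArch64 SVE.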