llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	7cf5581712	Analysis: Update some tests for opaque pointers StackSafetyAnalysis/lifetime.ll had one bitcast removed that may have mattered. The concluded lifetime is longer based on the underlying alloca, instead of the bitcasted pointer so left that as a pointless cast. local.ll memintrin.ll needed some manual fixes	2022-12-02 18:47:43 -05:00
Matt Arsenault	81c163e3e1	StackSafetyAnalysis: Don't use anonymous values in test	2022-12-02 18:47:43 -05:00
Matt Arsenault	a74c5707be	Fix some test files with executable permissions	2022-12-02 17:12:03 -05:00
Bjorn Pettersson	a11faeed44	[test] Switch to use -passes syntax in various test cases	2022-12-01 21:25:59 +01:00
Philip Reames	73eacf94e0	[RISCV] Incorporate LMUL into costs for arithmetic and shuffles This reuses the routine implemented in `0e6f0b7` to implement several existing TODOs. Many of the operations scale linearly with LMUL; this change represents that in the cost model. Differential Revision: https://reviews.llvm.org/D139039	2022-12-01 10:46:27 -08:00
Roman Lebedev	7850ab2112	[NFC] Port an assortment of tests that invoke SROA to new pass manager	2022-12-01 21:17:18 +03:00
Philip Reames	7d82c99403	[RISCV][TTI] Account for constant materialization cost when costing arithmetic operations At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fall back to constant pool loads. We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant. We need better modeling of which constants are expensive and which are not, but for the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation, which stores cannot. Differential Revision: https://reviews.llvm.org/D138941	2022-11-30 07:20:51 -08:00
Paul Robinson	3558da3d89	[Sanitizers] Fix test that never ran anywhere Incorrect REQUIRES clause. Also fixed the incorrect 'opt' line and removed a redundant -mtriple option.	2022-11-30 07:20:27 -08:00
David Green	f2a92db29e	[AArch64] Don't treat SVE scalable extends as free widening instructions The logic in isWideningInstruction handles instructions like uaddw and smull, where 'add(x, zext(y))' or 'mul(sext(x), sext(y))' can be converted to single instructions, making the extends free. This doesn't apply the same to SVE instructions though. https://godbolt.org/z/695d3nhGd (There are instructions like SMULLT/B, but they require top/bottom lane interleaving. That is similar to MVE instructions, which required a special pass to perform the lane interleaving). This patch just bails out of the call to isWideningInstruction if the vector is scalable, getting a more accurate cost. Differential Revision: https://reviews.llvm.org/D138591	2022-11-30 13:09:48 +00:00
ShihPo Hung	0e6f0b7cc3	[RISCV] Add cost model for fixed broadcast shuffle This patch adds basic broadcast shuffle costs in order to enable SLP vectorization. And adds `getLMULCost` to consider reciprocal throughput for different LMUL. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D137276	2022-11-30 04:58:52 -08:00
Philip Reames	3c9d247112	[RISCV] Add test coverage for vector constant materialization costs on arithmetic instructions	2022-11-29 12:00:58 -08:00
Philip Reames	e726c5879a	[RISCV] Add cost model coverage for vector arithmetic	2022-11-29 11:50:52 -08:00
Mateja Marjanovic	595a08847a	[AMDGPU] Add support for new LLVM vector types Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384. Differential Revision: https://reviews.llvm.org/D138205	2022-11-29 17:02:04 +01:00
David Green	57dc4a8cab	[AArch64] Extend testing for widening conditions under SVE. NFC	2022-11-29 15:53:39 +00:00
Slava Zakharin	5bd8175dd7	[AA] A global cannot escape through nocapture/nocallback call. When an internal global is passed to a 'nocallback' call as a 'nocapture' pointer, it cannot escape through this call and be indirectly referenced in this module. So it must not alias with any pointer in the module. This may provide some remedy for Fortran module-private array descriptors that are usually passed by address to some runtime functions (e.g. to allocation/deallocation functions). In general, a good aliasing information derived from Fortran language rules would solve the same issue, but I think this change may be beneficial as-is (given that nocapture, nocallback attributes are properly set). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D138336	2022-11-28 12:50:31 -08:00
Philip Reames	db07d79ab0	[RISCV] Add cost model for integer and float vector arithmetic instructions. This patch implements getArithmeticInstrCost for RISCV, supports cost model for integer and float vector arithmetic instructions. Differential Revision: https://reviews.llvm.org/D133552 (Original patch by jacquesguan. Subset by me with todos added.)	2022-11-28 09:04:38 -08:00
Matt Arsenault	8c58a9ace0	DivergenceAnalysis: Convert tests to opaque pointers	2022-11-28 08:42:38 -05:00
Zain Jaffal	6e4cea55f0	[AArch64] Fix cost model for `udiv` instruction when one of the operands is a uniform constant Currently the model overestimates the cost of a udiv instruction with one constant. The correct cost for a udiv instruction is insert_cost * extract_cost * num_elements Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D135991	2022-11-28 10:38:17 +02:00
Max Kazantsev	06c4103d41	[Test] Add couple more tests where we can compute symbolic max exit count (fixed)	2022-11-25 14:40:32 +07:00
Max Kazantsev	eb95ab5745	Revert "[Test] Add couple more tests where we can compute symbolic max exit count" This reverts commit `7e3373c9e1`. Some changes that were not supposed to be committed came with it.	2022-11-25 13:37:24 +07:00
Max Kazantsev	7e3373c9e1	[Test] Add couple more tests where we can compute symbolic max exit count	2022-11-25 13:35:16 +07:00
Max Kazantsev	b9c1d73725	[Test] Add test showing that SCEV fails to evaluate symbolic max for 'and' conditions	2022-11-25 11:45:10 +07:00
Max Kazantsev	4496d553bd	[SCEV] Fix misplaced \n in printout of max symbolic exit counts	2022-11-25 11:41:36 +07:00
Florian Hahn	ae852750b3	[MemoryLocation] Support memcpy_chk in getForArgument. Similar to `9f9e8ba114`, add support for memcpy_chk to MemoryLocation::getForArgument. The size argument for memcpy_chk is an upper bound for the size of the pointer argument. memcpy_chk may read/write less than the specified length, if it exceeds the specified max size and aborts. Reviewed By: xbolva00, jdoerfert Differential Revision: https://reviews.llvm.org/D138613	2022-11-24 19:17:48 +00:00
Max Kazantsev	e5fa7eb120	[SCEV] Add printout of symbolic max backedge-taken and block exit count We do compute it and use in optimizations, but never print it out. We need to do it in order to be able to track improvements in its computation.	2022-11-24 19:29:58 +07:00
Max Kazantsev	211d941188	[SCEV] Rename max backedge-taken count -> constant max backedge taken-count in printout This is a preparatory step for introducing symbolic max backedge-taken count.	2022-11-24 18:43:42 +07:00
Florian Hahn	4b4cbbd7fb	[BasicAA] Add tests with __memcpy_chk.	2022-11-23 22:09:53 +00:00
Haohai Wen	1215e86a0e	[CostModel][X86] Fix permute latency cost Avx512 permute latency should be 3 instead of 1. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D138427	2022-11-23 19:17:16 +08:00
Haohai Wen	2dfe76e989	[CostModel][X86] Add CostKinds test coverage for shufflevector instruction Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D138485	2022-11-23 10:30:48 +08:00
Florian Hahn	5dad4c6788	[SCEV] Iteratively compute ranges for deeply nested expressions. At the moment, getRangeRef may overflow the stack for very deeply nested expressions. This patch introduces a new getRangeRefIter function, which first builds a worklist of N-ary expressions and phi nodes, followed by their operands iteratively. getRangeRef has been extended to also take a Depth argument and it switches to use getRangeRefIter once the depth reaches a certain threshold. This ensures compile-time is not impacted in general. Note that the iterative algorithm may lead to a slightly different evaluation order, which could result in slightly worse ranges for cyclic phis. https://llvm-compile-time-tracker.com/compare.php?from=23c3eb7cdf3478c9db86f6cb5115821a8f0f5f40&to=e0e09fa338e77e53242bfc846e1484350ad79773&stat=instructions Fixes #49579. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D130728	2022-11-21 21:56:14 +00:00
Florian Hahn	535c2da58d	[SCEV] Add range test with phi and division. Extra test coverage for D130728.	2022-11-21 19:58:43 +00:00
Yeting Kuo	ed9638c44b	[VP][RISCV] Add vp.nearbyint and RISC-V support. nearbyint has the property to execute without exception. For not modifying fflags, the patch added new machine opcode PseudoVFROUND_NOEXCEPT_V that expands vfcvt.x.f.v and vfcvt.f.x.v between a pair of frflags and fsflags. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137685	2022-11-16 14:05:35 +08:00
Yeting Kuo	5c3ca10b09	[VP][RISCV] Add vp.bswap and RISC-V support. The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137928	2022-11-16 11:36:38 +08:00
Roman Lebedev	11abb7fedb	[NFC][X86][Costmodel] Drop redundant interleaved cost test coverage These are already covered by the more general tests I've added.	2022-11-15 21:30:06 +03:00
Roman Lebedev	8e37b53360	[X86] Rewrite `getScalarizationOverhead()` All of our insert/extract ops work on 128-bit lanes. For `Insert`, we need to extract affected 128-bit lane, unless it's being fully overwritten (FIXME: do we need to be careful about legalization-induced padding that we obviously don't demand?), perform insertions, and then insert the 128-bit lane back. But hold on. If we are operating on a 256-bit legal vector, and thus have two 128-bit subvectors, and are fully overwriting them both, we don't actually need to insert both subvectors, only the second one, into the implicitly-widened first one. Also, `Insert` wasn't actually querying the costs, but just assuming them to be `1`. `getShuffleCost(TTI::SK_ExtractSubvector)` notes: ``` // Note that in general, the insertion starting at the beginning of a vector // isn't free, because we need to preserve the rest of the wide vector. ``` ... so as far as I can tell, we didn't account for that. I was hoping this would allow vectorization at a higher VF in one case I looked at, but the subvector insertion cost is still dis-advising that. The change for `Extract` is NFC, and is for consistency only, I wanted to get rid of that weird explicit discounting of insertion of 0'th element, since the general code should already deal with that. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D137913	2022-11-15 21:07:12 +03:00
Nikita Popov	458ae539df	[AST] Remove legacy AliasSetPrinter pass A NewPM version of this pass exists, drop the legacy version of this testing-only pass.	2022-11-14 15:50:38 +01:00
Matt Arsenault	583450fa09	AMDGPU: Fix DivergenceAnalysis for llvm.read_register This was treating all calls as uniform by default, which is wrong if used to read a VGPR.	2022-11-07 10:42:35 -08:00
Matt Arsenault	541041d1ea	AMDGPU: Fix faulty divergence analysis tests These were supposed to be checking that atomics were treated as divergence sources. However, they were using function arguments which are always treated as divergent, so they could have been found divergent for the wrong reason.	2022-11-06 22:14:12 -08:00
Matt Arsenault	f72416e974	AMDGPU: Fix missing divergence tests for csub intrinsics	2022-11-06 22:14:12 -08:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Philip Reames	73482b457e	[RISCV] Fix cost of legal fixed length masked load and stores We can cost them the same way as a scalable masked load/store. By hitting the default path, we were costing them as if they were being scalarized. This is a significant overestimate. Differential Revision: https://reviews.llvm.org/D137218	2022-11-02 07:24:38 -07:00
Nikita Popov	5fe9273c73	[BasicAA] Re-enable cs-cs-arm.ll test (PR58738) Fixes https://github.com/llvm/llvm-project/issues/58738.	2022-11-02 14:22:44 +01:00
Paul Robinson	9a4aa37dbf	Patch up attributes on a newly enabled test	2022-11-01 14:14:40 -07:00
Paul Robinson	4f0a1201a4	[lit][REQUIRES] Fix some tests with incorrect REQUIRES clauses These weren't running anywhere because of bad specifications. One test has bit-rotted and had to be XFAILed, the rest are okay. Differential Revision: https://reviews.llvm.org/D136612	2022-11-01 13:49:23 -07:00
Nikita Popov	6aa672f141	[IR] Take operand bundles into account for call argument readonly/writeonly We currently only take operand bundle effects into account when querying the function-level memory attributes. However, I believe that we also need to do the same for parameter attributes. For example, a call with deopt bundle to a function with readnone parameter attribute cannot treat that parameter as readnone, because the deopt bundle may read it. Differential Revision: https://reviews.llvm.org/D136834	2022-11-01 09:30:03 +01:00
Yeting Kuo	71e4e35581	[VP][RISCV] Add vp.rint and RISC-V support. FRINT uses dynamic rounding mode instead of static rounding mode. The patch renames VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and adds a new ISDNode VFCVT_X_F_VL directly selected to PseudoVFCVT_X_F_V. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136662	2022-11-01 14:52:47 +08:00
Patrick Walton	01859da84b	[AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory(). The pointsToConstantMemory() method returns true only if the memory pointed to by the memory location is globally invariant. However, the LLVM memory model also has the semantic notion of locally-invariant: memory that is known to be invariant for the life of the SSA value representing that pointer. The most common example of this is a pointer argument that is marked readonly noalias, which the Rust compiler frequently emits. It'd be desirable for LLVM to treat locally-invariant memory the same way as globally-invariant memory when it's safe to do so. This patch implements that, by introducing the concept of a ModRefInfo mask. A ModRefInfo mask is a bound on the Mod/Ref behavior of an instruction that writes to a memory location, based on the knowledge that the memory is globally-constant memory (in which case the mask is NoModRef) or locally-constant memory (in which case the mask is Ref). ModRefInfo values for an instruction can be combined with the ModRefInfo mask by simply using the & operator. Where appropriate, this patch has modified uses of pointsToConstantMemory() to instead examine the mask. The most notable optimization change I noticed with this patch is that now redundant loads from readonly noalias pointers can be eliminated across calls, even when the pointer is captured. Internally, before this patch, AliasAnalysis was assigning Ref to reads from constant memory; now AA can assign NoModRef, which is a tighter bound. Differential Revision: https://reviews.llvm.org/D136659	2022-10-31 13:03:41 -07:00
Patrick Walton	81767f2d18	[test][AliasAnalysis] Add some baseline tests in preparation for getModRefInfoMask(). This commit adds some tests in preparation for D136659, which allows alias analysis to treat locally-invariant memory pointed to by readonly noalias pointers the same as globally-invariant memory in some cases. The existing behavior for these tests is marked as expected and will be changed when that diff lands. Differential Revision: https://reviews.llvm.org/D136993	2022-10-29 15:08:54 -07:00
Patrick Walton	f3d49dbcb1	[test] Remove readonly from some parameters that are written through in tests. In D136659 I found a few tests that write through readonly parameters: * Analysis/BasicAA/pr18573.ll: @foo1 writes through %arr.ptr, but declares it readonly. I removed the readonly annotation. * CodeGen/ARM/ParallelDSP/aliasing.ll: @restrict writes through the readonly %arg3, @store_alias_arg3_illegal_1 writes through the readonly %arg3, and @store_alias_arg3_illegal_2 writes through the readonly %arg3. I removed readonly from all three. Also, I added some CHECK-LABEL directives to make it harder for FileCheck output to be mixed up. * Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll: @gather_nxv4i32_ind64_stride2 writes through the readonly %a. I removed the readonly attribute. * Transforms/LoopVectorize/interleaved-accesses.ll: @load_gap_reverse writes through the readonly %P1 and %P2. Also, the corresponding C code in the comment didn't match the test. I removed the readonly attribute from both parameters and corrected the C code. Differential Revision: https://reviews.llvm.org/D136880	2022-10-29 15:05:20 -07:00
Craig Topper	e94dc58dff	[RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven. This avoids the call overhead as well as the save/restore of fflags and the snan handling in the libm function. The save/restore of fflags and snan handling are needed to be correct for -ftrapping-math. I think we can ignore them in the default environment. The inline sequence will generate an invalid exception for nan and an inexact exception if fractional bits are discarded. I've used a custom inserter to explicitly create the control flow around the float->int->float conversion. We can probably avoid the final fsgnj after the conversion for no signed zeros FMF, but I'll leave that for future work. Note the comparison constant is slightly different than glibc uses. They use 1<<53 for double, I'm using 1<<52. I believe either is valid. Numbers >= 1<<52 can't have any fractional bits. It's ok to do the float->int->float conversion on numbers between 1<<53 and 1<<52 since they will all fit in 64. We only have a problem if the double can't fit in i64. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136508	2022-10-26 14:36:49 -07:00
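
The ModRefInfo mask described in commit 01859da84b can be illustrated with a small standalone model. This is only a sketch, not LLVM code: the Python names `ModRefInfo`, `apply_mask`, and the three `*_MASK` constants are hypothetical, and the two-bit encoding (Ref = 1, Mod = 2) is an assumption chosen so that `&` behaves as the commit message describes. It shows how a globally-constant location masks any access down to NoModRef, while a locally-constant location masks it down to at most Ref.

```python
from enum import IntFlag


class ModRefInfo(IntFlag):
    """Toy model of Mod/Ref results as a two-bit set (assumed encoding)."""
    NO_MOD_REF = 0       # neither reads nor writes the location
    REF = 1              # may read the location
    MOD = 2              # may write the location
    MOD_REF = REF | MOD  # may read and write the location


# Hypothetical mask constants, following the commit message:
# globally-constant memory -> NoModRef, locally-constant memory -> Ref.
GLOBALLY_CONSTANT_MASK = ModRefInfo.NO_MOD_REF
LOCALLY_CONSTANT_MASK = ModRefInfo.REF
UNKNOWN_MASK = ModRefInfo.MOD_REF  # nothing known, so nothing is masked off


def apply_mask(info: ModRefInfo, mask: ModRefInfo) -> ModRefInfo:
    """Combine an instruction's Mod/Ref result with a location's mask via &."""
    return info & mask


if __name__ == "__main__":
    conservative = ModRefInfo.MOD_REF  # e.g. an opaque call
    for name, mask in [
        ("unknown", UNKNOWN_MASK),
        ("locally constant", LOCALLY_CONSTANT_MASK),
        ("globally constant", GLOBALLY_CONSTANT_MASK),
    ]:
        print(name, "->", repr(apply_mask(conservative, mask)))
```

In this model a conservative ModRef answer for a call is narrowed to Ref when the pointer is known to be locally invariant (for example a `readonly noalias` argument), which is the tighter bound the commit message refers to.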


3689 Commits