llvm-project

Commit Graph

Author	SHA1	Message	Date
Tres Popp	a003f26539	[llvm] Prevent infinite loop in InstCombine of select statements This fixes an issue where the RHS and LHS the comparison operation creating the predicate were swapped back and forth forever. Differential Revision: https://reviews.llvm.org/D94934	2021-01-19 10:31:48 +01:00
Nikita Popov	5238e7b302	[InstCombine] Replace one-use select operand based on condition InstCombine already performs a fold where X == Y ? f(X) : Z is transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However, if f(X) only has one use, then we can always directly replace the use inside the instruction. To actually be profitable, limit it to the case where Y is a non-expr constant. This could be further extended to replace uses further up a one-use instruction chain, but for now this only looks one level up. Among other things, this also subsumes D94860. Differential Revision: https://reviews.llvm.org/D94862	2021-01-16 23:25:02 +01:00
Nikita Popov	17863614da	[InstCombine] Fold select -> and/or using impliesPoison We can fold a ? b : false to a & b if is_poison(b) implies that is_poison(a), at which point we're able to reuse all the usual fold on ands. In particular, this covers the very common case of icmp X, C && icmp X, C'. The same applies to ors. This currently only has an effect if the -instcombine-unsafe-select-transform=0 option is set. Differential Revision: https://reviews.llvm.org/D94550	2021-01-13 17:45:40 +01:00
Nikita Popov	4a16c507cb	[InstCombine] Disable unsafe select transform behind a flag This disables the poison-unsafe select -> and/or transform behind a flag (we continue to perform the fold by default). This is intended to simplify evaluation and testing while we teach various passes to directly recognize the select pattern. This only disables the main select -> and/or transform. A number of related ones are instead changed to canonicalize to the a ? b : false and a ? true : b forms which represent and/or respectively. This requires a bit of care to avoid infinite loops, as we do not want !a ? b : false to be converted into a ? false : b. The basic idea here is the same as D93065, but keeps the change behind a flag for now. Differential Revision: https://reviews.llvm.org/D93840	2020-12-28 22:43:52 +01:00
Roman Lebedev	f8079355c6	[InstCombine] canonicalizeAbsNabs(): don't propagate NSW flag for NABS patter As Nuno is noting in post-commit review in https://reviews.llvm.org/D87188#2467915 it is not correct to keep NSW for negated abs pattern, so don't do that.	2020-12-24 00:06:09 +03:00
Congzhe Cao	c60a58f8d4	[InstCombine] Add check of i1 types in select-to-zext/sext transformation When doing select-to-zext/sext transformations, we should not handle TrueVal and FalseVal of i1 type otherwise it would result in zext/sext i1 to i1. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93272	2020-12-21 18:46:24 -05:00
Roman Lebedev	897c985e1e	[InstCombine] Canonicalize SPF to abs intrinsic This patch enables canonicalization of SPF_ABS and SPF_ABS to the abs intrinsic. This is a recommit, the original try was `05d4c4ebc2`, but it was reverted due to an apparent miscompile, which since then has just been fixed by the previous commit. Differential Revision: https://reviews.llvm.org/D87188	2020-12-18 21:18:14 +03:00
Jun Ma	ffe84d90e9	[InstCombine][NFC] Change cast of FixedVectorType to dyn_cast.	2020-12-15 20:36:57 +08:00
Jun Ma	2ac58e21a1	[InstCombine] Remove scalable vector restriction when fold SelectInst Differential Revision: https://reviews.llvm.org/D93083	2020-12-15 20:36:57 +08:00
Roman Lebedev	e6f2a79d7a	[InstCombine] canonicalizeSaturatedAdd(): last fold is only valid for strict comparison (PR48390) We could create uadd.sat under incorrect circumstances if a select with -1 as the false value was canonicalized by swapping the T/F values. Unlike the other transforms in the same function, it is not invariant to equality. Some alive proofs: https://alive2.llvm.org/ce/z/emmKKL Based on original patch by David Green! Fixes https://bugs.llvm.org/show_bug.cgi?id=48390 Differential Revision: https://reviews.llvm.org/D92717	2020-12-09 18:19:09 +03:00
Simon Pilgrim	0fe91ad463	[InstCombine] foldSelectFunnelShift - block poison in funnel shift value As raised by @nlopes on D90382 - if this is not a rotate then the select was blocking poison from the 'shift-by-zero' non-TVal, but a funnel shift won't - so freeze it.	2020-11-08 12:58:30 +00:00
Simon Pilgrim	538fdb0189	[InstCombine] foldSelectRotate - generalize to foldSelectFunnelShift This is the last of the rotate->funnel shift InstCombine generalizations for PR46896 We still have foldGuardedRotateToFunnelShift to deal with in AggressiveInstCombine Differential Revision: https://reviews.llvm.org/D90382	2020-10-31 12:32:34 +00:00
Layton Kifer	d49911c282	[InstCombine][NFC] Use ConstantExpr::getBinOpIdentity Delete duplicate implementation getSelectFoldableConstant and replace with ConstantExpr::getBinOpIdentity. Differential Revision: https://reviews.llvm.org/D89839	2020-10-22 20:44:57 +02:00
Simon Pilgrim	981fdf01d5	[InstCombine] foldSelectRotate - canonicalize to OR(SHL,LSHR). NFCI. Match the canonicalization code that was added to matchFunnelShift at rG02295e6d1a15	2020-10-16 13:18:53 +01:00
Juneyoung Lee	9b3c2a72e4	[ValueTracking] Use assume's noundef operand bundle This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89219	2020-10-14 20:16:33 +09:00
Nikita Popov	3641d375f6	[InstCombine] Handle GEP inbounds in select op replacement (PR47730) When retrying the "simplify with operand replaced" select optimization without poison flags, also handle inbounds on GEPs. Of course, this particular example would also be safe to transform while keeping inbounds, but the underlying machinery does not know this (yet).	2020-10-05 21:13:02 +02:00
Nikita Popov	9d1c8c0ba9	[InstCombine] Fix select operand simplification with undef (PR47696) When replacing X == Y ? f(X) : Z with X == Y ? f(Y) : Z, make sure that Y cannot be undef. If it may be undef, we might end up picking a different value for undef in the comparison and the select operand.	2020-10-01 21:15:48 +02:00
Nikita Popov	13e19d2e7c	Revert "[InstCombine] Canonicalize SPF_ABS to abs intrinc" This reverts commit `05d4c4ebc2`. mstorsjo reports a miscompile after this change in https://reviews.llvm.org/D87188#2281093. Reverting until I can investigate this.	2020-09-18 09:38:26 +02:00
Nikita Popov	05d4c4ebc2	[InstCombine] Canonicalize SPF_ABS to abs intrinc Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic. To be conservative, the one-use check on the comparison is retained, this may be relaxed if all goes well. It's pretty likely that this will uncover places that missing handling for the abs() intrinsic. Please report any seen performance regressions. Differential Revision: https://reviews.llvm.org/D87188	2020-09-17 22:28:34 +02:00
Nikita Popov	222bf3ffbc	Reapply [InstCombine] Simplify select operand based on equality condition Reapply after fixing SimplifyWithOpReplaced() to never return the original value, which would lead to an infinite loop in this transform. ----- For selects of the type X == Y ? A : B, check if we can simplify A by using the X == Y equality and replace the operand if that's possible. We already try to do this in InstSimplify, but will only fold if the result of the simplification is the same as B, in which case the select can be dropped entirely. Here the select will be retained, just one operand simplified. As we are performing an actual replacement here, we don't have problems with refinement / poison values. Differential Revision: https://reviews.llvm.org/D87480	2020-09-16 20:53:58 +02:00
Benjamin Kramer	b768546fe0	Revert "[InstCombine] Simplify select operand based on equality condition" This reverts commit `cfff88c03c`. Sends instcombine into an infinite loop. ``` define i1 @foo(i32 %arg, i32 %arg1) { bb: %tmp = udiv i32 %arg, %arg1 %tmp2 = mul nsw i32 %tmp, %arg1 %tmp3 = icmp eq i32 %tmp2, %arg %tmp4 = select i1 %tmp3, i32 %tmp, i32 undef %tmp5 = icmp sgt i32 %tmp4, 255 ret i1 %tmp5 } ```	2020-09-15 12:22:47 +02:00
Nikita Popov	cfff88c03c	[InstCombine] Simplify select operand based on equality condition For selects of the type X == Y ? A : B, check if we can simplify A by using the X == Y equality and replace the operand if that's possible. We already try to do this in InstSimplify, but will only fold if the result of the simplification is the same as B, in which case the select can be dropped entirely. Here the select will be retained, just one operand simplified. As we are performing an actual replacement here, we don't have problems with refinement / poison values. Differential Revision: https://reviews.llvm.org/D87480	2020-09-14 20:07:06 +02:00
Nikita Popov	36e2e2e12e	[InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322) This is a followup to D86834, which partially fixed this issue in InstSimplify. However, InstCombine repeats the same transform while dropping poison flags -- which does not cover cases where poison is introduced in some other way. The fix here is a bit more comprehensive, because things are quite entangled, and it's hard to only partially address it without regressing optimization. There are really two changes here: * Export the SimplifyWithOpReplaced API from InstSimplify, with an added AllowRefinement flag. For replacements inside the TrueVal we don't actually care whether refinement occurs or not, the replacement is always legal. This part of the transform is now done in InstSimplify only. (It should be noted that the current AllowRefinement check is not sufficient -- that's an issue we need to address separately.) * Change the InstCombine fold to work by temporarily dropping poison generating flags, running the fold and then restoring the flags if it didn't work out. This will ensure that the InstCombine fold is correct as long as the InstSimplify fold is correct. Differential Revision: https://reviews.llvm.org/D87445	2020-09-12 14:45:06 +02:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Simon Pilgrim	f13e92d4b2	[InstCombine] Use CreateVectorSplat(ElementCount) variant directly This was introduced at rGe20223672100, and the CreateVectorSplat(unsigned NumElements) variant calls it internally	2020-08-08 19:26:02 +01:00
Juneyoung Lee	b6d9add71b	[InstCombine] Optimize select(freeze(icmp eq/ne x, y), x, y) This patch adds an optimization that folds select(freeze(icmp eq/ne x, y), x, y) to x or y. This was needed to resolve slowdown after D84940 is applied. I tried to bake this logic into foldSelectInstWithICmp, but it wasn't clear. This patch conservatively writes the pattern in a separate function, foldSelectWithFrozenICmp. The output does not need freeze; https://alive2.llvm.org/ce/z/X49hNE (from @nikic) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85533	2020-08-08 15:22:29 +09:00
Vitaly Buka	89051ebace	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Sanjay Patel	750f4c591d	[InstCombine] allow peeking through zext of shift amount to match rotate idioms (PR45701) We might want to also allow trunc of the shift amount, but that seems less likely? define i32 @src(i32 %x, i1 %y) { %0: %rem = and i1 %y, 1 %cmp = icmp eq i1 %rem, 0 %sh_prom = zext i1 %rem to i32 %sub = sub nsw nuw i1 0, %rem %sh_prom1 = zext i1 %sub to i32 %shr = lshr i32 %x, %sh_prom1 %shl = shl i32 %x, %sh_prom %or = or i32 %shl, %shr %r = select i1 %cmp, i32 %x, i32 %or ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %t = zext i1 %y to i32 %r = fshl i32 %x, i32 %x, i32 %t ret i32 %r } Transformation seems to be correct! https://alive2.llvm.org/ce/z/xgMvE3 http://bugs.llvm.org/PR45701	2020-07-20 16:18:11 -04:00
Max Kazantsev	c989881078	[InstCombine] Fix replace select with Phis when branch has the same labels ``` define i32 @test(i1 %cond) { entry: br i1 %cond, label %exit, label %exit exit: %result = select i1 %cond, i32 123, i32 456 ret i32 %result } ``` In this test, after applying transformation of replacing select with Phis, the result will be: ``` define i32 @test(i1 %cond) { entry: br i1 %cond, label %exit, label %exit exit: %result = i32 phi [123, %exit], [123, %exit] ret i32 %result } ``` That is, select is transformed into an invalid Phi, which will then be reduced to 123 and the second value will be lost. But it is worth noting that this problem will arise only if select is in the InstCombine worklist will be before the branch. Otherwise, InstCombine will replace the branch condition with false and transformation will not be applied. The fix is to check the target labels in the branch condition for equality. Patch By: Kirill Polushin Differential Revision: https://reviews.llvm.org/D84003 Reviewed By: mkazantsev	2020-07-17 14:04:58 +07:00
Max Kazantsev	e808cab824	[InstCombine] Improve select -> phi canonicalization: consider more blocks We can try to replace select with a Phi not in its parent block alone, but also in blocks of its arguments. We benefit from it when select's argument is a Phi. Differential Revision: https://reviews.llvm.org/D83284 Reviewed By: nikic	2020-07-13 11:40:32 +07:00
Roman Lebedev	c3b8bd1eea	[InstCombine] Always try to invert non-canonical predicate of an icmp Summary: The actual transform i was going after was: https://rise4fun.com/Alive/Tp9H ``` Name: zz Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0 %t0 = and i8 %x, C0 %r = icmp eq i8 %t0, C1 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 Name: zz Pre: isPowerOf2(C0) %t0 = and i8 %x, C0 %r = icmp ne i8 %t0, 0 => %t = icmp eq i8 %t0, 0 %r = xor i1 %t, -1 ``` but as it can be seen from the current tests, we already canonicalize most of it, and we are only missing handling multi-use non-canonical icmp predicates. If we have both `!=0` and `==0`, even though we can CSE them, we end up being stuck with them. We should canonicalize to the `==0`. I believe this is one of the cleanup steps i'll need after `-scalarizer` if i end up proceeding with my WIP alloca promotion helper pass. Reviewers: spatel, jdoerfert, nikic Reviewed By: nikic Subscribers: zzheng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83139	2020-07-04 18:12:04 +03:00
Sanjay Patel	63774642af	[InstCombine] add one-use check to cast+select narrowing transform Prevent increasing the instruction count.	2020-07-03 11:54:09 -04:00
Max Kazantsev	1eeb714787	[InstCombine] Combine select & Phi by same condition This patch transforms ``` p = phi [x, y] s = select cond, z, p ``` with ``` s = phi[x, z] ``` if we can prove that the Phi node takes values basing on select's condition. Differential Revision: https://reviews.llvm.org/D82072 Reviewed By: nikic	2020-06-25 10:44:10 +07:00
Max Kazantsev	9bff376e5c	[InstCombine] Replace selects with Phis We can sometimes replace a select with a Phi node if all of its values are available on respective incoming edges. Differential Revision: https://reviews.llvm.org/D82005 Reviewed By: nikic	2020-06-23 12:12:59 +07:00
Sanjay Patel	192cb71836	[InstCombine] avoid crashing on select-shuffle detection As mentioned in the post-commit comments of D81013 - the mask check API has to assume the shuffle is not length-changing, but we have not ruled that out in this code. Use the ShuffleVectorInst call instead.	2020-06-04 17:27:14 -04:00
Sanjay Patel	8a96c1f627	[InstCombine] move vector select ahead of select-shuffle select Cond, (shuf_sel X, Y), X --> shuf_sel X, (select Cond, Y, X) A select of a select-shuffle ("blend" in x86 lingo) can be reversed so that the select is done first. This is a more limited version of what I was trying in D80658, but it enables existing demanded bits transforms to catch some of the motivating cases. The tricky bit in that seems to be that by moving the shuffle later, we can always guarantee that poison is correctly inhibited by the shuffle mask in the final value. Alive2 checks for the basic tests: http://volta.cs.utah.edu:8080/z/Qqd3RK http://volta.cs.utah.edu:8080/z/S4wchM http://volta.cs.utah.edu:8080/z/wf9zPL http://volta.cs.utah.edu:8080/z/wJeEGk Differential Revision: https://reviews.llvm.org/D81013	2020-06-04 14:29:13 -04:00
Sanjay Patel	26ebe936f3	[InstCombine] fix use of base VectorType; NFC SimplifyDemandedVectorElts() bails out on ScalableVectorType anyway, but we can exit faster with the external check. Move this to a helper function because there are likely other vector folds that we can try here.	2020-06-01 14:28:31 -04:00
Sanjay Patel	7eed772a27	[PatternMatch] abbreviate vector inst matchers; NFC Readability is not reduced with these opcodes/match lines, so reduce odds of awkward wrapping from 80-col limit.	2020-05-24 09:19:47 -04:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X \| C) --> (X & ~C) \| (Cond ? 0 : C) Cond ? (X \| C) : (X & ~C) --> (X & ~C) \| (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Sanjay Patel	7fa150203f	[InstCombine] fix miscompile from multi-use cttz/ctlz transform PR45762: https://bugs.llvm.org/show_bug.cgi?id=45762	2020-05-01 13:52:24 -04:00
Benjamin Kramer	cc035d475f	Upgrade users of 'new ShuffleVectorInst' to pass indices as an int array No functionality change intended.	2020-04-15 14:29:43 +02:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Nikita Popov	b7fe795e5b	[InstCombine] Use replaceOperand() in some select transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-03-31 22:10:55 +02:00
Nikita Popov	26fa33755f	[InstCombine] Simplify select of cmpxchg transform Rather than converting to a dummy select with equal true and false ops, just directly return the resulting value. As a side-effect, this fixes missing DCE of the previously replaced operand.	2020-03-29 18:57:32 +02:00
Nikita Popov	1e363023b8	[InstCombine] Use replaceOperand() in a few more places To make sure the old operands get DCEd. NFC apart from worklist order changes.	2020-03-29 18:01:00 +02:00
Simon Moll	d871ef4e6a	[instcombine] remove fsub to fneg hacks; only emit fneg Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp negation. This also extends the scalarization cost in instcombine for unary operators to result in the same IR rewrites for fneg as for the idiom. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75467	2020-03-10 16:57:02 +01:00
Simon Moll	ddd11273d9	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Nikita Popov	c9540fe59b	[InstCombine] Fix multi-use handling in cttz transform The select-of-cttz transform can currently duplicate cttz intrinsics and zext/trunc ops. The cause is that it unnecessarily duplicates the intrinsic and the zext/trunc when setting the "undef_on_zero" flag to false. However, it's always legal to set the flag from true to false, so we can make this replacement even if there are extra users. Differential Revision: https://reviews.llvm.org/D74685	2020-02-18 17:55:00 +01:00
Nikita Popov	5a8819b216	[InstCombine] Use replaceOperand() in more places This is a followup to D73803, which uses the replaceOperand() helper in more places. This should be NFC apart from changes to worklist order. Differential Revision: https://reviews.llvm.org/D73919	2020-02-11 17:38:23 +01:00
Sanjay Patel	62ce7e650a	[InstCombine] fix use check when canonicalizing abs/nabs We were checking for extra uses of the negated operand even if we were not going to create it as part of this canonicalization. This was showing up as a regression when we limit EarlyCSE as proposed in D74285.	2020-02-10 14:57:37 -05:00
Nikita Popov	a148b9e990	[InstCombine] Fix infinite min/max canonicalization loop (PR44541) While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541, it does so in a more roundabout manner and there might be other loopholes to trigger the same issue. This is a more direct fix, that prevents the transform if the min/max is based on a non-canonical sub X, 0 instruction. Differential Revision: https://reviews.llvm.org/D73849	2020-02-08 20:42:17 +01:00
Nikita Popov	9d03b7d0d0	[InstCombine] Use swapValues(); NFC Less code, and makes it more obvious that these operands do not need to be added back to the worklist.	2020-02-08 16:57:28 +01:00
Nikita Popov	e6c9ab4fb7	[InstCombine] Rename worklist methods; NFC This renames Worklist.AddDeferred() to Worklist.add() and Worklist.Add() to Worklist.push(). The intention here is that Worklist.add() should be the go-to method for explicit worklist management, while the raw Worklist.push() is mostly for InstCombine internals. I will then migrate uses of Worklist.push() to Worklist.add() in followup changes. As suggested by spatel on D73411 I'm also changing the remaining method names to lowercase first character, in line with current coding standards. Differential Revision: https://reviews.llvm.org/D73745	2020-02-03 18:56:51 +01:00
Sanjay Patel	7bee94410c	[InstCombine] form copysign from select of FP constants (PR44153) This should be the last step needed to solve the problem in the description of PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153 If we're casting an FP value to int, testing its signbit, and then choosing between a value and its negated value, that's a complicated way of saying "copysign": (bitcast X) < 0 ? -TC : TC --> copysign(TC, X) Differential Revision: https://reviews.llvm.org/D72643	2020-01-20 10:51:14 -05:00
David Green	59b56e5c57	[InstCombine] Expand usub_sat patterns to handle constants The constants come through as add %x, -C, not a sub as would be expected. They need some extra matchers to canonicalise them towards usub_sat. Differential Revision: https://reviews.llvm.org/D69514	2019-11-30 16:58:01 +00:00
David Green	3a1bef5616	[InstCombine] Adjust usub_sat fold one use checks This adjusts the one use checks in the the usub_sat fold code to not increase instruction count, but otherwise do the fold. Reviewed as a part of D69514.	2019-11-30 16:58:00 +00:00
David Green	08390c52a2	[InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sats. Differential Revision: https://reviews.llvm.org/D69753	2019-11-17 10:45:11 +00:00
David Green	03fce6b12e	[InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat This adds to D69245, adding extra signed patterns for folding from a sadd_with_overflow to a sadd_sat. These are more complex than the unsigned patterns, as the overflow can occur in either direction. For the add case, the positive overflow can only occur if both of the values are positive (same for both the values being negative). So there is an extra select on whether to use the positive or negative overflow limit. Differential Revision: https://reviews.llvm.org/D69252	2019-11-17 10:42:39 +00:00
Sanjay Patel	3d6b53980c	[InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select As noted by the FIXME comment, this is not correct based on our current FMF semantics. We should be propagating FMF from the final value in a sequence (in this case the 'select'). So the behavior even without this patch is wrong, but we did not allow FMF on 'select' until recently. But if we do the correct thing right now in this patch, we'll inevitably introduce regressions because we have not wired up FMF propagation for 'phi' and 'select' in other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a better incremental way to make progress. That said, the potential extra damage over the existing wrong behavior from this patch is very limited. AFAIK, the only way to have different FMF on IR in the same function is if we have LTO inlined IR from 2 modules that were compiled using different fast-math settings. As seen in the tests, we may actually see some improvements with this patch because adding the FMF to the 'select' allows matching to min/max intrinsics that were previously missed (in the common case, the 'fcmp' and 'select' should have identical FMF to begin with). Next steps in the transition: Make similar changes in instcombine as needed. Enable phi-to-select FMF propagation in SimplifyCFG. Remove dependencies on fcmp with FMF. Deprecate FMF on fcmp. Differential Revision: https://reviews.llvm.org/D69720	2019-11-13 10:38:42 -05:00
Sanjay Patel	a2240f57e7	[InstCombine] simplify fcmp+select canonicalization; NFCI We had 2 blocks of code that are nearly identical. Existing regression tests should cover both of the patterns.	2019-10-31 13:13:32 -04:00
David Green	a5f7bc0de7	[InstCombine] Canonicalize uadd.with.overflow to uadd.sat This adds some patterns to transform uadd.with.overflow to uadd.sat (with usub.with.overflow to usub.sat too). The patterns selects from UINTMAX (or 0 for subs) depending on whether the operation overflowed. Signed patterns are a little more involved (they can wrap in two directions), but can be added here in a followup patch too. Differential Revision: https://reviews.llvm.org/D69245	2019-10-31 12:45:38 +00:00
David Green	bf21f0d489	[InstCombine] Extra combine for uadd_sat This is an extra fold for a canonical form of uadd_sat, as shown in D68651. It essentially selects uadd from an add and a select. Differential Revision: https://reviews.llvm.org/D69244	2019-10-28 15:21:16 +00:00
David Green	186155b89c	[InstCombine] Signed saturation patterns This adds an instcombine matcher for code that attempts to perform signed saturating arithmetic by casting to a higher type. Unsigned cases are already matched, this adds extra matches for the more complex signed cases, which involves matching the min(max(add a b)) nodes with proper extends to ensure legality. Differential Revision: https://reviews.llvm.org/D68651 llvm-svn: 375505	2019-10-22 15:39:47 +00:00
Sanjay Patel	eb8d39e113	[InstCombine] allow icmp+binop folds before min/max bailout (PR43310) This has the potential to uncover missed analysis/folds as shown in the min/max code comment/test, but fewer restrictions on icmp folds should be better in general to solve cases like: https://bugs.llvm.org/show_bug.cgi?id=43310 llvm-svn: 372510	2019-09-22 14:31:53 +00:00
Simon Pilgrim	284118ce3b	InstCombiner::visitSelectInst - rename Pred to MinMaxPred to stop shadow variable warning. NFCI. We have a lot of Predicate variables, all similarly named.... llvm-svn: 370207	2019-08-28 14:05:38 +00:00
David Bolvansky	0c2692108c	[InstCombine] Fold select with ctlz to cttz Summary: Handle pattern [0]: int ctz(unsigned int a) { int c = __clz(a & -a); return a ? 31 - c : c; } In reality, the compiler can generate much better code for cttz, so fold away this pattern. https://godbolt.org/z/c5kPtV [0] https://community.arm.com/community-help/f/discussions/2114/count-trailing-zeros Reviewers: spatel, nikic, lebedev.ri, dmgreen, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66308 llvm-svn: 370037	2019-08-27 10:22:40 +00:00
Roman Lebedev	9cf08c6de1	[Constant] Add 'isElementWiseEqual()' method Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison(). Return true if this constant and a constant 'Y' are element-wise equal. This is identical to just comparing the pointers, with the exception that for vectors, if only one of the constants has an `undef` element in some lane, the constants still match. llvm-svn: 369842	2019-08-24 06:49:51 +00:00
Roman Lebedev	2c75fe7f2a	[InstCombine] Try to reuse constant from select in leading comparison Summary: If we have e.g.: ``` %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 ``` the constants `65535` and `65536` are suspiciously close. We could perform a transformation to deduplicate them: ``` Name: ult %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 => %t.inv = icmp ugt i32 %x, 65535 %r = select i1 %t.inv, i32 65535, i32 %y ``` https://rise4fun.com/Alive/avb While this may seem esoteric, this should certainly be good for vectors (less constant pool usage) and for opt-for-size - need to have only one constant. But the real fun part here is that it allows further transformation, in particular it finishes cleaning up the `clamp` folding, see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`. We start with e.g. ``` %dont_need_to_clamp_positive = icmp sle i32 %X, 32767 %dont_need_to_clamp_negative = icmp sge i32 %X, -32768 %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767 %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit ``` without this patch we currently produce ``` %1 = icmp slt i32 %X, 32768 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1, i32 %3, i32 32767 ``` which isn't really a `clamp` - both comparisons are performed on the original value, this patch changes it into ``` %1.inv = icmp sgt i32 %X, 32767 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1.inv, i32 32767, i32 %3 ``` and then the magic happens! Some further transform finishes polishing it and we finally get: ``` %t1 = icmp sgt i32 %X, -32768 %t2 = select i1 %t1, i32 %X, i32 -32768 %t3 = icmp slt i32 %t2, 32767 %R = select i1 %t3, i32 %t2, i32 32767 ``` which is beautiful and just what we want. Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization: https://rise4fun.com/Alive/THl Proofs for the fold itself: https://rise4fun.com/Alive/THl Reviewers: spatel, dmgreen, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66232 llvm-svn: 369840	2019-08-24 06:49:25 +00:00
Sanjay Patel	f99d254aae	[InstCombine] simplify min/max of min/max with same operands (PR35607) This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607 As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places. There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false. https://rise4fun.com/Alive/3KVc Name: smax(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: smin(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %min, i32 %max => %r = %min Name: umax(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: umin(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %min, i32 %max => %r = %min llvm-svn: 369386	2019-08-20 13:39:17 +00:00
Sanjay Patel	39eb2324f7	[InstCombine] canonicalize a scalar-select-of-vectors to vector select This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755 ...but we should handle this pattern to make things easier for the backend either way. For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization). For example, if the condition of the select is a scalar, we end up with something like this on x86: vpcmpgtd %xmm0, %xmm1, %xmm0 vpextrb $12, %xmm0, %eax testb $1, %al jne LBB0_2 ## %bb.1: vmovaps %xmm3, %xmm2 LBB0_2: vmovaps %xmm2, %xmm0 Rather than the splat-condition variant: vpcmpgtd %xmm0, %xmm1, %xmm0 vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3] vblendvps %xmm0, %xmm2, %xmm3, %xmm0 Differential Revision: https://reviews.llvm.org/D66095 llvm-svn: 369140	2019-08-16 18:51:30 +00:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Sanjay Patel	9ce5f41851	[InstCombine] fold cmp+select using select operand equivalence As discussed in PR42696: https://bugs.llvm.org/show_bug.cgi?id=42696 ...but won't help that case yet. We have an odd situation where a select operand equivalence fold was implemented in InstSimplify when it could have been done more generally in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand. Here's an example: https://rise4fun.com/Alive/Xplr %cmp = icmp eq i32 %x, 2147483647 %add = add nsw i32 %x, 1 %sel = select i1 %cmp, i32 -2147483648, i32 %add => %sel = add i32 %x, 1 I've left the InstSimplify code in place for now, but my guess is that we'd prefer to remove that as a follow-up to save on code duplication and compile-time. Differential Revision: https://reviews.llvm.org/D65576 llvm-svn: 367695	2019-08-02 17:39:32 +00:00
Roman Lebedev	0efeaa8162	[IR] SelectInst: add swapValues() utility Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547	2019-08-01 12:31:35 +00:00
David Bolvansky	9f0d718c66	[InstCombine] Disable fold from D64285 for non-integer types llvm-svn: 365959	2019-07-12 21:14:21 +00:00
David Bolvansky	af1b3185f5	[InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y)) Summary: (select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y)) -> ashr (X, Y)) (select (icmp slt x, 1), ashr (X, Y), lshr (X, Y)) -> ashr (X, Y)) Fixes PR41173 Alive proof by @lebedev.ri (thanks) Name: PR41173 %cmp = icmp slt i32 %x, 1 %shr = lshr i32 %x, %y %shr1 = ashr i32 %x, %y %retval.0 = select i1 %cmp, i32 %shr1, i32 %shr => %retval.0 = ashr i32 %x, %y Optimization: PR41173 Done: 1 Optimization is correct! Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: nikic, craig.topper, llvm-commits, lebedev.ri Tags: #llvm Differential Revision: https://reviews.llvm.org/D64285 llvm-svn: 365893	2019-07-12 11:31:16 +00:00
Sanjay Patel	706b48251f	[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics This is the opposite direction of D62158 (we have to choose 1 form or the other). Now that we have FMF on the select, this becomes more palatable. And the benefits of having a single IR instruction for this operation (less chances of missing folds based on extra uses, etc) overcome my previous comments about the potential advantage of larger pattern matching/analysis. Differential Revision: https://reviews.llvm.org/D62414 llvm-svn: 364721	2019-06-30 13:40:31 +00:00
Sanjay Patel	9650c95b7e	[InstCombine] allow unordered preds when canonicalizing to fabs() We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954	2019-06-10 15:39:00 +00:00
Sanjay Patel	85de9634e6	[InstCombine] fix bug in canonicalization to fabs() Forgot to translate the predicate clauses in rL362943. llvm-svn: 362945	2019-06-10 14:57:45 +00:00
Sanjay Patel	8b6d9f60ed	[InstCombine] change canonicalization to fabs() to use FMF on fsub Similar to rL362909: This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand. llvm-svn: 362943	2019-06-10 14:46:36 +00:00
Sanjay Patel	8cd8c5784b	[InstCombine] allow unordered preds when canonicalizing to fabs() PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179 llvm-svn: 362937	2019-06-10 14:14:51 +00:00
Sanjay Patel	87cd16a86e	[InstCombine] change canonicalization to fabs() to use FMF on fneg This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fneg (fsub) because they have the same operand. This works around the most glaring FMF logical inconsistency cited in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 llvm-svn: 362909	2019-06-09 16:22:01 +00:00
Sanjay Patel	a6019d5164	[InstCombine] sink FP negation of operands through select We don't always get this: Cond ? -X : -Y --> -(Cond ? X : Y) ...even with the legacy IR form of fneg in the case with extra uses, and we miss matching with the newer 'fneg' instruction because we are expecting binops through the rest of the path. Differential Revision: https://reviews.llvm.org/D61604 llvm-svn: 360075	2019-05-06 20:34:05 +00:00
Sanjay Patel	a64bd09ec4	[InstCombine] reduce code duplication; NFC llvm-svn: 360059	2019-05-06 17:39:18 +00:00
Nikita Popov	7462303e06	[InstCombine] Use uadd.sat and usub.sat for canonicalization Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 llvm-svn: 357103	2019-03-27 17:56:15 +00:00
Sanjay Patel	1f65903dc1	[InstCombine] move add after smin/smax Follow-up to rL355221. This isn't specifically called for within PR14613, but we'll get there eventually if it's not already requested in some other bug report. https://rise4fun.com/Alive/5b0 Name: smax Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i8 %x, C0 %cond = icmp sgt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp sgt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nsw i8 %u2, C0 Name: smin Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i32 %x, C0 %cond = icmp slt i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp slt i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nsw i32 %u2, C0 llvm-svn: 355272	2019-03-02 16:45:10 +00:00
Sanjay Patel	6e1e7e1c3e	[InstCombine] move add after umin/umax In the motivating cases from PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 ...moving the add enables us to narrow the min/max which eliminates zext/trunc which enables signficantly better vectorization. But that bug is still not completely fixed. https://rise4fun.com/Alive/5KQ Name: umax Pre: C1 u>= C0 %a = add nuw i8 %x, C0 %cond = icmp ugt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp ugt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nuw i8 %u2, C0 Name: umin Pre: C1 u>= C0 %a = add nuw i32 %x, C0 %cond = icmp ult i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp ult i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nuw i32 %u2, C0 llvm-svn: 355221	2019-03-01 19:42:40 +00:00
Sanjay Patel	e8bf0f79bd	[InstCombine] canonicalize more unsigned saturated add with 'not' Yet another pattern variation suggested by: https://bugs.llvm.org/show_bug.cgi?id=14613 There are 8 more potential commuted patterns here on top of the 8 that were already handled (rL354221, rL354276, rL354393). We have the obvious commute of the 'add' + commute of the cmp predicate/operands (ugt/ult) + commute of the select operands: Name: base %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %x, %y %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %y, %x %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %y, %x %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt + commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %x, %y %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/den llvm-svn: 354887	2019-02-26 15:18:49 +00:00
Sanjay Patel	c1e0184317	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354393	2019-02-19 22:14:21 +00:00
Sanjay Patel	dcb93c0dda	[InstCombine] rearrange saturated add folds; NFC This is no-functional-change-intended, but that was also true when it was part of rL354276, and I managed to lose 2 predicates for the fold with constant...causing much bot distress. So this time I'm adding a couple of negative tests to avoid that. llvm-svn: 354384	2019-02-19 21:46:13 +00:00
Sanjay Patel	8a35d339c9	Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op" This reverts commit `079b610c29`. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277	2019-02-18 16:04:22 +00:00
Sanjay Patel	079b610c29	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276	2019-02-18 15:21:39 +00:00
Sanjay Patel	b341ee7071	[InstCombine] reduce more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224	2019-02-17 16:48:50 +00:00
Sanjay Patel	bee2073542	[InstCombine] reduce unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221	2019-02-17 15:58:48 +00:00
Sanjay Patel	18db56209c	[InstCombine] canonicalize cmp/select form of uadd saturate with constant I'm circling back around to a loose end from D51929. The backend (either CGP or DAG) doesn't recognize this pattern, so we end up with different asm for these IR variants. Regardless of any future changes to canonicalize to saturation/overflow intrinsics, we want to get raw IR variations into the minimal number of raw IR forms. If/when we can canonicalize to intrinsics, that will make that step easier. Pre: C2 == ~C1 %a = add i32 %x, C1 %c = icmp ugt i32 %x, C2 %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, C1 %c2 = icmp ult i32 %x, C2 %r = select i1 %c2, i32 %a, i32 -1 https://rise4fun.com/Alive/pkH Differential Revision: https://reviews.llvm.org/D57352 llvm-svn: 352536	2019-01-29 20:02:45 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Sanjay Patel	d023dd60e9	[InstCombine] canonicalize another raw IR rotate pattern to funnel shift This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction. llvm-svn: 350672	2019-01-08 22:39:55 +00:00
Nikita Popov	65038515ee	[InstCombine] Relax cttz/ctlz with select on zero The cttz/ctlz intrinsics have a parameter specifying whether the result is undefined for zero. cttz(x, false) can be relaxed to cttz(x, true) if x is known non-zero, and in fact such an optimization is already performed. However, this currently doesn't work if x is non-zero as a result of a select rather than an explicit branch. This patch adds handling for this case, thus allowing x != 0 ? cttz(x, false) : y to simplify to x != 0 ? cttz(x, true) : y. Differential Revision: https://reviews.llvm.org/D55786 llvm-svn: 350463	2019-01-05 09:48:16 +00:00
Sanjay Patel	654e6aabb9	[InstCombine] canonicalize raw IR rotate patterns to funnel shift The final piece of IR-level analysis to allow this was committed with: rL350188 Using the intrinsics should improve transforms based on cost models like vectorization and inlining. The backend should be prepared too, so we can now canonicalize more sequences of shift/logic to the intrinsics and know that the end result should be equal or better to the original code even if the target does not have an actual rotate instruction. llvm-svn: 350199	2019-01-01 21:51:39 +00:00
Sanjay Patel	d802270808	[InstSimplify] fold select with implied condition This is an almost direct move of the functionality from InstCombine to InstSimplify. There's no reason not to do this in InstSimplify because we never create a new value with this transform. (There's a question of whether any dominance-based transform belongs in either of these passes, but that's a separate issue.) I've changed 1 of the conditions for the fold (1 of the blocks for the branch must be the block we started with) into an assert because I'm not sure how that could ever be false. We need 1 extra check to make sure that the instruction itself is in a basic block because passes other than InstCombine may be using InstSimplify as an analysis on values that are not wired up yet. The 3-way compare changes show that InstCombine has some kind of phase-ordering hole. Otherwise, we would have already gotten the intended final result that we now show here. llvm-svn: 347896	2018-11-29 18:44:39 +00:00
Mandeep Singh Grang	0905fc77c1	[InstCombine] Remove a couple of asserts based on incorrect assumptions Summary: These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same. This fixes PR39595. Reviewers: craig.topper, spatel, dmgreen Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54359 llvm-svn: 346874	2018-11-14 17:55:07 +00:00
Sanjay Patel	f8f12272e8	[InstCombine] canonicalize rotate patterns with cmp/select The cmp+branch variant of this pattern is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 ...and as discussed there, we probably can't transform that without a rotate intrinsic. We do have that now via funnel shift, but we're not quite ready to canonicalize IR to that form yet. The case with 'select' should already be transformed though, so that's this patch. The sequence with negation followed by masking is what we use in the backend and partly in clang (though that part should be updated). https://rise4fun.com/Alive/TplC %cmp = icmp eq i32 %shamt, 0 %sub = sub i32 32, %shamt %shr = lshr i32 %x, %shamt %shl = shl i32 %x, %sub %or = or i32 %shr, %shl %r = select i1 %cmp, i32 %x, i32 %or => %neg = sub i32 0, %shamt %masked = and i32 %shamt, 31 %maskedneg = and i32 %neg, 31 %shl2 = lshr i32 %x, %masked %shr2 = shl i32 %x, %maskedneg %r = or i32 %shl2, %shr2 llvm-svn: 346807	2018-11-13 22:47:24 +00:00
Sanjay Patel	1440107821	[InstSimplify] fold select (fcmp X, Y), X, Y This is NFCI for InstCombine because it calls InstSimplify, so I left the tests for this transform there. As noted in the code comment, we can allow this fold more often by using FMF and/or value tracking. llvm-svn: 346169	2018-11-05 21:51:39 +00:00
Sanjay Patel	87aa10062c	[InstCombine] loosen FP 0.0 constraint for fcmp+select substitution It looks like we correctly removed edge cases with 0.0 from D50714, but we were a bit conservative because getBinOpIdentity() doesn't distinguish between +0.0 and -0.0 and 'nsz' is effectively always true for fcmp (see discussion in: https://bugs.llvm.org/show_bug.cgi?id=38086 Without this change, we would get regressions by canonicalizing to +0.0 in all fcmp, and that's a step towards solving: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346143	2018-11-05 16:50:44 +00:00
Sanjay Patel	747feb28e4	[InstCombine] use 'match' to handle vectors and simplify code This is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345036	2018-10-23 15:05:12 +00:00
Sanjay Patel	ad76c682c7	[InstCombine] swap select profile metadata when swapping select ops llvm-svn: 345034	2018-10-23 14:43:31 +00:00
Neil Henning	57f5d0a885	[IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle. The IRBuilder CreateIntrinsic method wouldn't allow you to specify the types that you wanted the intrinsic to be mangled with. To fix this I've: - Added an ArrayRef<Type > member to both CreateIntrinsic overloads. - Used that array to pass into the Intrinsic::getDeclaration call. - Added a CreateUnaryIntrinsic to replace the most common use of CreateIntrinsic where the type was auto-deduced from operand 0. - Added a bunch more unit tests to test CreateIntrinsic calls that weren't being tested (including the FMF flag that wasn't checked). This was suggested as part of the AMDGPU specific atomic optimizer review (https://reviews.llvm.org/D51969). Differential Revision: https://reviews.llvm.org/D52087 llvm-svn: 343962	2018-10-08 10:32:33 +00:00
Craig Topper	2b3f5df73a	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163. Reviewers: spatel, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52070 llvm-svn: 342797	2018-09-22 05:53:27 +00:00
Craig Topper	8fc05ce340	[InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible. This allows the xor to be removed completely. This might help with recomitting r341674, but seems good regardless. Coincidentally fixes PR38915. Differential Revision: https://reviews.llvm.org/D51964 llvm-svn: 342163	2018-09-13 18:52:58 +00:00
Sanjay Patel	37e464876b	[InstCombine] remove checks for IsFreeToInvert() I accidentally committed this diff with rL342147 because I had applied D51964. We probably do need those checks, but D51964 has tests and more discussion/motivation, so they should be re-added with that patch. llvm-svn: 342149	2018-09-13 16:18:12 +00:00
Sanjay Patel	6f00fc3317	[InstCombine] reorder folds to reduce chance of infinite loops I don't have a test case for this, but it's motivated by the discussion in D51964, and I've added TODO comments for the better fix - move simplifications into instsimplify because that's more efficient and reduces risk of infinite loops in instcombine caused by transforms trying to do the opposite folds. In this case, we know that the transform that tries to move 'not' through min/max can be fooled by the multiple uses of a value in another min/max, so try to squash the foldSPFofSPF() patterns first. llvm-svn: 342147	2018-09-13 16:04:06 +00:00
Alina Sbirlea	116caa2920	[InstCombine] Partially revert rL341674 due to PR38897. Summary: Revert min/max changes in rL341674 dues to high compile times causing timeouts (PR38897). Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved. Working on a smaller reproducer for PR38897. Reviewers: craig.topper, spatel Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D51897 llvm-svn: 341883	2018-09-10 23:47:21 +00:00
Craig Topper	040c2b0acf	[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min. I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not. Differential Revision: https://reviews.llvm.org/D51398 llvm-svn: 341674	2018-09-07 16:19:50 +00:00
Florian Hahn	e32ff4b28a	[InstCombine] Do not fold scalar ops over select with vector condition. If OtherOpT or OtherOpF have scalar types and the condition is a vector, we would create an invalid select. Reviewers: spatel, john.brawn, mssimpso, craig.topper Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D51781 llvm-svn: 341666	2018-09-07 14:40:06 +00:00
Craig Topper	2bcb1eeee1	[InstCombine] Replace two calls to getNumUses() with !hasNUsesOrMore We were calling getNumUses to check for 1 or 2 uses. But getNumUses is linear in the number of uses. We can instead use !hasNUsesOrMore(3) which will stop the linear scan as soon as it determines there are at least 3 uses even if there are more. llvm-svn: 340939	2018-08-29 17:09:21 +00:00
Sanjay Patel	c615910be5	[InstCombine] fix formatting; NFC llvm-svn: 340790	2018-08-27 23:01:10 +00:00
David Bolvansky	43b0e25847	[InstCombine] Fold Select with binary op - FP opcodes Summary: Follow up for https://reviews.llvm.org/rL339520 and https://reviews.llvm.org/rL338300 Alive: ``` %A = fcmp oeq float %x, 0.0 %B = fadd nsz float %x, %z %C = select i1 %A, float %B, float %y => %C = select i1 %A, float %z, float %y ---------- %A = fcmp oeq float %x, 0.0 %B = fadd nsz float %x, %z %C = select %A, float %B, float %y => %C = select %A, float %z, float %y Done: 1 Optimization is correct %A = fcmp une float %x, -0.0 %B = fadd nsz float %x, %z %C = select i1 %A, float %y, float %B => %C = select i1 %A, float %y, float %z ---------- %A = fcmp une float %x, -0.0 %B = fadd nsz float %x, %z %C = select %A, float %y, float %B => %C = select %A, float %y, float %z Done: 1 Optimization is correct ``` Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50714 llvm-svn: 340538	2018-08-23 15:22:15 +00:00
Michael Berg	0b838deddc	extend binop folds for selects to include true and false binops flag intersection Summary: This change address bug 38641 Reviewers: spatel, wristow Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D50996 llvm-svn: 340222	2018-08-20 22:26:58 +00:00
Craig Topper	24674ca773	[InstCombine] Move some variable declarations into a more appropriate scope. NFC llvm-svn: 340150	2018-08-20 05:35:12 +00:00
Michael Berg	ed89d069f4	add a missed case for binary op FMF propagation under select folds llvm-svn: 339938	2018-08-16 20:59:45 +00:00
David Bolvansky	01d98cc03f	[InstCombine] Fold Select with binary op - non-commutative opcodes Summary: Basic version was merged - https://reviews.llvm.org/D49954 This adds support for FP & non-commutative opcodes Precommited tests: https://reviews.llvm.org/rL338727 Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: jfb Differential Revision: https://reviews.llvm.org/D50190 llvm-svn: 339520	2018-08-12 17:30:07 +00:00
Sanjay Patel	85e17bb195	[InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI This is a retry of rL339439 with a fix for the problem that caused the original commit to be reverted at rL339446. That problem was that the compare can be integer while the binop is FP or vice-versa, so we need to use the binop type when we ask for the identity constant. A test to guard against the problem was added at rL339453. llvm-svn: 339469	2018-08-10 20:30:35 +00:00
Sanjay Patel	c9cc86a5b3	[InstCombine] revert r339439 - rearrange code for foldSelectBinOpIdentity That was supposed to be NFC, but it exposed a logic hole somewhere that caused bots to fail. llvm-svn: 339446	2018-08-10 16:12:19 +00:00
Sanjay Patel	3b92a17526	[InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI This should make it easier to folow and to add the planned enhancements such as D50190. llvm-svn: 339439	2018-08-10 15:11:26 +00:00
David Bolvansky	6737b3a6a1	[InstCombine] Fold Select with binary op Summary: Fold %A = icmp eq i8 %x, 0 %B = xor i8 %x, %z %C = select i1 %A, i8 %B, i8 %y To %C = select i1 %A, i8 %z, i8 %y Fixes https://bugs.llvm.org/show_bug.cgi?id=38345 Proof: https://rise4fun.com/Alive/43J Reviewers: lebedev.ri, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49954 llvm-svn: 338300	2018-07-30 20:38:53 +00:00
Chen Zheng	567485a72f	[InstCombine] canonicalize abs pattern Differential Revision: https://reviews.llvm.org/D48754 llvm-svn: 338092	2018-07-27 01:49:51 +00:00
Chen Zheng	ccc8422464	[InstCombine] add more SPFofSPF folding Differential Revision: https://reviews.llvm.org/D49238 llvm-svn: 337143	2018-07-16 02:23:00 +00:00
John Brawn	e4ff0bd401	[InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need to use the type of the existing operand when creating a new operand not the type of a select operand, as the two may be different. This fixes PR37686. llvm-svn: 334019	2018-06-05 14:10:55 +00:00
Sanjay Patel	26368cd5d9	[InstCombine] narrow select to match condition operands' size This is the planned enhancement to D47163 / rL333611. We want to match cmp/select sizes because that will be recognized as min/max more easily and lead to better codegen (especially for vector types). As mentioned in D47163, this improves some of the tests that would also be folded by D46380, so we may want to adjust that patch to match the new patterns where the extend op occurs after the select. llvm-svn: 333689	2018-05-31 19:55:27 +00:00
Sanjay Patel	a003c728a5	[InstCombine] choose 1 form of abs and nabs as canonical We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819	2018-05-20 14:23:23 +00:00
Craig Topper	0198b73769	[InstCombine] Qualify a select pattern based transform to restrct to only min/max and ignore abs/nabs. llvm-svn: 332770	2018-05-18 21:21:56 +00:00
Sanjay Patel	e7b6654711	[InstCombine] refine select-of-constants to bitwise ops Add logic for the special case when a cmp+select can clearly be reduced to just a bitwise logic instruction, and remove an over-reaching chunk of general purpose bit magic. The primary goal is to remove cases where we are not improving the IR instruction count when doing these select transforms, and in all cases here that is true. In the motivating 3-way compare tests, there are further improvements because we can combine/propagate select values (not sure if that belongs in instcombine, but it's there for now). DAGCombiner has folds to turn some of these selects into bit magic, so there should be no difference in the end result in those cases. Not all constant combinations are handled there yet, however, so it is possible that some targets will see more cmov/csel codegen with this change in IR canonicalization. Ideally, we'll go further to not turn selects into multiple logic/math ops in instcombine, and we'll canonicalize to selects. But we should make sure that this step does not result in regressions first (and if it does, we should fix those in the backend). The general direction for this change was discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html Alive proofs for the new bit magic: https://rise4fun.com/Alive/XG7 Differential Revision: https://reviews.llvm.org/D46086 llvm-svn: 331486	2018-05-03 21:58:44 +00:00
Sanjay Patel	807ddee1bf	[InstCombine] clean up foldSelectICmpAnd(); NFC As discussed in D45862, we want to delete parts of this code because it can create more instructions than it removes. But we also want to preserve some folds that are winners, so tidy up what's here to make splitting the good from bad a bit easier. llvm-svn: 330841	2018-04-25 16:34:01 +00:00
Roman Lebedev	c00659328a	[InstCombine]: foldSelectICmpAndAnd(): and is commutative Summary: The fold added in D45108 did not account for the fact that the and instruction is commutative, and if the mask is a variable, the mask variable and the fold variable may be swapped. I have noticed this by accident when looking into [[ https://bugs.llvm.org/show_bug.cgi?id=6773 \| PR6773 ]] This extends/generalizes that fold, so it is handled too. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45539 llvm-svn: 330001	2018-04-13 09:57:57 +00:00
Roman Lebedev	41922f1a6d	[InstCombine] Get rid of select of bittest (PR36950 / PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 \| PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 \| PR17564 ]], D45065, D45107 https://godbolt.org/g/iAYRup Alive proof: https://rise4fun.com/Alive/uiH Testing: `ninja check-llvm` Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45108 llvm-svn: 329492	2018-04-07 10:37:24 +00:00
Sanjay Patel	93e64dd9a1	[PatternMatch] allow undef elements when matching vector FP +0.0 This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461	2018-03-25 21:16:33 +00:00
Sanjay Patel	0ce3086777	[InstCombine] canonicalize fcmp+select to fabs This is complicated by -0.0 and nan. This is based on the DAG patterns as shown in D44091. I'm hoping that we can just remove those DAG folds and always rely on IR canonicalization to handle the matching to fabs. We would still need to delete the broken code from DAGCombiner to fix PR36600: https://bugs.llvm.org/show_bug.cgi?id=36600 Differential Revision: https://reviews.llvm.org/D44550 llvm-svn: 327858	2018-03-19 15:14:30 +00:00
Craig Topper	ee99aa4dd0	[InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesMore more that terminate the traversal as soon as it can provide the answer. There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them. Differential Revision: https://reviews.llvm.org/D44398 llvm-svn: 327315	2018-03-12 18:46:05 +00:00
Sanjay Patel	1f2f5d18d3	[InstCombine] simplify min/max canonicalization; NFCI llvm-svn: 326828	2018-03-06 19:01:18 +00:00
Sanjay Patel	7ed0bc26ac	[ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTracking Most of the folds based on SelectPatternResult belong in InstSimplify rather than InstCombine, so the helper code should be available to other passes/analysis. llvm-svn: 326812	2018-03-06 16:57:55 +00:00
Craig Topper	1c19cc1745	[InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W)) if the binop is rem or div. The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe. Fixes PR36362. Differential Revision: https://reviews.llvm.org/D43276 llvm-svn: 325148	2018-02-14 18:08:33 +00:00
Sanjay Patel	e9a153f414	[InstCombine] add unsigned saturation subtraction canonicalizations This is the instcombine part of unsigned saturation canonicalization. Backend patches already commited: https://reviews.llvm.org/D37510 https://reviews.llvm.org/D37534 It converts unsigned saturated subtraction patterns to forms recognized by the backend: (a > b) ? a - b : 0 -> ((a > b) ? a : b) - b) (b < a) ? a - b : 0 -> ((a > b) ? a : b) - b) (b > a) ? 0 : a - b -> ((a > b) ? a : b) - b) (a < b) ? 0 : a - b -> ((a > b) ? a : b) - b) ((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b) ((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b) ((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b) Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D41480 llvm-svn: 324255	2018-02-05 17:53:29 +00:00
John Brawn	2867bd72c0	[InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr Three (or more) operand getelementptrs could plausibly also be handled, but handling only two-operand fits in easily with the existing BinaryOperator handling. Differential Revision: https://reviews.llvm.org/D39958 llvm-svn: 322930	2018-01-19 10:05:15 +00:00
Sanjay Patel	31b4b76f99	[InstCombine] fold min/max tree with common operand (PR35717) There is precedence for factorization transforms in instcombine for FP ops with fast-math. We also have similar logic in foldSPFofSPF(). It would take more work to add this to reassociate because that's specialized for binops, and min/max are not binops (or even single instructions). Also, I don't have evidence that larger min/max trees than this exist in real code, but if we find that's true, we might want to reorganize where/how we do this optimization. In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have: int test(int xc, int xm, int xy) { int xk; if (xc < xm) xk = xc < xy ? xc : xy; else xk = xm < xy ? xm : xy; return xk; } This patch solves that problem because we recognize more min/max patterns after rL321672 https://rise4fun.com/Alive/Qjne https://rise4fun.com/Alive/3yg Differential Revision: https://reviews.llvm.org/D41603 llvm-svn: 321998	2018-01-08 15:05:34 +00:00
Sanjay Patel	26a6fcde83	[InstCombine] relax use constraint for min/max (~a, ~b) --> ~min/max(a, b) In the minimal case, this won't remove instructions, but it still improves uses of existing values. In the motivating example from PR35834, it does remove instructions, and sets that case up to be optimized by something like D41603: https://reviews.llvm.org/D41603 llvm-svn: 321936	2018-01-06 17:34:22 +00:00
Sanjay Patel	5b6aacf2c1	[InstCombine] add folds for min(~a, b) --> ~max(a, b) Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b), the use checking and operand creation were off. We were potentially creating repeated identical instructions of existing values. This led to infinite looping after I added the extra folds. By using the simpler m_Not matcher and not creating new 'not' ops for a and b, we avoid that problem. It's possible that not using IsFreeToInvert() here is more limiting than the simpler matcher, but there are no tests for anything more exotic. It's also possible that we should relax the use checking further to handle a case like PR35834: https://bugs.llvm.org/show_bug.cgi?id=35834 ...but we can make that a follow-up if it is needed. llvm-svn: 321882	2018-01-05 19:01:17 +00:00
Craig Topper	f7b86728fa	[InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition. Summary: This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select. As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero? Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39999 llvm-svn: 318267	2017-11-15 05:23:02 +00:00
Matthew Simpson	b6915fbfa2	[InstCombine] Simplify selects that test cmpxchg instructions If a select instruction tests the returned flag of a cmpxchg instruction and selects between the returned value of the cmpxchg instruction and its compare operand, the result of the select will always be equal to its false value. Differential Revision: https://reviews.llvm.org/D39383 llvm-svn: 316994	2017-10-31 12:34:02 +00:00
Eugene Zelenko	7f0f9bc5ab	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316503	2017-10-24 21:24:53 +00:00

1 2 3 4 5 ...

438 Commits