InstSimplify currently checks whether the instruction simplifies
back to itself, and returns undef in that case. Generally, this
should only occur in unreachable code.
However, this was also done for the simplifyInstructionWithOperands()
API. In that case, the instruction only serves as a template that
provides the opcode and other non-operand data. In this case,
simplifying back to the same "instruction" may be expected. This
caused PR58401 in conjunction with D134954.
As such, move this check into simplifyInstruction() only. The only
other caller of simplifyInstructionWithOperands() also handles the
self-simplification case explicitly.
https://alive2.llvm.org/ce/z/oShzr3
This was noted as a missing fold in D134876 (with additional
examples based on issue #58046).
I'm assuming that fmul with a zero operand is rare enough
that the use of ValueTracking will not noticeably increase
compile-time.
This adjusts a PowerPC codegen test that was added with D88388
because it would get folded away and no longer provide coverage
for the bug fix.
The constant is already commuted for an fmul opcode,
but this code can be called more directly for fma,
so we have to swap for that caller. There are tests
in InstSimplify and InstCombine to verify that this
works as expected.
This is an extension of the existing min/max+select fold (which already
has a very large number of variations) to allow a vector shuffle because
that's what we have in the motivating example from issue #42100.
A couple of Alive2 checks of variants (I don't know how to generalize
these in Alive):
https://alive2.llvm.org/ce/z/jUFAqT
And verify the PR42100 test:
https://alive2.llvm.org/ce/z/3EcASf
It's possible there is some generalization of the fold or a
VectorCombine/SLP answer for the motivating test, but I haven't found a
better/smaller solution yet.
We can also add even more variants here as follow-up patches. For example,
we can have shuffle followed by min/max; we also don't have this
canonicalization or the reverse:
https://alive2.llvm.org/ce/z/StHD9f
Differential Revision: https://reviews.llvm.org/D134879
The test shows that we would fail to consistently fold the
instruction based on the max value operand. This is also
the root cause for issue #57986, but I'll add an instcombine
test + assert for that exact problem in another commit.
It's still possible that there's a simpler way to specify
the conditions needed for this set of folds, but "getStrictPred"
converts >= to > for example, so there's no need to explicitly
check that.
This extends e5d15e1162 to handle the inverse predicates
(there's probably a more elegant way to specify the preds).
These patterns correspond to the existing simplify:
max (min X, Y), X --> X
...and extra preds for (non)equality.
The tests cycle through all 10 icmp preds for each min/max
variant with 4 swapped operand patterns each (and the min/max
operands are commuted in every other test within those).
Some Alive2 examples to verify:
https://alive2.llvm.org/ce/z/XMvEKQhttps://alive2.llvm.org/ce/z/QpMChr
This is similar to the existing simplify:
max (max X, Y), X --> max X, Y
...but the select condition can be one of
several predicates as shown in the tests.
The tests cycle through all 10 icmp preds for
each min/max variant with 4 swapped operand
patterns each (and the min/max operands are
commuted in every other test within those).
Some Alive2 examples to verify:
https://alive2.llvm.org/ce/z/lCAQm4https://alive2.llvm.org/ce/z/kzxVXC
We can handle vectors inside simplifyWithOpReplaced(), as long as
cross-lane operations are excluded. The equality can hold (or not
hold) for each vector lane independently, so we shouldn't use the
replacement value from other lanes.
I believe the only operations relevant here are shufflevector (where
all previous bugs were seen) and calls (which might use shuffle-like
intrinsics and would require more careful classification).
Differential Revision: https://reviews.llvm.org/D134348
We already support SGE, so the same logic should hold for SLE with
the LHS and RHS swapped.
I didn't see this in the wild. Just happened to walk past this code
and thought it was odd that it was asymmetric in what condition
codes it handled.
Reviewed By: spatel, reames
Differential Revision: https://reviews.llvm.org/D131805
My most recent change for D131607 had a formatting error that I didn't
notice until after I committed it. Let me fix it now so changes to this
file will be back-to-back from me.
Another ticket split out of D107285, this extends the optimization
of 0.0 - -X to just X when using constrained intrinsics and the
optimization is allowed.
If the negation of X is done with fsub then the match fails because of
the lack of IR Matcher support for constrained intrinsics.
While I'm here, remove some TODO notices since the work is no longer
planned.
Differential Revision: https://reviews.llvm.org/D131607
We get a couple of improvements from recognizing swapped
operand patterns that were not handled by the replicated
code.
This should also enable simplifying larger patterns as
seen in issue #56653 and issue #56654, but that requires
enhancements to isImpliedCondition() itself.
issue #56775
I rearranged the Thumb2 codegen test to avoid simplifying the chain
of rounding instructions. I'm assuming the intent of the test is
to verify lowering of each of those intrinsics.
We already call the more general isImpliedCondition() (which calls
isImpliedTrueByMatchingCmp() internally) from simplifyAndInst()
and simplifyOrInst().
There was a difference visible with this change on a vector test
before a925bef70c, but I can't find any gaps now.
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
Handle denormal constant input for fcmp instructions based on the
denormal handling mode.
Reviewed By: spatel, dcandler
Differential Revision: https://reviews.llvm.org/D128647
This allows all constant folding to happen through a single
function, without requiring special handling for loads at each
call-site.
This may not be NFC because some callers currently don't do that
special handling.
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).
This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.
Differential Revision: https://reviews.llvm.org/D127976
Depending on the environment, a floating point instruction should
treat denormal inputs as zero, and/or flush a denormal output to zero.
Denormals are not currently accounted for when an instruction gets
folded to a constant, which can lead to differences in output between
a folded and a unfolded instruction when running on the target. The
denormal handling mode can be set by the function level attribute
denormal-fp-math, which this patch uses to determine whether any
denormal inputs to or outputs from folding should be zero, and that
the sign is set appropriately.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D116952