+/-0 is obviously foldable. Other non-special, non-subnormal
values are also probably OK. For denormal values, check
the calling function's denormal mode. For now, don't fold
denormals to the input for IEEE mode because as far as I know
the LangRef is still pretending LLVM's float isn't IEEE.
Also folds undef to 0, although NaN may make more sense. Skips
folding NaNs and infinities, although it should be OK to fold those
in a future change.
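A rough model of the decision described above, as a sketch only. The
mode names and the helper functions are hypothetical, not LLVM's
spellings, and the flush-to-zero results for the non-IEEE modes are an
assumption about what "check the denormal mode" leads to. The undef
case is not modeled.

```python
import math
import sys

# Hypothetical mode names for this sketch; they are not LLVM's spellings.
IEEE = "ieee"
PRESERVE_SIGN = "preserve-sign"
POSITIVE_ZERO = "positive-zero"


def is_subnormal(x: float) -> bool:
    """True for nonzero finite doubles below the smallest normal value."""
    return x != 0.0 and math.isfinite(x) and abs(x) < sys.float_info.min


def fold_constant(x: float, denormal_mode: str):
    """Return the folded constant, or None when the fold is skipped."""
    if math.isnan(x) or math.isinf(x):
        return None  # NaNs and infinities are left alone for now
    if not is_subnormal(x):
        return x  # +/-0 and normal values fold to the input
    if denormal_mode == IEEE:
        return None  # no fold to the input in IEEE mode, per the text
    if denormal_mode == PRESERVE_SIGN:
        return math.copysign(0.0, x)  # assumption: flush to a signed zero
    if denormal_mode == POSITIVE_ZERO:
        return 0.0  # assumption: flush to +0
    return None  # unknown mode: stay conservative
```

For example, fold_constant(5e-324, IEEE) returns None (no fold), while
fold_constant(1.5, IEEE) returns the input unchanged.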
https://alive2.llvm.org/ce/z/oShzr3
This was noted as a missing fold in D134876 (with additional
examples based on issue #58046).
I'm assuming that fmul with a zero operand is rare enough
that the use of ValueTracking will not noticeably increase
compile-time.
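To illustrate why value analysis matters for this fold, here is a small
sketch in plain Python floats (IEEE doubles). It assumes the fold needs
X to be known non-NaN and non-infinite with a known sign; the function
names are made up for this example.

```python
import math


def fold_fmul_zero(x_is_negative: bool, zero: float) -> float:
    """Zero that x * zero produces when x is finite and not NaN."""
    zero_is_negative = math.copysign(1.0, zero) < 0
    # The sign of a product is the XOR of the operand signs.
    return -0.0 if x_is_negative != zero_is_negative else 0.0


def same_zero(a: float, b: float) -> bool:
    """Compare two zeros including their sign bit."""
    return a == b == 0.0 and math.copysign(1.0, a) == math.copysign(1.0, b)


def fold_matches_multiply(x: float, zero: float) -> bool:
    """Check the folded zero against the real product for finite x."""
    x_is_negative = math.copysign(1.0, x) < 0
    return same_zero(x * zero, fold_fmul_zero(x_is_negative, zero))
```

Without the NaN/infinity restriction the result is not a zero at all:
both float("nan") * 0.0 and float("inf") * 0.0 produce NaN.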
This adjusts a PowerPC codegen test that was added with D88388
because it would get folded away and no longer provide coverage
for the bug fix.
The constant is already commuted for an fmul opcode,
but this code can be called more directly for fma,
so we have to swap for that caller. There are tests
in InstSimplify and InstCombine to verify that this
works as expected.
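The swap itself is simple; a toy version follows, with symbolic
operands modeled as strings and constants as floats. The helper name
is hypothetical and this is not the actual LLVM code.

```python
def zero_on_rhs(op0, op1):
    """Move a floating-point zero constant to the right-hand side."""
    if isinstance(op0, float) and op0 == 0.0 and not isinstance(op1, float):
        return op1, op0  # the fma-style caller passed the constant first
    return op0, op1  # already in the form an fmul opcode would have
```

With this, zero_on_rhs(0.0, "x") and zero_on_rhs("x", 0.0) both yield
the pair ("x", 0.0), so later matching only has to look at the second
operand.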
This is an extension of the existing min/max+select fold (which already
has a very large number of variations) to allow a vector shuffle because
that's what we have in the motivating example from issue #42100.
A couple of Alive2 checks of variants (I don't know how to generalize
these in Alive):
https://alive2.llvm.org/ce/z/jUFAqT
And verify the PR42100 test:
https://alive2.llvm.org/ce/z/3EcASf
It's possible there is some generalization of the fold or a
VectorCombine/SLP answer for the motivating test, but I haven't found a
better/smaller solution yet.
We can also add even more variants here as follow-up patches. For example,
we can have shuffle followed by min/max; we also don't have this
canonicalization or the reverse:
https://alive2.llvm.org/ce/z/StHD9f
Differential Revision: https://reviews.llvm.org/D134879
The phase ordering test is the almost unoptimized IR for the example
in issue #42100; it was passed through -mem2reg to reduce obvious
excessive load/store and other noise.
See D134879.
The test shows that we would fail to consistently fold the
instruction based on the max value operand. This is also
the root cause for issue #57986, but I'll add an instcombine
test + assert for that exact problem in another commit.
This extends e5d15e1162 to handle the inverse predicates
(there's probably a more elegant way to specify the preds).
These patterns correspond to the existing simplify:
max (min X, Y), X --> X
...and extra preds for (non)equality.
The tests cycle through all 10 icmp preds for each min/max
variant with 4 swapped operand patterns each (and the min/max
operands are commuted in every other test within those).
Some Alive2 examples to verify:
https://alive2.llvm.org/ce/z/XMvEKQ
https://alive2.llvm.org/ce/z/QpMChr
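As a quick sanity check outside of Alive2, the identities can be
brute-forced over a small signed range. The select shapes below are my
reading of "inverse predicates" and are an assumption; only the
max (min X, Y), X form is quoted from the text above.

```python
from itertools import product


def select(cond, true_val, false_val):
    """Scalar stand-in for an IR select."""
    return true_val if cond else false_val


def inverse_pred_folds_hold(lo: int = -4, hi: int = 4) -> bool:
    """Exhaustively check the identities on a small signed range."""
    for x, y in product(range(lo, hi + 1), repeat=2):
        checks = (
            max(min(x, y), x) == x,  # the existing simplify
            min(max(x, y), x) == x,  # its min/max dual
            select(x < y, x, max(x, y)) == x,
            select(x <= y, x, max(x, y)) == x,
            select(x > y, x, min(x, y)) == x,
            select(x >= y, x, min(x, y)) == x,
            select(x != y, x, max(x, y)) == x,  # non-equality variants
            select(x != y, x, min(x, y)) == x,
        )
        if not all(checks):
            return False
    return True
```

This only covers signed scalars on a tiny range, so it complements the
Alive2 proofs rather than replacing them.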
This is similar to the existing simplify:
max (max X, Y), X --> max X, Y
...but the select condition can be one of
several predicates as shown in the tests.
The tests cycle through all 10 icmp preds for
each min/max variant with 4 swapped operand
patterns each (and the min/max operands are
commuted in every other test within those).
Some Alive2 examples to verify:
https://alive2.llvm.org/ce/z/lCAQm4
https://alive2.llvm.org/ce/z/kzxVXC
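A brute-force sketch of the same idea on small signed integers. The
predicate tables are an assumed subset chosen for illustration (the
tests cover all 10 predicates); the names are hypothetical.

```python
import operator
from itertools import product

# Assumed predicates under which "pred(X, Y) ? X : max(X, Y)" is max(X, Y).
PREDS_FOR_MAX = {"sgt": operator.gt, "sge": operator.ge, "eq": operator.eq}
# Assumed predicates under which "pred(X, Y) ? X : min(X, Y)" is min(X, Y).
PREDS_FOR_MIN = {"slt": operator.lt, "sle": operator.le, "eq": operator.eq}


def failing_cases(lo: int = -4, hi: int = 4):
    """Return every (name, x, y) where a claimed identity does not hold."""
    bad = []
    for x, y in product(range(lo, hi + 1), repeat=2):
        if max(max(x, y), x) != max(x, y):  # the existing simplify
            bad.append(("max-of-max", x, y))
        for name, pred in PREDS_FOR_MAX.items():
            if (x if pred(x, y) else max(x, y)) != max(x, y):
                bad.append((name, x, y))
        for name, pred in PREDS_FOR_MIN.items():
            if (x if pred(x, y) else min(x, y)) != min(x, y):
                bad.append((name, x, y))
    return bad
```

An empty list from failing_cases() means every checked pattern folds to
the plain min/max on that range.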
We can handle vectors inside simplifyWithOpReplaced(), as long as
cross-lane operations are excluded. The equality can hold (or not
hold) for each vector lane independently, so we shouldn't use the
replacement value from other lanes.
I believe the only operations relevant here are shufflevector (where
all previous bugs were seen) and calls (which might use shuffle-like
intrinsics and would require more careful classification).
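A toy model of the lane problem, using Python lists as vectors. It
checks the shape select (X == Y), op(X), op(Y) --> op(Y) on concrete
values; reverse() stands in for a shufflevector, and all names here are
hypothetical rather than taken from the implementation.

```python
def lanewise_select(cond, true_vec, false_vec):
    """Pick each lane from true_vec or false_vec based on cond."""
    return [t if c else f for c, t, f in zip(cond, true_vec, false_vec)]


def add_one(vec):
    """Lane-wise operation: lane i only depends on input lane i."""
    return [v + 1 for v in vec]


def reverse(vec):
    """Cross-lane operation: lane i depends on a different input lane."""
    return vec[::-1]


def fold_is_sound(op, x, y):
    """Check select(X == Y, op(X), op(Y)) == op(Y) on concrete vectors."""
    cond = [a == b for a, b in zip(x, y)]
    return lanewise_select(cond, op(x), op(y)) == op(y)
```

With x = [1, 2] and y = [1, 3], lane 0 compares equal but lane 1 does
not. fold_is_sound(add_one, x, y) holds, while
fold_is_sound(reverse, x, y) does not, because the selected lane 0 of
the shuffle reads lane 1 of the input, where X and Y differ.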
Differential Revision: https://reviews.llvm.org/D134348