Now that SimpleLoopUnswitch and other transforms no longer introduce
branch on poison, enable the -branch-on-poison-as-ub option by
default. The practical impact of this is mostly better flag
preservation in SCEV, and some freeze instructions no longer being
necessary.
Differential Revision: https://reviews.llvm.org/D125299
VP intrinsics show UB if the %evl parameter is out of bounds - they must
not carry the speculatable attribute. The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.
This patch
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
expansion know whether the VP intrinsic replacement will be
speculatable. VP expansion may only discard %evl where this is the
case.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125296
Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D126018
A load with !dereferenceable or !dereferenceable_or_null metadata
must return a well-defined (non-undef/poison) value. Effectively
they imply !noundef. This is the same as we do for the
dereferenceable(N) attribute.
This should fix https://github.com/llvm/llvm-project/issues/55672,
or at least the specific case discussed there.
Differential Revision: https://reviews.llvm.org/D126296
Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D125332
isImpliedCondition() currently handles and/or on the LHS, but not
on the RHS, resulting in asymmetric behavior. This patch adds two
new implication rules:
* LHS ==> (RHS1 || RHS2) if LHS ==> RHS1 or LHS ==> RHS2
* LHS ==> !(RHS1 && RHS2) if LHS ==> !RHS1 or LHS ==> !RHS2
Differential Revision: https://reviews.llvm.org/D125551
I fixed some poison-safety violations on related patterns in InstCombine
and noticed that we missed adding nsw/nuw on them, so this adds clauses
to the underlying analysis for that.
We need the undef input restriction to make this safe according to Alive2:
https://alive2.llvm.org/ce/z/48g9K8
Differential Revision: https://reviews.llvm.org/D125500
This extends haveNoCommonBitsSet() to two additional cases, allowing
the following folds:
* `A + (B & ~A)` --> `A | (B & ~A)`
(https://alive2.llvm.org/ce/z/crxxhN)
* `A + ((A & B) ^ B)` --> `A | ((A & B) ^ B)`
(https://alive2.llvm.org/ce/z/A_wsH_)
These should further fold to just `A | B`, though this currently
only works in the first case.
The reason why the second fold is necessary is that we consider
this to be the canonical form if B is a constant. (I did check
whether we can change that, but it looks like a number of folds
depend on the current canonicalization, so I ended up adding both
patterns here.)
Differential Revision: https://reviews.llvm.org/D124763
This reverts commit e810d55809.
The commit was not taken into account the fact that strduped string could be
modified. Checking if such modification happens would make the function very
costly, without a test case in mind it's not worth the effort.
Two interesting ommissions:
* When reordering in either direction, reordering two calls which both
contain inf-loops is illegal. This one is possibly a change in behavior
for certain callers (e.g. fixes a latent bug.)
* When moving down, control dependence must be respected by checking the
inverse of isSafeToSpeculativeExecute. Current callers all seem to
handle this case - though admitted, I did not do an exhaustive audit.
Most seem to be only interested in moving upwards within a block. This
is mostly a case of future proofing an API so that it implements what
the comments says, not just what current callers need.
Noticed via inspection. I don't have a test case.
Currently some optimizations are disabled because llvm::CannotBeNegativeZero()
does not know how to deal with the constrained intrinsics. This patch fixes
that by extending the existing implementation.
Differential Revision: https://reviews.llvm.org/D121483
We still need the code after stripAndAccumulateConstantOffsets() since
it doesn't handle GEPs of scalable types and non-constant but identical
indexes.
Differential Revision: https://reviews.llvm.org/D120523
This is a revert of cfcc42bdc. The analysis is wrong as shown by
the minimal tests for instcombine:
https://alive2.llvm.org/ce/z/y9Dp8A
There may be a way to salvage some of the other tests,
but that can be done as follow-ups. This avoids a miscompile
and fixes#54311.
This is the same special logic we apply for SPF signed clamps
when computing the number of sign bits, just for intrinsics.
This just uses the same logic as the select case, but there's
multiple directions this could be improved in: We could also use
the num sign bits from the clamped value, we could do this during
constant range calculation, and there's probably unsigned analogues
for the constant range case at least.
This one tries to fix:
https://github.com/llvm/llvm-project/issues/53357.
Simply, this one would check (x & y) and ~(x | y) in
haveNoCommonBitsSet. Since they shouldn't have common bits (we could
traverse the case by enumerating), and we could convert this one to (x &
y) | ~(x | y) . Then the compiler could handle it in
InstCombineAndOrXor.
Further more, since ((x & y) + (~x & ~y)) would be converted to ((x & y)
+ ~(x | y)), this patch would fix it too.
https://alive2.llvm.org/ce/z/qsKzRS
Reviewed By: spatel, xbolva00, RKSimon, lebedev.ri
Differential Revision: https://reviews.llvm.org/D118094
Fixes a MemCpyOpt miscompile with opaque pointers.
This function can be further cleaned up, but let's just fix the miscompile first.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D119652
D108992 added KnownBits handling for 'Quadratic Reciprocity' self-multiplication patterns (bit[1] == 0), which can be used for non-undef values (poison is OK).
This patch adds noundef selfmultiply handling to value tracking so demanded bits patterns can make use of it.
Differential Revision: https://reviews.llvm.org/D117995
The behavior in Analysis (knownbits) implements poison semantics already,
and we expect the transforms (for example, in instcombine) derived from
those semantics, so this patch changes the LangRef and remaining code to
be consistent. This is one more step in removing "undef" from LLVM.
Without this, I think https://github.com/llvm/llvm-project/issues/53330
has a legitimate complaint because that report wants to allow subsequent
code to mask off bits, and that is allowed with undef values. The clang
builtins are not actually documented anywhere AFAICT, but we might want
to add that to remove more uncertainty.
Differential Revision: https://reviews.llvm.org/D117912
This was noted in post-commit review for D116322 / 0edf99950e .
I am not seeing how to expose the bug in a test though because
we don't pass an assumption cache into this analysis from there.
This function returns an upper bound on the number of bits needed
to represent the signed value. Use "Max" to match similar functions
in KnownBits like countMaxActiveBits.
Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old
name around to keep this patch size down. Will do a bulk rename as
follow up.
Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.
Reviewed By: lebedev.ri, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D116522
We should not lose analysis precision if an 'add' has both no-wrap
flags (nsw and nuw) compared to just one or the other.
This patch is modeled on a similar construct that was added with
D59386.
I don't think it is possible to expose a problem with an unsigned
compare because of the way this was coded (nuw is handled first).
InstCombine has an assert that fires with the example from:
https://github.com/llvm/llvm-project/issues/52884
...because it was expecting InstSimplify to handle this kind of
pattern with an smax.
Fixes#52884
Differential Revision: https://reviews.llvm.org/D116322
These are deprecated and should be replaced with getAlign().
Some of these asserts don't do anything because Load/Store/AllocaInst never have a 0 align value.
These flags are documented as generating poison values for particular input values. As such, we should really be consistent about their handling with how we handle nsw/nuw/exact/inbounds.
Differential Revision: https://reviews.llvm.org/D115460
This introduces a new ComputeMinSignedBits method for ValueTracking that
returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents
the minimum bit size for the value as a signed integer. Similar to the
existing APInt::getMinSignedBits method, this can make some of the
reasoning around ComputeSignBits more natural.
See https://reviews.llvm.org/D112298
The maximal value of a half is 0x7bff, which is 65504 when converted to
an integer. This patch teaches that to computeConstantRange to compute a
constant range with the correct maximum value.
https://alive2.llvm.org/ce/z/BV_Spbhttps://alive2.llvm.org/ce/z/Nwuqvb
The maximum value for a float converted in the same way is 3.4e38, which
requires 129bits of data. I have not added that here as integer types so
larger are rare, compared to integers types larger than 17 bits require
for half floats.
The MVE tests change because instsimplify happens to be run as a part of
the backend, where it doesn't tend to for other backends.
Differential Revision: https://reviews.llvm.org/D112694
This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to operator to handle constant expressions as well.
This is mostly code movement, but I did go ahead and add the inrange constexpr gep case. This had been discussed previously, but apparently never followed up o.
The LangRef clearly states that branching on a undef or poison value is immediate undefined behavior, but historically, we have not been consistent about implementing that interpretation in the optimizer. Historically, we used (in some cases) a more relaxed model which essentially looked for provable UB along both paths which was control dependent on the condition. However, we've never been 100% consistent here. For instance SCEV uses the strong model for increments which form AddRecs (and only addrecs).
At the moment, the last big blocker for finally making this switch is enabling the fix landed in D106041. Loop unswitching (in it's classic form) is incorrect as it creates many "branch on poisons" when unswitching conditions originally unreachable within the loop.
This change adds a flag to value tracking which allows to easily test the optimization potential of treating branch on poison as immediate UB. It's intended to help ease work on getting us finally through this transition and avoid multiple independent rediscovers of the same issues.
Differential Revision: https://reviews.llvm.org/D112026
As discussed in D112016, our current requirement of speculatability
for ephemeral is overly strict: What we really care about is that
the instruction will be DCEd once the assume is dropped. For that
it is sufficient that the instruction is side-effect free and not
a terminator.
In particular, this allows non-dereferenceable loads to be ephemeral
values.
Differential Revision: https://reviews.llvm.org/D112179
This is a follow on for D111675 which implements the gep case. I'd originally left it out because I was hoping to actually implement the inrange todo, but after a bit of staring at the code, decided to leave it as is since it doesn't effect this use case (i.e. instcombine requires the op to freeze to be an instruction).
Differential Revision: https://reviews.llvm.org/D111691
The newly introduced API for checking whether poison comes solely from flags which can be dropped was out of sync. This was noticed by a reviewer post commit.
For the moment, disable the floating point flags. In a follow up change, I plan to add support in dropPoisonGeneratingFlags, but that deserves to be a change of it's own.
If we have an instruction which produces poison only when flags are specified on the instruction, then we know that freezing the operands and dropping flags is equivalent to freezing the result. If we know those flags don't result in any undefined behavior being executed, then there's no point in preserving the flags as we gain no knowledge by having them.
This patch extends the existing propagation logic which sinks freeze to single potential non-poison operands to allow dropping of flags when we know the freeze is the sole use of the instruction with poison flags.
The main value is that we tend to sink freezes towards the phi in IV cycles where the incoming value to the phi is the freeze of an IV increment. This will in turn (in a future patch), let us fold the freeze through the phi into the loop preheader. Motivated by eliminating need for CanonicalizeFreezeInLoops for the clearly profitable cases from onephi.ll test case in the test directory.
Differential Revision: https://reviews.llvm.org/D111675
This factors out utilities for scanning a bounded block of instructions since we have this code repeated in a bunch of places. The change to InlineFunction isn't strictly NFC as the limit mechanism there didn't handle debug instructions correctly.