I guess this case hasn't come up thus far, and i'm not sure if it can
really happen for the existing usages, thus no test in *this* commit.
But, the following commit adds test coverage,
there we'd expirience a crash without this fix.
Currently, InsertNoopCastOfTo() would implicitly insert that cast,
but now that we have SCEVPtrToIntExpr, i'm hoping we could stop
InsertNoopCastOfTo() from doing that. But first all users must be fixed.
These intrinsics, not the icmp+select are the canonical form nowadays,
so we might as well directly emit them.
This should not cause any regressions, but if it does,
then then they would needed to be fixed regardless.
Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`,
but that is a pessimization, not a correctness issue.
Additionally, the non-intrinsic form has issues with undef,
see https://reviews.llvm.org/D88287#2587863
This patch updates LV to generate the runtime checks just after cost
modeling, to allow a more precise estimate of the actual cost of the
checks. This information will be used in future patches to generate
larger runtime checks in cases where the checks only make up a small
fraction of the expected scalar loop execution time.
The runtime checks are created up-front in a temporary block to allow better
estimating the cost and un-linked from the existing IR. After deciding to
vectorize, the checks are moved backed. If deciding not to vectorize, the
temporary block is completely removed.
This patch is similar in spirit to D71053, but explores a different
direction: instead of delaying the decision on whether to vectorize in
the presence of runtime checks it instead optimistically creates the
runtime checks early and discards them later if decided to not
vectorize. This has the advantage that the cost-modeling decisions
can be kept together and can be done up-front and thus preserving the
general code structure. I think delaying (part) of the decision to
vectorize would also make the VPlan migration a bit harder.
One potential drawback of this patch is that we speculatively
generate IR which we might have to clean up later. However it seems like
the code required to do so is quite manageable.
Reviewed By: lebedev.ri, ebrevnov
Differential Revision: https://reviews.llvm.org/D75980
This patch changes costAndCollectOperands to use InstructionCost for
accumulated cost values.
isHighCostExpansion will return true if the cost has exceeded the budget.
Reviewed By: CarolineConcatto, ctetreau
Differential Revision: https://reviews.llvm.org/D92238
When LSR converts a branch on the pre-inc IV into a branch on the
post-inc IV, the nowrap flags on the addition may no longer be valid.
Previously, a poison result of the addition might have been ignored,
in which case the program was well defined. After branching on the
post-inc IV, we might be branching on poison, which is undefined behavior.
Fix this by discarding nowrap flags which are not present on the SCEV
expression. Nowrap flags on the SCEV expression are proven by SCEV
to always hold, independently of how the expression will be used.
This is essentially the same fix we applied to IndVars LFTR, which
also performs this kind of pre-inc to post-inc conversion.
I believe a similar problem can also exist for getelementptr inbounds,
but I was not able to come up with a problematic test case. The
inbounds case would have to be addressed in a differently anyway
(as SCEV does not track this property).
Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.
Differential Revision: https://reviews.llvm.org/D95286
Some older code - and code copied from older code - still directly tested against the singelton result of SE::getCouldNotCompute. Using the isa<SCEVCouldNotCompute> form is both shorter, and more readable.
This reverts the revert commit 408c4408fa.
This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.
Original message:
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.
This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.
This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.
This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.
This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
Reviewed By: dmgreen, RKSimon
Differential Revision: https://reviews.llvm.org/D90070
And use it to model LLVM IR's `ptrtoint` cast.
This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)
As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
to not loose pointer-ness of an expression in more cases
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.
But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89456
Use isKnownXY comparators when one of the operands can be with
scalable vectors or getFixedSize() for all the other cases.
This patch also does bug fixes for getPrimitiveSizeInBits by using
getFixedSize() near the places with the TypeSize comparison.
Differential Revision: https://reviews.llvm.org/D89703
The main tricky thing here is forward-declaring the enum:
we have to specify it's underlying data type.
In particular, this avoids the danger of switching over the SCEVTypes,
but actually switching over an integer, and not being notified
when some case is not handled.
I have updated most of such switches to be exaustive and not have
a default case, where it's pretty obvious to be the intent,
however not all of them.
If we switch over an enum, compiler can easily issue a diagnostic
if some case is not handled. However with an if cascade that isn't so.
Experimental evidence suggests new behavior to be superior.
All existing SCEV cast types operate on integers.
D89456 will add SCEVPtrToIntExpr cast expression type.
I believe this is best for consistency.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89455
Currently SCEVExpander creates inttoptr for non-integral pointers if the
base is a null constant for example. This results in invalid IR.
This patch changes InsertNoopCastOfTo to emit a GEP & bitcast to convert
to a non-integral pointer. First, a GEP of i8* null is generated and the
integral value is used as index. The GEP is then bitcasted to the target
type.
This was exposed by D71539.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87827
As code size is the only thing we care about at minsize, query the
cost of materialising immediates when calculating the cost of a SCEV
expansion. We also modify the CostKind to TCK_CodeSize for minsize,
instead of RecipThroughput.
Differential Revision: https://reviews.llvm.org/D76434
To enable the cost of constants, the helper function has been
reorganised:
- A struct has been introduced to hold SCEV operand information so
that we know the user of the operand, as well as the operand index.
The Worklist now uses instead instead of a bare SCEV.
- The costing of each SCEV, and collection of its operands, is now
performed in a helper function.
Differential Revision: https://reviews.llvm.org/D86050
Recommit the patch after fixing an issue reported caused by the fact
that re-used values are also added to InsertedValues.
Additional tests have been added in 88818491b9
This reverts the revert commit 38884641f2.
SCEVExpander already tracks which instructions have been inserted n
InsertedValues/InsertedPostIncValues. This patch adds an additional
vector to collect the instructions in insertion order. This can then be
used to remove exactly the instructions inserted by the expander.
This replaces ExpandedValuesCleaner, which in some cases might remove
values not inserted by the expander (e.g. if a value was dead before
insertion and is then used during expansion).
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D84327
Currently the SCEVExpander tries to re-use existing casts, even if they
are not exactly at the insertion point it was asked to create the cast.
To do so in some case, it creates a new cast at the insertion point and
updates all users to use the new cast.
This behavior is problematic, because it changes the IR outside of the
instructions created during the expansion. Therefore we cannot
completely undo all changes made during expansion.
This re-use should be only an extra optimization, so only using the new
cast in the expanded instructions should not be a correctness issue.
There are many cases equivalent instructions are created during
expansion.
This patch also adjusts findInsertPointAfter to skip instructions
inserted during expansion. This enables re-using existing casts without
the renaming any uses, by picking a better insertion point.
Reviewed By: efriedma, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84399
formLCSSAForInstructions is used by SCEVExpander, which tracks all
inserted instructions including LCSSA phis using asserting value
handles. This means cleanup needs to happen in the caller.
Extend formLCSSAForInstructions to take an optional pointer to a
vector. If this argument is non-nullptr, instead of directly deleting
the phis, add them to the vector, so the caller can process them.
This should address various PPC buildbot failures, including
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/40567
Use IRBuilder instead PHINode::Create. This should not impact the
generated code, but IRBuilder provides a way to register callbacks for
inserted instructions, which is convenient for some users.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85037
querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`,
because incomplete phis may be simplified to same value before get SCEV expression.
Reviewed By: lebedev.ri, mkazantsev
Differential Revision: https://reviews.llvm.org/D77560
This reverts the revert commit dc28675768.
It includes a fix for Polly, which uses SCEVExpander on IR that is not
in LCSSA form. Set PreserveLCSSA = false in that case, to ensure we do
not introduce LCSSA phis where there were none before.
This reverts commit 99166fd4fb, because it
breaks the polly builders.
polly/test/Isl/CodeGen/invariant_load_escaping_second_scop.ll fails
because a apparently unnecessary LCSSA phi node is introduced.
Make the bots green again, while I take a closer look.
This patch teaches SCEVExpander to directly preserve LCSSA.
As it is currently, SCEV does not look through PHI nodes in loops,
as it might break LCSSA form. Once SCEVExpander can preserve
LCSSA form, it should be safe for SCEV to look through PHIs.
To preserve LCSSA form, this patch uses formLCSSAForInstructions
on operands of newly created instructions, if the definition is inside
a different loop than the new instruction.
The final value we return from expandCodeFor may also need LCSSA
phis, depending on the insert point. As no user for it exists there yet,
create a temporary instruction at the insert point, which can be passed
to formLCSSAForInstructions. This temporary instruction is removed
after LCSSA construction.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D71538
Currently, getCastInstrCost has limited information about the cast it's
rating, often just the opcode and types. Sometimes there is a context
instruction as well, but it isn't trustworthy: for instance, when the
vectorizer is rating a plan, it calls getCastInstrCost with the old
instructions when, in fact, it's trying to evaluate the cost of the
instruction post-vectorization. Thus, the current system can get the
cost of certain casts incorrect as the correct cost can vary greatly
based on the context in which it's used.
For example, if the vectorizer queries getCastInstrCost to evaluate the
cost of a sext(load) with tail predication enabled, getCastInstrCost
will think it's free most of the time, but it's not always free. On ARM
MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar
situations can come up with how masked loads can be extended when being
split.
To fix that, this path adds a new parameter to getCastInstrCost to give
it a hint about the context of the cast. It adds a CastContextHint enum
which contains the type of the load/store being created by the
vectorizer - one for each of the types it can produce.
Original patch by Pierre van Houtryve
Differential Revision: https://reviews.llvm.org/D79162
Currently there are plenty of instructions that SCEVExpander creates but
does not track as created. IRBuilder allows specifying a callback
whenever an instruction is inserted. Use this to call
rememberInstruction automatically for each created instruction.
There are still a few rememberInstruction calls remaining, because in
some cases Inst::Create functions are used to construct instructions.
Suggested by @lebedev.ri in D75980.
Reviewers: mkazantsev, reames, sanjoy.google, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D84326
The block front may be a PHI node, inserting a cast instructions like
BitCast, PtrToInt, IntToPtr among PHIs is not right.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D80975
Summary:
In SCEVExpander FactorOutConstant(), when GEP indexing into/over scalable vector,
it is legal for the 'Factor' in a MulExpr to be the size of a scalable vector
instead of a compile-time constant.
Current upstream crash with the test attached.
Reviewers: efriedma, sdesmalen, sanjoy.google, mkazantsev
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80973
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.
The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537