getModRefInfo() queries currently track whether the result is a
MustAlias on a best-effort basis. The only user of this functionality
is the optimized memory access type in MemorySSA -- which in turn
has no users. Given that this functionality has not found a user
since it was introduced five years ago (in D38862), I think we
should drop it again.
The context is that I'm working to separate FunctionModRefBehavior
to track mod/ref for different location kinds (like argmem or
inaccessiblemem) separately, and the fact that ModRefInfo also has
an unrelated Must flag makes this quite awkward, especially as this
means that NoModRef is not a zero value. If we want to retain the
functionality, I would probably split getModRefInfo() results into
a part that just contains the ModRef information, and a separate
part containing a (best-effort) AliasResult.
Differential Revision: https://reviews.llvm.org/D130713
This solves the readnone problems in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for details.
According to the discussion, we decided to fix the problem by inserting
isPresplitCoroutine() checks in different passes instead of
wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes.
In this direction, we might not be able to cover every case at first.
Let's take a "find and fix" strategy.
Reviewed By: nikic, nhaehnle, jyknight
Differential Revision: https://reviews.llvm.org/D127383
Most clients only used the zextOrSelf, sextOrSelf and truncOrSelf
methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
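For illustration, a minimal sketch of the now-redundant pattern with
APInt (made-up values; assumes an LLVM version where the standard
methods accept the same bit width, as described above):

    #include "llvm/ADT/APInt.h"
    using namespace llvm;

    void example() {
      APInt V(32, 42);
      APInt A = V.zext(32);   // no-op; previously spelled V.zextOrSelf(32)
      APInt B = V.trunc(32);  // likewise a no-op now
    }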
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
This reverts commit 9b1e00738c.
Nikic reported in the commit thread that I had forgotten the history here: a) we'd tried this before, and b) it had to be reverted due to an unexpected codegen impact. Current measurements confirm the same issue still exists.
This code predates the generic handling for inaccessiblememonly. If we remove it and update one test with inaccessiblememonly, nothing else changes. Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.
D98718 caused the order of Values/MemoryLocations we pass to alias() to
be significant due to storing the offset in the PartialAlias case. But
some callers weren't audited and were still passing swapped arguments,
causing the returned PartialAlias offset to be negative in some
cases. For example, the newly added unittests would return -1
instead of 1.
Fixes #55343, a miscompile.
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D125328
And thread DSE's ephemeral values to EarliestEscapeInfo.
This allows more precise analysis in DSEState::isReadClobber() via BatchAA.
Followup to D123162.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123342
The patch adds an extra check to only set MinAbsVarIndex if
abs(V * Scale) won't wrap. In the absence of IsNSW, try to use the
bitwidths of the original V and Scale to rule out wrapping.
Attempt to model https://alive2.llvm.org/ce/z/HE8ZKj
The code in the else if below probably needs the same treatment, but I
need to come up with a test first.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D121695
Currently, BasicAA has special support for llvm.memcpy.* intrinsics. This change extends that support to any memory transfer operation, in particular the llvm.memmove.* intrinsic.
Reviewed By: reames, nikic
Differential Revision: https://reviews.llvm.org/D117095
Presence of operand bundles changes semantics with respect to ModRef. In particular, the spec says: "From the compiler's perspective, deoptimization operand bundles make the call sites they're attached to at least readonly. They read through all of their pointer typed operands (even if they're not otherwise escaped) and the entire visible heap. Deoptimization operand bundles do not capture their operands except during deoptimization, in which case control will not be returned to the compiled frame". Fix handling of llvm.memcpy.* according to the spec.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D118033
Allocation functions should be marked with onlyAccessesInaccessibleMemory (when that is correct for the given function), which is checked elsewhere, so this check is no longer needed.
Differential Revision: https://reviews.llvm.org/D117180
The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consistent with onlyReadsMemory, which we use for the corresponding readonly case.
We should not lose analysis precision if an 'add' has both no-wrap
flags (nsw and nuw) compared to just one or the other.
This patch is modeled on a similar construct that was added with
D59386.
I don't think it is possible to expose a problem with an unsigned
compare because of the way this was coded (nuw is handled first).
InstCombine has an assert that fires with the example from:
https://github.com/llvm/llvm-project/issues/52884
...because it was expecting InstSimplify to handle this kind of
pattern with an smax.
Fixes #52884
Differential Revision: https://reviews.llvm.org/D116322
Preserve the invariant that the offset reported in the case of a
`PartialAlias` between `Loc1` and `Loc2` is such that
`Loc1 + Offset = Loc2`, where `Loc1` and `Loc2` are the first and
the second argument, respectively, in alias queries.
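As an illustration (not actual AA API usage; the buffer and access
sizes are made up):

    // Two overlapping 4-byte accesses into the same 8-byte buffer:
    char Buf[8];
    void *Loc1 = Buf;      // bytes [0, 4); first argument of the query
    void *Loc2 = Buf + 1;  // bytes [1, 5); second argument
    // alias(Loc1, Loc2) is PartialAlias with Offset = 1, since
    // Loc1 + 1 == Loc2; swapping the arguments yields Offset = -1.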
Differential Revision: https://reviews.llvm.org/D115927
When accumulating the GEP offset in BasicAA, we should use the
pointer index size rather than the pointer size.
Differential Revision: https://reviews.llvm.org/D112370
All heuristics for variable accesses require both access sizes to
be known, so check this once at the start, rather than for each
particular heuristic.
If we know that the var * scale multiplication is nsw, we can use
a saturating multiplication on the range (as a good approximation
of an nsw multiply). This recovers some cases where the fix from
D112611 is unnecessarily strict. (This can be further strengthened
by using a saturating add, but we currently don't track all the
necessary information for that.)
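A small sketch of the idea using ConstantRange's saturating multiply
(the i8 values are made up):

    #include "llvm/IR/ConstantRange.h"
    using namespace llvm;

    void example() {
      // i8 range [60, 70] scaled by 2 with nsw known: the saturating
      // multiply clamps at 127 instead of wrapping around.
      ConstantRange R(APInt(8, 60), APInt(8, 71)); // [60, 70] inclusive
      ConstantRange Scaled = R.smul_sat(ConstantRange(APInt(8, 2)));
      // Scaled is [120, 127], a good approximation of the nsw multiply.
    }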
This exposes an issue in our NSW tracking for multiplies. The code
was assuming that (X +nsw Y) *nsw Z results in
(X *nsw Z) +nsw (Y *nsw Z) -- however, it is possible that the
distributed multiplications overflow, even if the non-distributed
one does not. We should discard the nsw flag if the offset is
non-zero. If we just have (X *nsw Y) *nsw Z then concluding
X *nsw (Y *nsw Z) is fine.
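A concrete i8 counterexample (made-up values):

    #include "llvm/ADT/APInt.h"
    #include <cassert>
    using namespace llvm;

    void example() {
      APInt X(8, 100), Y(8, -50, /*isSigned=*/true), Z(8, 2);
      bool Ov;
      APInt Sum = X.sadd_ov(Y, Ov);     // 100 + (-50) = 50, nsw holds
      assert(!Ov);
      APInt Prod = Sum.smul_ov(Z, Ov);  // 50 * 2 = 100, nsw holds
      assert(!Ov);
      APInt XZ = X.smul_ov(Z, Ov);      // but 100 * 2 = 200 wraps in i8,
      assert(Ov);                       // so the distributed X*Z is not nsw
    }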
Differential Revision: https://reviews.llvm.org/D112848
The scale multiplication is only guaranteed to be nsw if the GEP
is inbounds (or the multiplication is trivial). Previously we were
only considering explicit muls in GEP indices.
GEP decomposition currently checks whether the multiplication of
the linear expression offset and GEP scale overflows. However, if
everything else works correctly, this overflow check is both
unnecessary and dangerously misleading. While it will avoid an
overflow in Scale * Offset in particular, other parts of the
calculation (including those on dynamic values) may still overflow.
The code working on the decomposed GEPs is responsible for ensuring
that it remains correct in the presence of overflow. D112611 fixes
the last issue of that kind that I'm aware of (in fact, the overflow
check was originally introduced to work around precisely that issue).
Differential Revision: https://reviews.llvm.org/D112618
BasicAA currently tries to determine that the offset is positive by
checking whether all variable indices are positive based on known
bits, multiplied by a positive scale. However, this is incorrect
if the scale multiplication might overflow. In the modified test
case the original value is positive, but may be negative after a
left shift.
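For example, in i8 (made-up values):

    #include "llvm/ADT/APInt.h"
    #include <cassert>
    using namespace llvm;

    void example() {
      APInt V(8, 64);           // sign bit known clear: non-negative
      APInt Scaled = V.shl(1);  // 64 << 1 = 128, which is -128 in i8
      assert(Scaled.isNegative());
    }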
Fix this by converting known bits into a constant range and reusing
the range-based logic, which handles overflow correctly.
Differential Revision: https://reviews.llvm.org/D112611
Make the range check more precise by calculating the range of
potentially accessed bytes for both accesses and checking whether
their intersection is empty. In that case there can be no overlap
between the accesses and the result is NoAlias.
This is more powerful than the previous approach, because it can
deal with sign-wrapped ranges. In the test case the original range
is [-1, INT_MAX] but becomes [0, INT_MIN] after applying the offset.
This is a wrapping range, so getSignedMin/getSignedMax will treat
it as a full range. However, the range excludes the elements
[INT_MIN+1, -1], which is enough to prove NoAlias with an access
at offset -1.
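In ConstantRange terms (i32 for concreteness):

    #include "llvm/IR/ConstantRange.h"
    #include <cassert>
    using namespace llvm;

    void example() {
      // [-1, INT_MAX] in half-open form: [-1, INT_MIN).
      ConstantRange R(APInt(32, -1, /*isSigned=*/true),
                      APInt::getSignedMinValue(32));
      // Applying the offset of 1 yields the wrapped range [0, INT_MIN].
      ConstantRange Shifted = R.add(ConstantRange(APInt(32, 1)));
      // getSignedMin/getSignedMax now report INT_MIN/INT_MAX, i.e. a
      // full range, but the set still excludes -1:
      assert(!Shifted.contains(APInt(32, -1, /*isSigned=*/true)));
    }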
Differential Revision: https://reviews.llvm.org/D112486
D109746 made BasicAA use range information to determine the
minimum/maximum GEP offset. However, it was limited to the case of
a single variable index. This patch extends support to multiple
indices by adding all the ranges together.
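For example (made-up index ranges):

    #include "llvm/IR/ConstantRange.h"
    using namespace llvm;

    void example() {
      // Index i in [0, 10] and index j in [0, 5]:
      ConstantRange I(APInt(64, 0), APInt(64, 11));
      ConstantRange J(APInt(64, 0), APInt(64, 6));
      ConstantRange Sum = I.add(J);  // combined offset range [0, 15]
    }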
Differential Revision: https://reviews.llvm.org/D112378
GEP indices larger than the GEP index size are implicitly truncated
to the index size. BasicAA currently doesn't model this, resulting
in incorrect alias analysis results.
Fix this by explicitly modelling truncation in CastedValue in the
same way we do zext and sext. Additionally we need to disable a
number of optimizations for truncated values, in particular
"non-zero" and "non-equal" may no longer hold after truncation.
I believe the constant offset heuristic is also not necessarily
correct for truncated values, but wasn't able to come up with a
test for that one.
A possible followup here would be to use the new mechanism to
model explicit trunc as well (which should be much more common,
as it is the canonical form). This is straightforward, but omitted
here to separate the correctness fix from the analysis improvement.
(Side note: While I say "index size" above, BasicAA currently uses
the pointer size instead. Something for another day...)
Differential Revision: https://reviews.llvm.org/D110977
The multiply() implementation is very slow -- it performs six
multiplications in double the bitwidth, which means that it will
typically work on allocated APInts and bypass fast-path
implementations. Add an additional implementation that doesn't
try to produce anything better than a full range if overflow is
possible. At least for the BasicAA use-case, we really don't care
about more precise modeling of overflow behavior. The current
use of multiply() is fine while the implementation is limited to
a single index, but extending it to the multiple-index case makes
the compile-time impact untenable.
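A hedged sketch of the idea, simplified to unsigned, non-wrapping
ranges (not the actual implementation):

    #include "llvm/IR/ConstantRange.h"
    using namespace llvm;

    // Multiply in the original bitwidth and give up with a full range on
    // any possible overflow, instead of widening to double the bitwidth.
    ConstantRange fastUnsignedMultiply(const ConstantRange &A,
                                       const ConstantRange &B) {
      unsigned BW = A.getBitWidth();
      if (A.isEmptySet() || B.isEmptySet())
        return ConstantRange::getEmpty(BW);
      if (A.isWrappedSet() || B.isWrappedSet())
        return ConstantRange::getFull(BW);
      bool Ov;
      APInt Hi = A.getUnsignedMax().umul_ov(B.getUnsignedMax(), Ov);
      if (Ov)
        return ConstantRange::getFull(BW);
      // If max * max fits, min * min fits as well.
      APInt Lo = A.getUnsignedMin() * B.getUnsignedMin();
      return ConstantRange::getNonEmpty(Lo, Hi + 1);
    }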
Currently, DecomposeGEP() bails out on the whole decomposition if
it encounters a scalable GEP type anywhere. However, it is fine to
still analyze other GEPs that we look through before hitting the
scalable GEP. This does mean that the decomposed GEP base is no
longer required to be the same as the underlying object. However,
I don't believe this property is necessary for correctness anymore.
This allows us to compute slightly more precise aliasing results
for GEP chains containing scalable vectors, though my primary
interest here is simplifying the code.
Differential Revision: https://reviews.llvm.org/D110511
DecompGEP.Base and UnderlyingV are currently always the same.
However, logically DecompGEP.Base is the right value to use here,
because the decomposed offset is relative to that base.
BasicAA GEP decomposition currently performs all calculation on the
maximum pointer size, but at least 64-bit, with an option to double
the size. The code comment claims that this improves analysis power
when working with uint64_t indices on 32-bit systems. However, I don't
see how this can be, at least while maintaining correctness:
When working on canonical code, the GEP indices will have GEP index
size. If the original code worked on uint64_t with a 32-bit size_t,
then there will be truncs inserted before use as a GEP index. Linear
expression decomposition does not look through truncs, so this will
be an opaque value as far as GEP decomposition is concerned. Working
on a wider pointer size does not help here (or have any effect at all).
When working on non-canonical code (before first InstCombine), the
GEP indices are implicitly truncated to GEP index size. The BasicAA
code currently just ignores this fact completely, and pretends that
this truncation doesn't happen. This is incorrect and will be
addressed by D110977.
I believe that for correctness reasons, it is important to work on
the actual GEP index size to properly model potential overflow.
BasicAA tries to patch over the fact that it uses the wrong size
(see adjustToPointerSize), but it only does that in limited cases
(only for constant values, and not all of them either). I'd like to
move this code towards always working on the correct size, and
dropping these artificial pointer size adjustments is the first step
towards that.
Differential Revision: https://reviews.llvm.org/D110657
When determining NoAlias based on object size and dereferenceability
information, we can ignore frees for the same reason we can ignore
possible null pointers (if null is not a valid pointer): Actually
accessing the null pointer / freed pointer would be immediate UB,
and AA results are only valid under the assumption of an access.
This addresses a minor regression from D110745.
Differential Revision: https://reviews.llvm.org/D111028
Add methods to appropriately extend KnownBits/ConstantRange there,
same as with APInt. Also clean up the known bits handling by
actually doing that extension rather than checking ZExtBits. This
doesn't matter now, but becomes relevant once truncation is
involved.
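Sketch of the corresponding operations (made-up values):

    #include "llvm/IR/ConstantRange.h"
    #include "llvm/Support/KnownBits.h"
    using namespace llvm;

    void example() {
      KnownBits Known(32);
      Known.Zero.setHighBits(16);       // top half known to be zero
      KnownBits Wide = Known.zext(64);  // new high bits also known zero
      KnownBits Low = Known.trunc(16);  // keep only the low 16 bits

      ConstantRange CR(APInt(32, 0), APInt(32, 100)); // [0, 99]
      ConstantRange WideCR = CR.zeroExtend(64);
      ConstantRange LowCR = CR.truncate(16);
    }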
The information can be implicit (from `ValueTracking`) or explicit.
This implements the backend part of the following RFC
https://groups.google.com/g/llvm-dev/c/T9o51zB1JY.
We still need to settle on how to best represent the information in the
IR, but this is a separate discussion.
Differential Revision: https://reviews.llvm.org/D109746
Rather than separately handling subtraction of offset and variable
indices, make this one operation. Also rewrite the implementation
to use range-based for loops.
This is a followup to D109844 (and alternative to D109907), which
integrates the new "earliest escape" tracking into AliasAnalysis.
This is done by replacing the pre-existing context-free capture
cache in AAQueryInfo with a replaceable (virtual) object with two
implementations: The SimpleCaptureInfo implements the previous
behavior (check whether object is captured at all), while
EarliestEscapeInfo implements the new behavior from DSE.
This combines the "earliest escape" analysis with the full power of
BasicAA: It subsumes the call handling from D109907, considers a
wider range of escape sources, and works with AA recursion. The
compile-time cost is slightly higher than with D109907.
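A hedged sketch of the shape this describes (the method name and types
are illustrative stand-ins, not the exact LLVM API; only
SimpleCaptureInfo and EarliestEscapeInfo are named above):

    // Replaceable capture-analysis object held by AAQueryInfo (simplified).
    struct Value;
    struct Instruction;

    struct CaptureInfo {
      virtual ~CaptureInfo() = default;
      // May Object be treated as not captured at instruction I?
      virtual bool isNotCaptured(const Value *Object,
                                 const Instruction *I) = 0;
    };

    // Previous behavior: context-free "is the object captured at all?".
    struct SimpleCaptureInfo final : CaptureInfo {
      bool isNotCaptured(const Value *Object, const Instruction *) override;
    };

    // New behavior from DSE: track the earliest escape point and answer
    // queries for instructions that execute strictly before it.
    struct EarliestEscapeInfo final : CaptureInfo {
      bool isNotCaptured(const Value *Object, const Instruction *I) override;
    };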
Differential Revision: https://reviews.llvm.org/D110368
The case of an Argument and an identified function local is already
handled earlier, because we don't care about captures in that case.
As such, we don't need to additionally consider the combination of
an Argument with a non-escaping identified function local.
This ensures that isEscapeSource() only returns true for
instructions, which is necessary for D110368.