Commit Graph

708 Commits

Author SHA1 Message Date
Kazu Hirata 343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
Kazu Hirata a5f8a36d02 [IPO] Use std::optional in GlobalOpt.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 23:38:32 -08:00
Arthur Eubanks 8c49b01a1e [GlobalOpt] Don't remove inalloca from varargs functions
Varargs and inalloca have a weird interaction where varargs are actually
passed via the inalloca alloca. Removing inalloca breaks the varargs
because they're still not passed as separate arguments.

Fixes #58718

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D137182
2022-11-01 13:04:05 -07:00
Mikael Holmen 51d4c7ceea [GlobalOpt] Fix debug variance problem in hasOnlyColdCalls
hasOnlyColdCalls skipped over calls to intrinsics, but it did so after
checking the linkage of the called function. This meant that the presence
of a call to a debug intrinsic could affect the outcome of the
optimization.

In my original reproducer (for an out of tree target) it was particularly
interesting, because the actual IR after GlobalOpt was not different with
debug instrinsics present, so -print-after-all printouts didn't show
anything there.

However, without debuginfo, GlobalOpt went further and ran
BlockFrequencyAnalysis and (more importanly) LoopAnalysis, and later on in
the pipeline, instcombine behaved in different ways when LoopInfo was
present.

So a call to a dbg.declare prevented running LoopAnalysis in
GlobalOpt, which later prevented InstCombine from doing an optimization.

The dbg-intrinsic-loopanalysis.ll testcase tries to expose this.

Then I also noted that adding a dbg.declare actually made the existing
testcase colccc_coldsites.ll generate different code, so I modified that
to now test it behaves the same way with and without the dbg.declare.

Reviewed By: nikic, fhahn

Differential Revision: https://reviews.llvm.org/D133193
2022-09-02 12:29:44 +02:00
Cameron McInally 38d58c1b37 [GlobalOpt] Bail out of GlobalOpt SROA if a Scalable Vector is seen
The SROA algorithm won't work for Scalable Vectors, since we don't
know how many bytes are loaded/stored. Bail out if a Scalable
Vector is seen.

Differential Revision: https://reviews.llvm.org/D132417
2022-08-24 13:17:59 -07:00
Kazu Hirata 50724716cd [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-14 12:51:58 -07:00
Kazu Hirata 109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Kazu Hirata ba0407ba86 [llvm] Use range-based for loops (NFC) 2022-08-07 00:16:21 -07:00
Kazu Hirata acf648b5e9 Use llvm::less_first and llvm::less_second (NFC) 2022-07-24 16:21:29 -07:00
Nikita Popov f45ab43332 [MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc
Alloc directly checking whether a given call is a removable
allocation, instead of first checking whether it is an allocation
first.
2022-07-21 09:39:19 +02:00
Nikita Popov 8ee913d83b [IR] Remove Constant::canTrap() (NFC)
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
2022-07-06 10:36:47 +02:00
Arthur Eubanks e422c0d3b2 [GlobalOpt] Perform store->dominated load forwarding for stored once globals
The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway.

The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store.

Compile time tracker:
https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions

Reviewed By: nikic, asbirlea, jdoerfert

Differential Revision: https://reviews.llvm.org/D128128
2022-06-24 09:09:26 -07:00
Arthur Eubanks b5db65e0da Reland [GlobalOpt] Preserve CFG analyses
The only place we modify the CFG is when calling
removeUnreachableBlocks(), so insert a callback there which invalidates
analyses for that function (or recomputes DT in the legacy PM).

We may delete functions, make sure to clear analyses for those
functions. (this was missed in the original revision)

Small compile time wins across the board:
https://llvm-compile-time-tracker.com/compare.php?from=f444ea8ce0aaaa5ec1a4129809389da15cc41396&to=698f41f4fc26cbf1006ed5d88e9d658edfc5b749&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128145
2022-06-21 09:19:59 -07:00
Arthur Eubanks 13ff7d6f39 Revert "[GlobalOpt] Perform store->dominated load forwarding for stored once globals"
This reverts commit 6f348b146b.

Am seeing internal test failures plus a linux kernel breakage reported due to this.
2022-06-20 10:26:47 -07:00
Arthur Eubanks 1cd2c72bef Revert "[GlobalOpt] Preserve CFG analyses"
This reverts commit cc65f3e167.

Causes crashes: https://github.com/llvm/llvm-project/issues/56131
2022-06-20 10:25:10 -07:00
Arthur Eubanks cc65f3e167 [GlobalOpt] Preserve CFG analyses
The only place we modify the CFG is when calling
removeUnreachableBlocks(), so insert a callback there which invalidates
analyses for that function (or recomputes DT in the legacy PM).

Small compile time wins across the board:
https://llvm-compile-time-tracker.com/compare.php?from=f444ea8ce0aaaa5ec1a4129809389da15cc41396&to=698f41f4fc26cbf1006ed5d88e9d658edfc5b749&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128145
2022-06-19 16:13:02 -07:00
Arthur Eubanks 6f348b146b [GlobalOpt] Perform store->dominated load forwarding for stored once globals
Compile time tracker:
https://llvm-compile-time-tracker.com/compare.php?from=1e556f459b44dd0ca4073e932f66ecb6f40fe31a&to=6d7bed4e1e72c6a8592748626091274209740a40&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128128
2022-06-19 10:27:20 -07:00
Arthur Eubanks 4a5201f484 [NFC][GlobalOpt] Remove unused parameters 2022-06-18 21:23:39 -07:00
Chuanqi Xu 0e10f12844 [NFC] Remove commented cerr debugging loggings
There are some unused cerr debugging loggings in the codes. It is weird
to remain such commented debug helpers in the product.
2022-06-08 15:58:06 +08:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Alexander Shaposhnikov badd088c57 [GlobalOpt] Enable optimization of constructors with different priorities
Adjust `optimizeGlobalCtorsList` to handle the case of different priorities.
This addresses the issue https://github.com/llvm/llvm-project/issues/55083.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D125278
2022-05-13 22:19:29 +00:00
Arthur Eubanks b07aab8fc1 [GlobalOpt] Iterate over replaced values deterministically to constprop
If there are pre-existing dead instructions, the order we visit replaced
values can cause us sometimes to not delete dead instructions.

The added test non-deterministically failed without the change.
2022-05-02 09:43:20 -07:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Nikita Popov db561064f6 [GlobalOpt] Handle non-instruction MTI source (PR54572)
This was reusing a cast to GlobalVariable to check for an
Instruction, which means we'll try to dereference a null pointer
if it's not actually a GlobalVariable. We should be casting
MTI->getSource() instead.

I don't think this problem is really specific to opaque pointers,
but it certainly makes it a lot easier to reproduce.

Fixes https://github.com/llvm/llvm-project/issues/54572.
2022-03-28 14:28:47 +02:00
serge-sans-paille f1985a3f85 Cleanup includes: Transforms/IPO
Preprocessor output diff: -238205 lines
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122183
2022-03-22 10:06:28 +01:00
Fangrui Song c6692f819e [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249
2022-03-18 14:17:05 -07:00
Alok Kumar Sharma 94823500a7 [DebugInfo][SROA] Correct debug info for global variables in case of SROA
The existing handling produced crash for test case (attached with patch).
Now the function transferSRADebugInfo is modified to
  - Ignore the current variable if it starts after the current Fragment.
  - Ignore the current variable if it ends before the current Fragment.
  - Generate (!DIExpression()) if current variable completely fits the
    current Fragment.
  - Otherwise (as earlier), generate the DW_OP_LLVM_fragment in IR if current
    Fragment partially defines current variable.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D121107
2022-03-10 00:41:30 +05:30
Arthur Eubanks f0b61f7957 Revert "[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible"
This reverts commit 30e8f83c84.

Causes huge compile time regressions on certain large files. Will followup offline with author.
2022-03-03 11:04:14 -08:00
Fangrui Song 30e8f83c84 [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249
2022-02-01 10:41:16 -08:00
Nikita Popov 1652c3b80c [GlobalOpt] Avoid early exit before dead constant check
In a similar vein to 236fbf571d,
make sure we don't early-exit before the dead constant check.
2022-02-01 15:57:19 +01:00
Philip Reames 26049b8ce3 [GlobalOpt] Generalize malloc-to-global for any allocation function
We can generalize the malloc-to-global transform for other allocation functions which are both a) removable, and b) have a known initialization value.

One subtlety that I want to point out - mostly because I hadn't realized it was true until I took a closer look - is that the existing code doesn't prove that initialization/malloc happens only once. The initialization function can be called multiple times. This is correct without special handling for malloc as undef can map to any value previously written, but a non-undef initializing allocation it means we may end up memseting the new global repeatedly. In particular, this means it's not legal to fold the memset into the initializer of the global.

Differential Revision: https://reviews.llvm.org/D117503
2022-01-17 15:06:23 -08:00
Nikita Popov 12bee2c054 [GlobalOpt] Drop an incorrect check
This was a last-minute addition to D117249, and of course I ended
up inverting the condition in a way that caused an uninitialized
memory read.

I've dropped it entirely, as I don't think we actually care whether
the size is zero or not here. The previous code wasn't checking
this either.
2022-01-17 10:10:56 +01:00
Nikita Popov 499f1ca79f [GlobalOpt] Use generic type when converting malloc to global
The malloc to global transform currently determines the type of the
global by looking at bitcasts of the malloc. This is limited (the
transform fails if there are multiple different types) and
incompatible with opaque pointers.

My initial approach was to construct an appropriate struct type
based on usage in loads/stores. What this patch does instead is
to always create an [i8 x AllocSize] global, without trying to
guess types at all.

This does mean that other transforms that require a certain global
type may break. I fixed two of these in D117034 and D117223, which
I believe should be sufficient to avoid regressions. In particular,
the global SRA change should end up splitting the global into
naturally-typed sub-globals, at which point all other optimizations
should work.

Differential Revision: https://reviews.llvm.org/D117092
2022-01-17 09:55:33 +01:00
Nikita Popov 4796b4ae7b [GlobalOpt] Make global SRA offset based
Currently global SRA uses the GEP structure to determine how to
split the global. This patch instead analyses the loads and stores
that are performed on the global, and collects which types are used
at which offset, and then splits the global according to those.

This is both more general, and works fine with opaque pointers.
This is also closer to how ordinary SROA is performed.

Differential Revision: https://reviews.llvm.org/D117223
2022-01-17 09:28:36 +01:00
Nikita Popov 1cbb456123 [GlobalOpt] Fix global to select transform under opaque pointers
We need to check that the load/store type is also the same, as this
is no longer implicitly checked through the pointer type.
2022-01-13 11:13:06 +01:00
Nikita Popov 5642ce5ac2 [GlobalOpt] Drop redundant setExternallyInitialized() call (NFC)
This is part of copyAttributesFrom().
2022-01-12 09:42:58 +01:00
Nikita Popov f3e87176e1 [GlobalOpt] Support "stored once" optimization for different types
GlobalOpt can optimize a global with undef initializer and a single
store to put the stored value into the initializer instead. Currently,
this requires the type of the global and the store to match.

This patch extends support to cases with different types (but same
size), in which case we create a new global to replace the old one.

Differential Revision: https://reviews.llvm.org/D117034
2022-01-12 09:39:31 +01:00
Philip Reames a3573f203e Fix a bug in 67a3331e (cast instead of dyn_cast)
The original commit was expected to be NFC, but I didn't account for the fact that invokes could be considered allocation functions.  Interestingly, only one builder caught the problem.
2022-01-07 08:25:02 -08:00
Philip Reames 7052670e96 Move getMallocAllocatedType and getMallocArraySize to GlobalOpt [NFC]
These are implementation details of the global-opt transform and not easily reuseable, so remove them from the analysis header.
2022-01-06 18:02:13 -08:00
Philip Reames 67a3331e4f Inline extractMallocCall to sole use and delete [NFC] 2022-01-06 18:02:13 -08:00
Nikita Popov 99c6b12b92 [ConstantFolding] Unify handling of load from uniform value
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because we a) don't care about the load offset in that
case b) it bypasses casts that might not be legal generally but
do work with uniform values.

We had multiple implementations of this, with a different set of
supported values each time. This replaces two usages with a more
complete helper. Other usages will be replaced separately, because
they have larger impact.

This is part of D115924.
2022-01-05 12:30:46 +01:00
Nikita Popov bbeaf2aac6 [GlobalOpt][Evaluator] Rewrite global ctor evaluation (fixes PR51879)
Global ctor evaluation currently models memory as a map from Constant*
to Constant*. For this to be correct, it is required that there is
only a single Constant* referencing a given memory location. The
Evaluator tries to ensure this by imposing certain limitations that
could result in ambiguities (by limiting types, casts and GEP formats),
but ultimately still fails, as can be seen in PR51879. The approach
is fundamentally fragile and will get more so with opaque pointers.

My original thought was to instead store memory for each global as an
offset => value representation. However, we also need to make sure
that we can actually rematerialize the modified global initializer
into a Constant in the end, which may not be possible if we allow
arbitrary writes.

What this patch does instead is to represent globals as a MutableValue,
which is either a Constant* or a MutableAggregate*. The mutable
aggregate exists to allow efficient mutation of individual aggregate
elements, as mutating an element on a Constant would require interning
a new constant. When a write to the Constant* is made, it is converted
into a MutableAggregate* as needed.

I believe this should make the evaluator more robust, compatible
with opaque pointers, and a bit simpler as well.

Fixes https://github.com/llvm/llvm-project/issues/51221.

Differential Revision: https://reviews.llvm.org/D115530
2022-01-04 09:30:54 +01:00
Nikita Popov 2926d6d335 [ConstantFold][GlobalOpt] Don't create x86_mmx null value
This fixes the assertion failure reported at
https://reviews.llvm.org/D114889#3198921 with a straightforward
check, until the cleaner fix in D115924 can be reapplied.
2021-12-21 09:11:41 +01:00
Nikita Popov aeb36ae0f4 Revert "[ConstantFolding] Unify handling of load from uniform value"
This reverts commit 9fd4f80e33.

This breaks SingleSource/Regression/C/gcc-c-torture/execute/pr19687.c
in test-suite. Either the test is incorrect, or clang is generating
incorrect union initialization code. I've submitted
https://reviews.llvm.org/D115994 to fix the test, assuming my
interpretation is correct. Reverting this in the meantime as it
may take some time to resolve.
2021-12-18 20:46:52 +01:00
Kazu Hirata 2b7be47b22 [llvm] Strip redundant lambda (NFC) 2021-12-17 10:51:40 -08:00
Nikita Popov 9fd4f80e33 [ConstantFolding] Unify handling of load from uniform value
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because we a) don't care about the load offset in that
case and b) it bypasses casts that might not be legal generally
but do work with uniform values.

We had multiple implementations of this, with a different set of
supported values each time, as well as incomplete type checks in
some cases. In particular, this fixes the assertion reported in
https://reviews.llvm.org/D114889#3198921, as well as a similar
assertion that could be triggered via constant folding.

Differential Revision: https://reviews.llvm.org/D115924
2021-12-17 17:05:06 +01:00
Nikita Popov 8d1759c404 [GlobalOpt] Simplify CleanupConstantGlobalUsers()
This bases the CleanupConstantGlobalUsers() implementation around
the ConstantFoldLoadFromConst() API. The general approach is that
we discover all users while looking through casts, and then
constant fold loads and drop stores and memintrinsics.

This avoids special cases and limitations in the previous
implementation, which is also incompatible with opaque pointers.
The result is a bit more powerful than before, because we now use
more general load folding logic which can for example look through
pointer bitcasts between different sizes. This is where the test
changes come from, as we now fold more loads and can thus remove
more globals.

Differential Revision: https://reviews.llvm.org/D114889
2021-12-01 21:06:25 +01:00
Kazu Hirata 0d182d9d1e [Transforms] Use make_early_inc_range (NFC) 2021-11-07 17:03:15 -08:00
Jay Foad 1b758925ad [IR] Merge createReplacementInstr into ConstantExpr::getAsInstruction
createReplacementInstr was a trivial wrapper around
ConstantExpr::getAsInstruction, which also inserted the new instruction
into a basic block. Implement this directly in getAsInstruction by
adding an InsertBefore parameter and change all callers to use it. NFC.

A follow-up patch will remove createReplacementInstr.

Differential Revision: https://reviews.llvm.org/D112791
2021-10-29 15:02:58 +01:00
Hongtao Yu 098a0d8fbc [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 3.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.

Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110847
2021-10-12 09:44:12 -07:00