Commit Graph

931 Commits

Author SHA1 Message Date
Fangrui Song e53ce9618a StandardInstrumentations: Internalize some cl::opt 2022-11-23 22:58:14 -08:00
Sami Tolvanen cacd3e73d7 Add generic KCFI operand bundle lowering
The KCFI sanitizer emits "kcfi" operand bundles to indirect
call instructions, which the LLVM back-end lowers into an
architecture-specific type check with a known machine instruction
sequence. Currently, KCFI operand bundle lowering is supported only
on 64-bit X86 and AArch64 architectures.

As a lightweight forward-edge CFI implementation that doesn't
require LTO is also useful for non-Linux low-level targets on
other machine architectures, add a generic KCFI operand bundle
lowering pass that's only used when back-end lowering support is not
available and allows -fsanitize=kcfi to be enabled in Clang on all
architectures.

This relands commit eb2a57ebc7 with
fixes.

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D135411
2022-11-22 23:01:18 +00:00
Rong Xu 6327d263f5 [CHR] Add a threshold for the code duplication
ControlHeightReduction (CHR) clones the code region to reduce the
branches in the hot code path. The number of clones is linear to the
depth of the region.

Currently it does not have control over the code size increase. We are
seeing one ~9000 BB functions get expanded to ~250000 BBs, an 25x
increase. This creates a big compile time issue for the downstream
optimizations.

This patch adds a cap for number of clones for one region.

Differential Revision: https://reviews.llvm.org/D138333
2022-11-22 11:36:40 -08:00
Sanjay Patel 163bb6d64e [Passes][VectorCombine] enable early run generally and try load folds
An early run of VectorCombine was added with D102496 specifically to
deal with unnecessary vector ops produced with the C matrix extension.
This patch is proposing to try those folds in general and add a pair
of load folds to the menu.

The load transform will partly solve (see PhaseOrdering diffs) a
longstanding vectorization perf bug by removing redundant loads via GVN:
issue #17113

The main reason for not enabling the extra pass generally in the initial
patch was compile-time cost. The cost of VectorCombine was significantly
(surprisingly) improved with:
87debdadaf
https://llvm-compile-time-tracker.com/compare.php?from=ffe05b8f57d97bc4340f791cb386c8d00e0739f2&to=87debdadaf18f8a5c7e5d563889e10731dc3554d&stat=instructions:u

...so the extra run is going to cost very little now - the total cost of
the 2 runs should be less than the 1 run before that micro-optimization:
https://llvm-compile-time-tracker.com/compare.php?from=5e8c2026d10e8e2c93c038c776853bed0e7c8fc1&to=2c4b68eab5ae969811f422714e0eba44c5f7eefb&stat=instructions:u

It may be possible to reduce the cost slightly more with a few more
earlier-exits like that, but it's probably in the noise based on timing
experiments.

Differential Revision: https://reviews.llvm.org/D138353
2022-11-21 13:57:55 -05:00
Sanjay Patel 8f337f8ffe [VectorCombine] generalize pass param name for early combines; NFC
The option was added with https://reviews.llvm.org/D102496,
and currently the name is accurate, but I am hoping to add
a load transform that is not a scalarization. See issue #17113.
2022-11-21 13:57:55 -05:00
Alexander Shaposhnikov 7059a6c32c [IR] Split out IR printing passes into IRPrinter
This diff splits out (from LLVMCore) IR printing passes into IRPrinter.
This structure is similar to what we already have for IRReader and
enables us to avoid circular dependencies between LLVMCore and Analysis
(this is a preparation for https://reviews.llvm.org/D137768).
The legacy interface is left unchanged, once the legacy pass manager
is removed (in the future) we will be able to clean it up further.
The bazel build configuration has been updated as well.

Test plan:
1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc
2/ bazel build --config=generic_clang @llvm-project//...

Differential revision: https://reviews.llvm.org/D138081
2022-11-18 01:47:56 +00:00
Fangrui Song fc91c70593 Revert D135411 "Add generic KCFI operand bundle lowering"
This reverts commit eb2a57ebc7.

llvm/include/llvm/Transforms/Instrumentation/KCFI.h including
llvm/CodeGen is a layering violation. We should use an approach where
Instrumementation/ doesn't need to include CodeGen/.
Sorry for not spotting this in the review.
2022-11-17 22:45:30 +00:00
Sami Tolvanen eb2a57ebc7 Add generic KCFI operand bundle lowering
The KCFI sanitizer emits "kcfi" operand bundles to indirect
call instructions, which the LLVM back-end lowers into an
architecture-specific type check with a known machine instruction
sequence. Currently, KCFI operand bundle lowering is supported only
on 64-bit X86 and AArch64 architectures.

As a lightweight forward-edge CFI implementation that doesn't
require LTO is also useful for non-Linux low-level targets on
other machine architectures, add a generic KCFI operand bundle
lowering pass that's only used when back-end lowering support is not
available and allows -fsanitize=kcfi to be enabled in Clang on all
architectures.

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D135411
2022-11-17 21:55:00 +00:00
Roman Lebedev 8adfa29706
[Pipelines] Introduce SROA after (final, run-time) loop unrolling
Now that we are done with loop unrolling, be it either by LoopVectorizer,
or LoopUnroll passes, some variable-offset GEP's into alloca's could have
become constant-offset, thus enabling SROA and alloca promotion,
yet we don't capitalize on that, which is surprizing.

While it would be good to not introduce one more SROA invocation,
but instead move the one from `PassBuilder::buildFunctionSimplificationPipeline()`,
the existing test coverage says that is a bad idea,
though it would be fine compile-time wise: https://llvm-compile-time-tracker.com/compare.php?from=b150d34c47efbd8fa09604bce805c0920360f8d7&to=5a9a5c855158b482552be8c7af3e73d67fa44805&stat=instructions

So instead, i add yet another SROA run.
I have checked, and it needs to be at least after said final loop unrolling.
This is still fine compile-time wise: https://llvm-compile-time-tracker.com/compare.php?from=70324cd88328c0924e605fa81b696572560aa5c9&to=fb489bbef687ad821c3173a931709f9cad9aee8a&stat=instructions

I've encountered this in a real code, `SROA-after-final-loop-unrolling.ll` has been reduced from https://godbolt.org/z/fsdMhETh3

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D136806
2022-11-17 21:31:30 +03:00
Arthur Eubanks cbcf123af2 [LegacyPM] Remove cl::opts controlling optimization pass manager passes
Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reland with UseNewGVN usage in clang removed.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915
2022-11-14 09:38:17 -08:00
Arthur Eubanks d7c1427953 Revert "[LegacyPM] Remove cl::opts controlling optimization pass manager passes"
This reverts commit 7ec05fec71.

Breaks bots, e.g. https://lab.llvm.org/buildbot#builders/217/builds/15008
2022-11-14 09:33:38 -08:00
Arthur Eubanks 7ec05fec71 [LegacyPM] Remove cl::opts controlling optimization pass manager passes
Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915
2022-11-14 09:23:17 -08:00
OCHyams 913b561c0a [Assignment Tracking][6/*] Add trackAssignments function
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Add trackAssignments which adds assignment tracking metadata to a function for
a specified set of variables. The intended callers are the inliner and the
front end - those calls will be added in separate patches.

I've added a pass called declare-to-assign (AssignmentTrackingPass) that
converts dbg.declare intrinsics to dbg.assigns using trackAssignments so that
the function can be easily tested (see
llvm/test/DebugInfo/Generic/track-assignments.ll). The pass could also be used
by front ends to easily test out enabling assignment tracking.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D132225
2022-11-08 16:52:11 +00:00
Arthur Eubanks 4fa328074e [NewPM][Pipeline] Add PipelineTuningOption to set inliner threshold
The legacy PM allowed you to set a custom inliner threshold via
  builder.Inliner = llvm::createFunctionInliningPass(inline_threshold);

This allows the same thing to be done with the new PM optimization pipelines.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D137038
2022-11-02 10:47:51 -07:00
Paul Walker ab8257ca0e [NFC] Fix a few whitespace inconsistencies. 2022-10-20 14:52:25 +00:00
Arthur Eubanks 743087fb63 Port print-cfg-sccs to new pass manager
This is actually used, see https://discourse.llvm.org/t/use-print-callgrapg-sccs-from-opt/65782.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D135718
2022-10-18 08:47:08 -07:00
Arthur Eubanks f59e1bcc22 [PrintPipeline] Handle CoroConditionalWrapper and add more verification
Add a check (can be disabled via a flag) that the pipeline we generate is actually parsable.
Can be disabled because we don't expect to handle every pass in -print-pipeline-passes.

Fixes #58280.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135703
2022-10-12 09:36:45 -07:00
Arthur Eubanks 60e4af7ab8 [CallGraph] Port -print-callgraph-sccs to new pass manager
And remove the legacy opt-specific pass.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D135487
2022-10-11 14:43:16 -07:00
Arthur Eubanks f3a928e233 [opt] Don't translate legacy -analysis flag to require<analysis>
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07 14:54:34 -07:00
Arthur Eubanks 5df4ab55f9 [llvm] Migrate PAEval to new pass manager 2022-10-01 16:41:58 -07:00
Florian Hahn 7c0ff64b0f
[LAA] Change to function analysis for new PM.
At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
withing a loop pipeline.

To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.

Fixes #50940.

Fixes #51669.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134606
2022-10-01 15:44:27 +01:00
Pavel Samolysov 1c530500ab [Pipelines] Introduce DAE after ArgumentPromotion
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830
2022-09-22 15:33:46 -07:00
Nuno Lopes d953d01737 Introduce -enable-global-analyses to allow users to disable inter-procedural analyses
Alive2 doesn't support verification of optimizations that use inter-procedural analyses.
Right now, clang uses GlobalsAA by default and there's no way to disable it.
This leads to Alive2 producing false positives.
The added flag allows us to skip global analyses altogether.

Differential Revision: https://reviews.llvm.org/D134139
2022-09-19 11:59:35 +01:00
Arthur Eubanks ccc9107ad6 [OptBisect] Add flag to print IR when opt-bisect kicks in
-opt-bisect-print-ir-path=foo will dump the IR to foo when opt-bisect-limit starts skipping passes.

Currently we don't print the IR if the opt-bisect-limit is higher than the total number of times opt-bisect is called.

This makes getting the IR right before a bad transform easier.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D133809
2022-09-14 13:48:03 -07:00
Junduo Dong 6975ab7126 [Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messive.

The key codes are like the demo below:
```
  /// Runs the function pass across every function in the module.
  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                        LazyCallGraph &CG, CGSCCUpdateResult &UR) {
      /// ...
      PreservedAnalyses PassPA;
      {
        TimeTraceScope TimeScope(Pass.name());
        PassPA = Pass.run(F, FAM);
      }
      /// ...
 }
```

It can be bothered to judge where should we add the tracing codes by hands.

With the PassInstrumentation framework, we can easily add `Before/After` callback
functions to add time tracing codes.

Differential Revision: https://reviews.llvm.org/D131960
2022-09-11 05:42:55 -07:00
Jamie Schmeiser 5e3ac79690 Loop names used in reporting can grow very large
Summary:
The code for generating a name for loops for various reporting scenarios
created a name by serializing the loop into a string.  This may result in
a very large name for a loop containing many blocks.  Use the getName()
function on the loop instead.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: Whitney (Whitney Tsang), aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D133587
2022-09-09 13:45:14 -04:00
Fangrui Song f48931f3a8 [NewPM] Switch -filter-passes from ClassName to pass-name
NewPM -filter-passes (D86360) uses ClassName instead of pass-name as used in
`-passes`, `-print-after`, etc. D87216 has added a mechanism to map
ClassName to pass-name. Adopt it for -filter-passes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133263
2022-09-07 22:02:26 -07:00
Marco Elver 97c2220565 [SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass
Introduces the SanitizerBinaryMetadata instrumentation pass which uses
the new MD_pcsections metadata kinds to instrument certain types of
instructions and functions required for breakpoint-based sanitizers.

The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. GWP-TSan will require information about atomic
accesses; to unambiguously determine if an access is atomic or not, we
also require "covered" information which code has been compiled with
SanitizerBinaryMetadata instrumentation enabled.

[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D130887
2022-09-07 21:25:40 +02:00
Vitaly Buka 4c18670776 [NFC][sancov] Rename ModuleSanitizerCoveragePass 2022-09-06 20:55:39 -07:00
Vitaly Buka 5e38b2a456 [NFC][msan] Rename ModuleMemorySanitizerPass 2022-09-06 20:30:35 -07:00
Vitaly Buka 181d408186 [pipelines] OptimizerEarlyEPCallbacks for ThinLTO prelink
Similar to OptimizerLastEPCallbacks workaround
added D96320.

Probably NFC as-is, I don't see anything hooked with this callbacks yet,
but I we are looking to move sanitizers.

Reviewed By: aeubanks, MaskRay

Differential Revision: https://reviews.llvm.org/D133333
2022-09-06 15:54:04 -07:00
Vitaly Buka 93600eb50c [NFC][asan] Rename ModuleAddressSanitizerPass 2022-09-06 15:02:11 -07:00
Vitaly Buka e7bac3b9fa [msan] Convert Msan to ModulePass
MemorySanitizerPass function pass violatied requirement 4 of function
pass to do not insert globals. Msan nees to insert globals for origin
tracking, and paramereters tracking.

https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class

Reviewed By: kstoimenov, fmayer

Differential Revision: https://reviews.llvm.org/D133336
2022-09-06 15:01:04 -07:00
Arthur Eubanks 7e3aa8f01a Revert "[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests"
This reverts commit 57fd866551.

Causes crashes, see comments in D132581.
2022-09-05 15:42:48 -07:00
Kazu Hirata 03c3c2db10 [llvm] Use std::remove_reference_t (NFC) 2022-09-03 23:27:22 -07:00
Arthur Eubanks 57fd866551 [LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests
The current code is basically just emulating what the analysis manager does.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D132581
2022-09-02 10:55:53 -07:00
Fangrui Song 8d95fd7e56 [MachineFunctionPass] Support -filter-passes for -print-changed
[MachineFunctionPass] Support -filter-passes for -print-changed

-filter-passes specifies a `PassID` (a lower-case dashed-separated pass name,
also used by -print-after, -stop-after, etc) instead of a CamelCasePass.

`-filter-passes=CamelCaseNewPMPass` seems like a workaround for new PM passes before
we can use lower-case dashed-separated pass names (as used by `-passes=`).

Example:
```
# getPassName() is "IRTranslator". PassID is "irtranslator"
llc -mtriple=aarch64 -print-changed -filter-passes=irtranslator < print-changed-machine.ll
```

Close https://github.com/llvm/llvm-project/issues/57453

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133055
2022-09-01 11:06:06 -07:00
Arthur Eubanks 9599393eeb Revert "[Pipelines] Introduce DAE after ArgumentPromotion"
This reverts commit b10a341aa5.

This commit exposes the pre-existing https://github.com/llvm/llvm-project/issues/56503 in some edge cases. Will fix that and then reland this.
2022-09-01 08:52:19 -07:00
Arthur Eubanks 04f3c20989 [NFC][LICM] Stop passing around unused BFI
Uses of this were removed in 1a25d0bfbb.
2022-08-31 19:15:34 -07:00
Pavel Samolysov b10a341aa5 [Pipelines] Introduce DAE after ArgumentPromotion
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830
2022-08-28 10:47:03 +03:00
Pavel Samolysov f964417c32 Revert "[Pipelines] Introduce DAE after ArgumentPromotion"
The commit breaks the compiler when a function is used as a function
parameter (hm... for a function from the standard C library?):

```
static float strtof(char *, char *) {}
void a() { strtof(a, 0); }
```

This reverts commit 879f5118fc.
2022-08-26 13:43:09 +03:00
Florian Hahn 555e09c2b0
[LAA] Rename printing pass to print<access-info>.
This updates the naming for the LAA printing pass to be in line with
most other analysis printing passes.

The old name has come up as confusing multiple times already, e.g. in
D131924.
2022-08-26 11:00:09 +01:00
Pavel Samolysov 879f5118fc [Pipelines] Introduce DAE after ArgumentPromotion
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830
2022-08-25 10:55:47 +03:00
Pavel Samolysov 6703ad1e0c Revert "[Pipelines] Introduce DAE after ArgumentPromotion"
This reverts commit 3f20dcbf70.
2022-08-24 12:44:13 +03:00
Pavel Samolysov 3f20dcbf70 [Pipelines] Introduce DAE after ArgumentPromotion
The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated `alloca` instructions as well as meaningless `store`s and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted `GEP`s as well as `load`s and `cast`s in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Differential Revision: https://reviews.llvm.org/D128830
2022-08-24 10:36:12 +03:00
Ellis Hoag 0f946a50a4 [InstrProf] Add option to disable loop opt after PGO
Add the `-enable-post-pgo-loop-rotation` option to enable or disable the loop rotation transformation [1]. With some instrumentations, e.g., function entry coverage [2], loop rotation is not necessary and can lead to some surprise differences in codegen, even for functions where instrumentation is blocked with `noprofile` or `skipprofile`. The default value is `true` so the default behavior does not change.

[1] https://www.llvm.org/docs/LoopTerminology.html#loop-terminology-loop-rotate
[2] https://reviews.llvm.org/D116180

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D131817
2022-08-17 12:23:18 -07:00
Kazu Hirata 109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Ruobing Han f756f06cc4 [SimpleLoopUnswitch] Skip non-trivial unswitching of cold loops
With profile data, non-trivial LoopUnswitch will only apply on non-cold loops, as unswitching cold loops may not gain much benefit but significantly increase the code size.

Reviewed By: aeubanks, asbirlea

Differential Revision: https://reviews.llvm.org/D129599
2022-08-08 18:12:04 +00:00
Arthur Eubanks 81c4e58e2a [StandardInstrumentations] Handle case where block order changes
Previously we'd go off the end of the BI iterator because we expected
that the relative positions of common blocks before and after were
consistent. That's not always true though, for example with
jump-threading.

Reviewed By: jamieschmeiser

Differential Revision: https://reviews.llvm.org/D130596
2022-08-08 07:41:39 -07:00
Congzhe Cao 76be554931 [DependenceAnalysis][PR56275] Normalize negative dependence analysis results
This patch is the first of the two-patch series (D130188, D130179) that
resolve PR56275 (https://github.com/llvm/llvm-project/issues/56275)
which is a missed opportunity, where a perfrectly valid case for loop
interchange failed interchange legality.

If the distance/direction vector produced by dependence analysis (DA) is
negative, it needs to be normalized (reversed). This patch provides helper
functions `isDirectionNegative()` and `normalize()` in DA that does the
normalization, and clients can query DA to do normalization if needed.

A pass option `<normalized-results>` is added to DependenceAnalysisPrinterPass,
and we leverage it to update DA test cases to make sure of test coverage. The
test cases added in `Banerjee.ll` shows that negative vectors are normalized
with `print<da><normalized-results>`.

Reviewed By: bmahjour, Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D130188
2022-08-03 19:59:00 -04:00