llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	109df7f9a4	[llvm] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-13 12:55:42 -07:00
Ruobing Han	f756f06cc4	[SimpleLoopUnswitch] Skip non-trivial unswitching of cold loops With profile data, non-trivial LoopUnswitch will only apply on non-cold loops, as unswitching cold loops may not gain much benefit but significantly increase the code size. Reviewed By: aeubanks, asbirlea Differential Revision: https://reviews.llvm.org/D129599	2022-08-08 18:12:04 +00:00
Arthur Eubanks	81c4e58e2a	[StandardInstrumentations] Handle case where block order changes Previously we'd go off the end of the BI iterator because we expected that the relative positions of common blocks before and after were consistent. That's not always true though, for example with jump-threading. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D130596	2022-08-08 07:41:39 -07:00
Congzhe Cao	76be554931	[DependenceAnalysis][PR56275] Normalize negative dependence analysis results This patch is the first of the two-patch series (D130188, D130179) that resolve PR56275 (https://github.com/llvm/llvm-project/issues/56275) which is a missed opportunity, where a perfectly valid case for loop interchange failed interchange legality. If the distance/direction vector produced by dependence analysis (DA) is negative, it needs to be normalized (reversed). This patch provides helper functions `isDirectionNegative()` and `normalize()` in DA that do the normalization, and clients can query DA to do normalization if needed. A pass option `<normalized-results>` is added to DependenceAnalysisPrinterPass, and we leverage it to update DA test cases to make sure of test coverage. The test cases added in `Banerjee.ll` show that negative vectors are normalized with `print<da><normalized-results>`. Reviewed By: bmahjour, Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D130188	2022-08-03 19:59:00 -04:00
Arthur Eubanks	43aa4ac70b	[StandardInstrumentations] Assign names to basic blocks without names Fixes code in OrderedChangedData<T>::report which assumes that a string will only appear once in Before/After. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D130587	2022-08-02 11:04:01 -07:00
Fangrui Song	2b70bebc6d	[MachineFunctionPass] Support -print-changed={,c}diff{,-quiet} Follow-up to D130434. Move doSystemDiff to PrintPasses.cpp and call it in MachineFunctionPass.cpp. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D130833	2022-08-01 12:56:15 -07:00
Fangrui Song	f106525de2	[MachineFunctionPass] Support -print-changed and -print-changed=quiet -print-changed for new pass manager is handy beside -print-after-all. Port it to MachineFunctionPass. Note: lib/Passes/StandardInstrumentations.cpp implements a number of misc features. If we want to use them for codegen, we may need to lift some functionality to LLVMIR. Reviewed By: aeubanks, jamieschmeiser Differential Revision: https://reviews.llvm.org/D130434	2022-07-26 10:16:49 -07:00
Sanjay Patel	bfb9b8e075	[Passes] add a tail-call-elim pass near the end of the opt pipeline We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later. In the motivating case from issue #47852, the missing 'tail' on memset leads to sub-optimal codegen. I experimented with removing the early instance of tail-call-elim instead of just adding another pass, but that appears to be slightly worse for compile-time: +0.15% vs. +0.08% time. "tailcall" shows adding the pass; "tailcall2" shows moving the pass to later, then adding the original early pass back (so 1596886802 is functionally equivalent to 180b0439dc ): https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright Note that there was an effort to split the tail call functionality into 2 passes - that could help reduce compile-time if we find that this change costs more in compile-time than expected based on the preliminary testing: D60031 Differential Revision: https://reviews.llvm.org/D130374	2022-07-25 15:25:47 -04:00
Fangrui Song	89357f0cb9	[Passes] Simplify ChangePrinter names. NFC	2022-07-23 19:32:13 -07:00
Arthur Eubanks	8c6305b8b4	[NewPM] Print function/SCC size with -debug-pass-manager This is helpful for debugging issues with very large functions or SCC. Also helpful when function names are very large and it's hard to tell the number of nodes in an SCC. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D128003	2022-07-19 09:00:37 -07:00
Alina Sbirlea	846d10f16a	Turn on flag to not re-run simplification pipeline. This patch turns on the flag `-enable-no-rerun-simplification-pipeline`, which means the simplification pipeline will not be rerun on unchanged functions in the CGSCCPass Manager. Compile time improvement: https://llvm-compile-time-tracker.com/compare.php?from=17457be1c393ff691cca032b04ea1698fedf0301&to=882301ebb893c8ef9f09fe1ea871f7995426fa07&stat=instructions No meaningful run time regressions observed in the llvm test suite and in additional internal workloads at this time. The example test in `test/Other/no-rerun-function-simplification-pipeline.ll` is a good means to understand the effect of this change: ``` define void @f1(void()* %p) alwaysinline { call void %p() ret void } define void @f2() #0 { call void @f1(void()* @f2) call void @f3() ret void } define void @f3() #0 { call void @f2() ret void } ``` There are two SCCs formed by the ModuleToPostOrderCGSCCAdaptor: (f1) and (f2, f3). The pass manager runs on the first SCC, leading to running the simplification pipeline (function and loop passes) on f1. With the flag on, after this, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f1`. Next, the pass manager runs on the second SCC: (f2, f3). Since f1() was inlined, f2() now calls itself, and also calls f3(), while f3() only calls f2(). So the pass manager for the SCC first runs the Inliner on (f2, f3), then the simplification pipeline on f2. With the flag on, the output will have `Running analysis: ShouldNotRunFunctionPassesAnalysis on f2`; unless the inliner makes a change, this analysis remains preserved which means there's no reason to rerun the simplification pipeline. With the flag off, there is a second run of the simplification pipeline run on f2. Next, the same flow occurs for f3. The simplification pipeline is run on f3 a single time with the flag on, along with `ShouldNotRunFunctionPassesAnalysis on f3`, and twice with the flag off. The reruns occur only on f2 and f3 due to the additional ref edges.	2022-07-14 06:23:55 -07:00
Kazu Hirata	ec9a0e36d9	[IPO] Remove addLTOOptimizationPasses and addLateLTOOptimizationPasses (NFC) The last uses were removed on Apr 15, 2022 in commit `2e6ac54cf4`. Differential Revision: https://reviews.llvm.org/D129460	2022-07-11 20:15:24 -07:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in `e9ce1a5880` due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit `e6f1f06245`. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Ben Dunbobbin	325e7e8b87	[LLVM][LTO][LLD] Enable Profile Guided Layout (--call-graph-profile-sort) for FullLTO The CGProfilePass needs to be run during FullLTO compilation at link time to emit the .llvm.call-graph-profile section to the compiled LTO object file. Currently, it is being run only during the initial LTO-prelink compilation stage (to produce the bitcode files to be consumed by the linker) and so the section is not produced. ThinLTO is not affected because: - For ThinLTO-prelink compilation the CGProfilePass pass is not run because ThinLTO-prelink passes are added via buildThinLTOPreLinkDefaultPipeline. Normal and FullLTO-prelink passes are both added via buildPerModuleDefaultPipeline which uses the LTOPreLink parameter to customize its behavior for the FullLTO-prelink pass differences. - ThinLTO backend compilation phase adds the CGProfilePass (see: buildModuleOptimizationPipeline). Adjust when the pass is run so that the .llvm.call-graph-profile section is produced correctly for FullLTO. Fixes #56185 (https://github.com/llvm/llvm-project/issues/56185)	2022-07-01 13:57:36 +01:00
Nicolai Hähnle	8de6d4b712	StandardInstrumentation: print verifier output to errs Enabling the verifiers is not very helpful if their output is suppressed beyond the fatal error. Differential Revision: https://reviews.llvm.org/D128743	2022-06-29 12:11:55 +02:00
Mitch Phillips	dacfa24f75	Delete 'llvm.asan.globals' for global metadata. Now that we have the sanitizer metadata that is actually on the global variable, and now that we use debuginfo in order to do symbolization of globals, we can delete the 'llvm.asan.globals' IR synthesis. This patch deletes the 'location' part of the __asan_global that's embedded in the binary as well, because it's unnecessary. This saves about ~1.7% of the optimised non-debug with-asserts clang binary. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D127911	2022-06-27 14:40:40 -07:00
Chuanqi Xu	24e53b01d5	Revert "[Coroutines] Only do symmetric transfer if optimization is on" This reverts commit `7782e080e8`. According to the discussion of WG21, symmetric transfer is a desired feature.	2022-06-27 10:54:56 +08:00
Mingming Liu	e0d069598b	[Inline] Annotate inline pass name with link phase information for analysis. The annotation is flag gated; flag is turned off by default. Differential Revision: https://reviews.llvm.org/D125495	2022-06-24 10:06:43 -07:00
Chuanqi Xu	7782e080e8	[Coroutines] Only do symmetric transfer if optimization is on Symmetric transfer is not part of the C++ standard, so vendors are not forced to implement it. Given that symmetric transfer is nowadays an optimization, it makes more sense to enable it only if optimization is enabled. It is also helpful for the compilation speed in O0.	2022-06-20 16:20:36 +08:00
Jin Xin Ng	aaff3fb6d5	[mlgo] Fix accounting for SCC splits Previously if the inliner split an SCC such that an empty one remained, the MLInlineAdvisor could potentially lose track of the EdgeCount if a subsequent CGSCC pass modified the calls of a function that was initially in the SCC pre-split. Saving the seen nodes in onPassEntry resolves this. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127693	2022-06-15 10:53:23 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Arthur Eubanks	36096c2b38	[NFC][JumpThreading] Remove InsertFreezeWhenUnfoldingSelect pass parameter All callers pass true. select-unfold-freeze.ll is now a subset of select.ll so delete it. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126501	2022-05-26 16:13:34 -07:00
Jamie Schmeiser	24239e246c	Add new hidden option -print-on-crash that prints out IR that caused opt pipeline to crash A new hidden option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash. The IR is saved in its print form before each pass is started and a signal handler is registered. If the compilation crashes, the signal handler will print the saved IR to dbgs(). This option can be modified using -print-module-scope to get the IR for the complete module. Note that this option only works with the new pass manager. Reviewed By: yrouban Differential Revision: https://reviews.llvm.org/D86657	2022-05-23 15:38:38 -07:00
Yang Keao	7dce9eb6e5	[DomPrinter] Migrate -dot-dom to the new pass manager. In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D124904	2022-05-16 15:07:16 -05:00
Chuanqi Xu	02d6845234	[NFC] [Coroutines] Remove EnableReuseStorageInFrame option The EnableReuseStorageInFrame option is designed for testing only. But it is better to use *_PASS_WITH_PARAMS macro to keep consistent with other passes.	2022-05-10 17:28:43 +08:00
Chuanqi Xu	405bf90235	[NFC] [Pipelines] Hoist CoroCleanup as Module Pass This is similar to previous patch https://reviews.llvm.org/D123925. It could also reduce the time we call declaresCoroCleanupIntrinsics. And it is helpful for further changes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D124362	2022-05-05 15:15:09 +08:00
Chuanqi Xu	7d40f562e7	[Pipelines] Hoist CoroCleanup to avoid blocking optimizations CoroCleanup is designed to lower all the remaining coroutine intrinsics. It is only required to run after CoroSplit. However, the current position of CoroCleanup is far too late. The downside is that the unlowered coroutine intrinsics might block other optimizations too. So it should be a pure win to hoist the position of CoroCleanup. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D124360	2022-05-05 15:13:27 +08:00
Mingming Liu	408bb9a375	Add a regression test to guard the 0 hot-caller threshold in SamplePGO + ThinLTO. - Add a comment near where the threshold is set.	2022-04-25 18:29:56 +00:00
Chuanqi Xu	f9bee35689	[Pipelines] Hoist CoroEarly as a module pass This change could reduce the time we call `declaresCoroEarlyIntrinsics`. And it is helpful for future changes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123925	2022-04-19 11:04:24 +08:00
Arthur Eubanks	a7e20a8a7a	[CallPrinter] Port CallPrinter passes to new pass manager Port the legacy CallGraphViewer and CallGraphDOTPrinter to work with the new pass manager. Addresses issue https://github.com/llvm/llvm-project/issues/54323 Adds back related tests that were removed in commits `d53a4e7b4a` and `9e9d9aba14` Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D122989	2022-04-18 10:02:18 -07:00
Evgeny Mandrikov	443b6ec169	[NFC] Fix build failure with GCC 11 in C++20 mode This was already fixed in `2ccf0b76bc` but then regressed in `79a1f3e7c6` Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123589	2022-04-13 09:42:41 -07:00
Matt Arsenault	39f1568633	Transforms: Split LowerAtomics into separate Utils and pass This will allow code sharing from AtomicExpandPass. Not entirely sure why these exist as separate passes though.	2022-04-06 20:54:45 -04:00
Wenju He	0bda12b5bc	[NewPM] Add OptimizerEarly module extension point VectorizerStart extension is module callback in old PM, but is function callback in new PM. We lack a module extension point between end of buildModuleSimplificationPipeline and the function optimization (including vectorizer) pipeline. So this patch adds a new module extension point before the function optimization pipeline. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D122296	2022-03-31 08:22:27 -07:00
Julian Lettner	64902d335c	Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-23 18:36:55 -07:00
Zequan Wu	581dc3c729	Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" This reverts commit `22570bac69`.	2022-03-23 16:11:54 -07:00
Arthur Eubanks	9bd66b312c	[PassManager][Coroutine] Run passes under -O0 conditionally and run GlobalDCE CoroSplit lowers various coroutine intrinsics. It's a CGSCC pass and CGSCC passes don't run on unreachable functions. Normally GlobalDCE will come along and delete unreachable functions, but we don't run GlobalDCE under -O0, so an unreachable function with coroutine intrinsics may never have CoroSplit run on it. This patch adds GlobalDCE when coroutine intrinsics are present. It also now runs all coroutine passes conditionally, only when coroutine intrinsics are present. This should also solve the -O0 regression reported in D105877 due to LazyCallGraph construction. Fixes https://github.com/llvm/llvm-project/issues/54117 Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D122275	2022-03-23 11:03:26 -07:00
Florian Hahn	5ab421fb4e	[LICM] Add allowspeculation pass options. This adds a new option to control AllowSpeculation added in D119965 when using `-passes=...`. This allows reproducing #54023 using opt. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D121944	2022-03-18 16:51:57 +00:00
Julian Lettner	22570bac69	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-17 10:47:13 -07:00
Simon Pilgrim	7262eacd41	Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" Many of the build bots are complaining: Unknown command line argument '-lower-global-dtors'	2022-03-15 13:01:35 +00:00
Julian Lettner	9c542a5a4e	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121327	2022-03-14 17:51:18 -07:00
Arthur Eubanks	4fc7c55fff	[NewPM] Actually recompute GlobalsAA before module optimization pipeline RequireAnalysis<GlobalsAA> doesn't actually recompute GlobalsAA. GlobalsAA isn't invalidated (unless specifically invalidated) because it's self-updating via ValueHandles, but can be imprecise during the self-updates. Rather than invalidating GlobalsAA, which would invalidate AAManager and any analyses that use AAManager, create a new pass that recomputes GlobalsAA. Fixes #53131. Differential Revision: https://reviews.llvm.org/D121167	2022-03-14 09:42:34 -07:00
Xiang1 Zhang	c31014322c	TLS loads opimization (hoist) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120000	2022-03-10 09:29:06 +08:00
Florian Hahn	f98125abb2	Revert "[PassManager] Add pretty stack entries before P->run() call." This reverts commit `128745cc26`. This increased compile-time unnecessarily. Revert this change and follow ups `2c7afadb47` & `add0c5856d`. http://llvm-compile-time-tracker.com/compare.php?from=338dfcd60f843082bb589b287d890dbd9394eb82&to=128745cc2681c284bc6d0150a319673a6d6e8424&stat=instructions	2022-03-09 18:46:32 +00:00
Florian Hahn	128745cc26	[PassManager] Add pretty stack entries before P->run() call. This patch adds PrettyStackEntries before running passes. The entries include the pass name and the IR unit the pass runs on. The information is used to print additional information when a pass crashes, including the name and a reference to the IR unit on which it crashed. This is similar to the behavior of the legacy pass manager. The improved stack trace now includes: Stack dump: 0. Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll 1. Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll' 2. Running pass 'LoopVectorizePass' on function '@a' Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D120993	2022-03-09 13:01:09 +00:00
Arthur Eubanks	79a1f3e7c6	[NFC] Cleanup StandardInstrumentations	2022-03-07 16:24:36 -08:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Xiang1 Zhang	65588a0776	Revert "TLS loads opimization (hoist)" Revert for more reviews This reverts commit `30e612ebdf`.	2022-03-02 14:10:11 +08:00

885 Commits