llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	25a3130d89	[Local] Do not introduce a new `llvm.trap` before `unreachable` This is the second attempt to remove the `llvm.trap` insertion after https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6 reverted the first one. It is not clear what the exact issue was back then and it might already be gone by now, it has been >5 years after all. Replaces D106299. Differential Revision: https://reviews.llvm.org/D106308	2021-07-26 23:33:36 -05:00
Sjoerd Meijer	c4a0969b9c	Function Specialization Pass This adds a function specialization pass to LLVM. Constant parameters like function pointers and constant globals are propagated to the callee by specializing the function. This is a first version with a number of limitations: - The pass is off by default, so needs to be enabled on the command line, - It does not handle specialization of recursive functions, - It does not yet handle constants and constant ranges, - Only 1 argument per function is specialised, - The cost-model could be further looked into, and perhaps related, - We are not yet caching analysis results. This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan. More recently this was also discussed on the list, see: https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html. The motivation for this work is that function specialisation often comes up as a reason for performance differences of generated code between LLVM and GCC, which has this enabled by default from optimisation level -O3 and up. And while this certainly helps a few cpu benchmark cases, this also triggers in real world codes and is thus a generally useful transformation to have in LLVM. Function specialisation has great potential to increase compile-times and code-size. The summary from some investigations with this patch is: - Compile-time increases for short compile jobs are relatively high, but the increase in absolute numbers is still low. - For longer compile-jobs, the extra compile time is around 1%, and very much in line with GCC. - It is difficult to blame one thing for compile-time increases: it looks like everywhere a little bit more time is spent processing more functions and instructions. - But the function specialisation pass itself is not very expensive; it doesn't show up very high in the profile of the optimisation passes. 
The goal of this work is to reach parity with GCC which means that eventually we would like to get this enabled by default. But first we would like to address some of the limitations before that. Differential Revision: https://reviews.llvm.org/D93838	2021-06-11 09:11:29 +01:00
Arthur Eubanks	6b9524a05b	[NewPM] Don't mark AA analyses as preserved Currently all AA analyses marked as preserved are stateless, not taking into account their dependent analyses. So there's no need to mark them as preserved, they won't be invalidated unless their analyses are. SCEVAAResults was the one exception to this, it was treated like a typical analysis result. Make it like the others and don't invalidate unless SCEV is invalidated. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102032	2021-05-18 13:49:03 -07:00
Sjoerd Meijer	39d29817f3	[SCCP] Follow up of rGbbab9f986c6d. NFC. This addresses the linter messages, mainly the inconsistent capitalisation of member functions.	2021-04-14 17:14:46 +01:00
Sjoerd Meijer	bbab9f986c	[SCCP] Create SCCP Solver This refactors SCCP and creates a SCCPSolver interface and class so that it can be used by other passes and transformations. We will use this in D93838, which adds a function specialisation pass. This is based on an early version by Vinay Madhusudan. Differential Revision: https://reviews.llvm.org/D93762	2021-04-14 14:58:03 +01:00
Dimitry Andric	6abb92f210	[SCCP] Avoid modifying AdditionalUsers while iterating over it When run under valgrind, or with a malloc that poisons freed memory, this can lead to segfaults or other problems. To avoid modifying the AdditionalUsers DenseMap while still iterating, save the instructions to be notified in a separate SmallPtrSet, and use this to later call OperandChangedState on each instruction. Fixes PR49582. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D98602	2021-04-02 19:05:59 +02:00
Akira Hatanaka	1900503595	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `ed4718eccb`, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in `75805dce5f`. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). 
- The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Hans Wennborg	0a5dd06718	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due to clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). 
> > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit `ed4718eccb`.	2021-03-03 15:51:40 +01:00
Akira Hatanaka	ed4718eccb	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. 
If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-12 09:51:57 -08:00
Kazu Hirata	1238378f18	[llvm] Use pop_back_val (NFC)	2021-01-23 10:56:33 -08:00
Florian Hahn	d68bed0fa9	[SCCP] Handle bitcast of vector constants. Vectors where all elements have the same known constant range are treated as a single constant range in the lattice. When bitcasting such vectors, there is a mis-match between the width of the lattice value (single constant range) and the original operands (vector). Go to overdefined in that case. Fixes PR47991.	2020-11-03 12:58:39 +00:00
Juneyoung Lee	9b3c2a72e4	[ValueTracking] Use assume's noundef operand bundle This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89219	2020-10-14 20:16:33 +09:00
Eli Friedman	278299b0f0	[SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules. If a module has many values that need to be resolved by ResolvedUndefsIn, compilation takes quadratic time overall. Solve should do a small amount of work, since not much is added to the worklists each time markOverdefined is called. But ResolvedUndefsIn is linear over the length of the function/module, so resolving one undef at a time is quadratic in general. To solve this, make ResolvedUndefsIn resolve every undef value at once, instead of resolving them one at a time. This loses a little optimization power, but can be a lot faster. We still need a loop around ResolvedUndefsIn because markOverdefined could change the set of blocks that are live. That should be uncommon, hopefully. We could optimize it by tracking which blocks transition from dead to live, instead of iterating over the whole module to find them. But I'll leave that for later. (The whole function will become a lot simpler once we start pruning branches on undef.) The regression test changes seem minor. The specific cases in question could probably be optimized with a bit more work, but they seem like edge cases that don't really matter. Fixes an "infinite" compile issue my team found on an internal workload. Differential Revision: https://reviews.llvm.org/D89080	2020-10-09 15:24:16 -07:00
Florian Hahn	b76df593eb	Revert "Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one."" Looks like there is still another remaining issue: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/22273/steps/build%20libcxx%2Fmsan/logs/stdio This reverts commit `86a20d9e34`.	2020-09-29 09:18:19 +01:00
Florian Hahn	86a20d9e34	Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one." This version includes a small fix allowing function pointers to be unconditionally replaced for now. This reverts commit `4c5e4aa89b`.	2020-09-29 09:10:27 +01:00
Nikita Popov	9fb46a452d	[SCCP] Compute ranges for supported intrinsics For intrinsics supported by ConstantRange, compute the result range based on the argument ranges. We do this independently of whether some or all of the input ranges are full, as we can often still constrain the result in some way. Differential Revision: https://reviews.llvm.org/D87183	2020-09-07 22:16:06 +02:00
Florian Hahn	4c5e4aa89b	Revert "[SCCP] Do not replace deref'able ptr with un-deref'able one." This reverts commit `3542feeb20`. This seems to be causing issues with a sanitizer build http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/21677	2020-09-03 10:28:42 +01:00
Florian Hahn	3542feeb20	[SCCP] Do not replace deref'able ptr with un-deref'able one. Currently IPSCCP (and others like CVP/GVN) blindly propagate pointer equalities. In certain cases, that leads to dereferenceable pointers being replaced, as in the example test case. I think this is not allowed, as it introduces an access of an un-dereferenceable pointer. Note that the pointer is inbounds, but one past the last element, so it is valid, but not dereferenceable. This patch is mostly to highlight the issue and start a discussion. Currently it only checks for specifically looking one-past-the-last-element pointers with array typed bases. This causes the mis-compile outlined in https://stackoverflow.com/questions/55754313/is-this-gcc-clang-past-one-pointer-comparison-behavior-conforming-or-non-standar In the test case, if we replace %p with the GEP for the store, we subsequently determine that the store and the load cannot alias, because they are to different underlying objects. Note that Alive2 seems to think that the replacement is valid: https://alive2.llvm.org/ce/z/2rorhk Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85332	2020-09-03 10:22:21 +01:00
Congzhe Cao	ec489ae048	[IPSCCP] Fix a bug that the "returned" attribute is not cleared when function is optimized to return undef In IPSCCP when a function is optimized to return undef, it should clear the returned attribute for all its input arguments and its corresponding call sites. The bug is exposed when the value of an input argument of the function is assigned to a physical register and because of the argument having a returned attribute, the value of this physical register will continue to be used as the function return value right after the call instruction returns, even if the value that this register holds may be clobbered during the function call. This potentially results in incorrect values being used afterwards. Reviewed By: jdoerfert, fhahn Differential Revision: https://reviews.llvm.org/D84220	2020-09-02 11:21:48 -04:00
Benjamin Kramer	3524c23ff2	[SCCP] Use bulk-remove API to bulk-remove attributes. NFCI.	2020-08-28 14:44:14 +02:00
David Stenberg	e8ebebb0bd	[InstCombine] Fix incorrect Modified status When removing instructions from unreachable blocks, and only debug info intrinsics were removed, InstCombine could incorrectly return a false Modified status. This is fixed by making removeAllNonTerminatorAndEHPadInstructions() also return how many debug info intrinsics were removed, and take that into account. This was caught using the check introduced by D80916. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D85839	2020-08-13 15:10:41 +02:00
Nikita Popov	4564974504	[SCCP] Propagate inequalities Teach SCCP to create notconstant lattice values from inequality assumes and nonnull metadata, and update getConstant() to make use of them. Additionally isOverdefined() needs to be changed to consider notconstant an overdefined value. Handling inequality branches is delayed until our branch on undef story in other passes has been improved. Differential Revision: https://reviews.llvm.org/D83643	2020-08-04 20:20:52 +02:00
Nikita Popov	4c16eafe12	[SCCP] Remove dead switch cases based on range information Determine whether switch edges are feasible based on range information, and remove non-feasible edges later on. This does not try to determine whether the default edge is dead, as we'd have to determine that the range is fully covered by the cases for that. Another limitation here is that we don't remove dead cases that have the same successor as a live case. I'm not handling this because I wanted to keep the edge removal based on feasible edges only, rather than inspecting ranges again there -- this does not seem like a particularly useful case to handle. Differential Revision: https://reviews.llvm.org/D84270	2020-07-30 21:21:08 +02:00
Nikita Popov	632a89e866	[SCCP] Restore the change reporting as well Reapply `5db5b4bc43`.	2020-07-25 15:11:30 +02:00
Nikita Popov	ad16e71c95	Reapply [SCCP] Directly remove non-feasible edges Reapply with DTU update moved after CFG update, which is a requirement of the API. ----- Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.) Instead, this patch explicitly removes non-feasible edges. It currently only needs to handle the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement. Differential Revision: https://reviews.llvm.org/D84264	2020-07-25 14:52:35 +02:00
Florian Hahn	3c1476d26c	[IPSCCP] Drop argmemonly after replacing pointer argument. This patch updates IPSCCP to drop argmemonly and inaccessiblemem_or_argmemonly if it replaces a pointer argument. Fixes PR46717. Reviewers: efriedma, davide, nikic, jdoerfert Reviewed By: efriedma, jdoerfert Differential Revision: https://reviews.llvm.org/D84432	2020-07-25 11:52:14 +01:00
Fangrui Song	4637daa990	Revert D84264 "[SCCP] Directly remove non-feasible edges" & `5db5b4bc43` It breaks stage-2 build. Clang crashed when compiling llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp llvm/Support/GenericDomTree.h eraseNode: Node is not a leaf node	2020-07-23 17:51:48 -07:00
Nikita Popov	5db5b4bc43	[SCCP] Add missing change reporting Forgot to actually use the return value of the function.	2020-07-23 20:58:29 +02:00
Nikita Popov	9394c3ec88	[SCCP] Directly remove non-feasible edges Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.) Instead, this patch explicitly removes non-feasible edges. It currently only needs to handle the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement. Differential Revision: https://reviews.llvm.org/D84264	2020-07-23 20:32:57 +02:00
Florian Hahn	752fea7c27	[SCCP] Add range metadata to call sites with known return ranges. If we inferred a range for the function return value, we can add !range at all call-sites of the function, if the range does not include undef. Reviewers: efriedma, davide, nikic Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D83952	2020-07-21 10:06:54 +01:00
Nikita Popov	c6e13667e7	[PredicateInfo] Add a method to interpret predicate as cmp constraint Both users of PredicateInfo (NewGVN and SCCP) are interested in getting a cmp constraint on the predicated value. They currently implement separate logic for this. This patch adds a common method for this in PredicateBase. This enables a missing bit of PredicateInfo handling in SCCP: Now the predicate on the condition itself is also used. For switches it means we know that the switched-on value is the same as the case value. For assumes/branches we know that the condition is true or false. Differential Revision: https://reviews.llvm.org/D83640	2020-07-19 15:34:32 +02:00
Florian Hahn	569868f6b7	[SCCP] Only track returns of functions with non-void ret ty (NFC). There is no need to add functions with void return types to the set of tracked return values. This does not change functionality, because such functions do not have return values and we never update or access them.	2020-07-16 15:15:19 +01:00
Florian Hahn	a86ce06faf	[SCCP] Use conditional info with AND/OR branch conditions. Currently SCCP does not combine the information of conditions joined by AND in the true branch or OR in the false branch. For branches on AND, 2 copies will be inserted for the true branch, with one being the operand of the other as in the code below. We can combine the information using intersection. Note that for the OR case, the copies are inserted in the false branch, where using intersection is safe as well.
    define void @foo(i32 %a) {
    entry:
      %lt = icmp ult i32 %a, 100
      %gt = icmp ugt i32 %a, 20
      %and = and i1 %lt, %gt
    ; Has predicate info
    ; branch predicate info { TrueEdge: 1 Comparison: %lt = icmp ult i32 %a, 100 Edge: [label %entry,label %true] }
      %a.0 = call i32 @llvm.ssa.copy.140247425954880(i32 %a)
    ; Has predicate info
    ; branch predicate info { TrueEdge: 1 Comparison: %gt = icmp ugt i32 %a, 20 Edge: [label %entry,label %false] }
      %a.1 = call i32 @llvm.ssa.copy.140247425954880(i32 %a.0)
      br i1 %and, label %true, label %false
    true: ; preds = %entry
      call void @use(i32 %a.1)
      %true.1 = icmp ne i32 %a.1, 20
      call void @use.i1(i1 %true.1)
      ret void
    false: ; preds = %entry
      call void @use(i32 %a.1)
      ret void
    }
Reviewers: efriedma, davide, mssimpso, nikic Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D77808	2020-07-09 12:59:24 +01:00
Nikita Popov	8691544a27	[SCCP] Use range metadata for loads and calls When all else fails, use range metadata to constrain the result of loads and calls. It should also be possible to use !nonnull, but that would require some general support for inequalities in SCCP first. Differential Revision: https://reviews.llvm.org/D83179	2020-07-07 21:09:21 +02:00
Nikita Popov	9dfea03517	[SCCP] Handle assume predicates Take assume predicates into account when visiting ssa.copy. The handling is the same as for branch predicates, with the difference that we're always on the true edge. Differential Revision: https://reviews.llvm.org/D83257	2020-07-07 20:22:52 +02:00
Simon Pilgrim	a53dddb3e9	Local.h - reduce includes to forward declarations. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 19:27:37 +01:00
Florian Hahn	f9d8e33c32	[SCCP] Turn sext into zext for non-negative ranges. This patch updates SCCP/IPSCCP to use the computed range info to turn sexts into zexts, if the value is known to be non-negative. We already do a similar transform in CorrelatedValuePropagation, but it seems like we can catch a lot of additional cases by doing it in SCCP/IPSCCP as well. The transform is limited to ranges that are known to not include undef. Currently constant ranges from conditions are treated as potentially containing undef, due to PR46144. Once we flip this, the transform will be more effective in practice. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81756	2020-06-19 10:17:55 +01:00
Florian Hahn	773353be4e	[SCCP] Move common code to simplify basic block to helper (NFC). Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81755	2020-06-17 10:03:43 +01:00
serge-sans-paille	977d27d881	[SCCP] Report changes after removing stores to constant global Differential Revision: https://reviews.llvm.org/D81228	2020-06-05 16:09:07 +02:00
Florian Hahn	01f999ae88	[SCCP] Switch to widen at PHIs, stores and call edges. Currently SCCP does not widen PHIs, stores or along call edges (arguments/return values), but on operations that directly extend ranges (like binary operators). This means PHIs, stores and call edges are not pessimized by widening currently, while binary operators are. The main reason for widening operators initially was that opting-out for certain operations was more straightforward in the initial implementation (and it did not matter too much, as range support initially was only implemented for a very limited set of operations). During the discussion in D78391, it was suggested to consider flipping widening to PHIs, stores and along call edges. After adding support for tracking the number of range extensions in ValueLattice, limiting the number of range extensions per value is straightforward. This patch introduces a MaxWidenSteps option to the MergeOptions, limiting the number of range extensions per value. For PHIs, it seems natural to allow an extension for each (active) incoming value plus 1. For the other cases, an arbitrary limit of 10 has been chosen initially. It would potentially make sense to set it depending on the users of a function/global, but that still needs investigating. This potentially leads to more state-changes and longer compile-times. 
The results look quite promising (MultiSource, SPEC):

Same hash: 179 (filtered out)
Remaining: 58
Metric: sccp.IPNumInstRemoved

Program                                        base     widen-phi  diff
test-suite...ks/Prolangs-C/agrep/agrep.test      58.00     82.00   41.4%
test-suite...marks/SciMark2-C/scimark2.test      32.00     43.00   34.4%
test-suite...rks/FreeBench/mason/mason.test       6.00      8.00   33.3%
test-suite...langs-C/football/football.test     104.00    128.00   23.1%
test-suite...cations/hexxagon/hexxagon.test      36.00     42.00   16.7%
test-suite...CFP2000/177.mesa/177.mesa.test     214.00    249.00   16.4%
test-suite...ngs-C/assembler/assembler.test      14.00     16.00   14.3%
test-suite...arks/VersaBench/dbms/dbms.test      10.00     11.00   10.0%
test-suite...oxyApps-C++/miniFE/miniFE.test      43.00     47.00    9.3%
test-suite...ications/JM/ldecod/ldecod.test     179.00    195.00    8.9%
test-suite...CFP2006/433.milc/433.milc.test     249.00    265.00    6.4%
test-suite.../CINT2000/175.vpr/175.vpr.test      98.00    104.00    6.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test      70.00     74.00    5.7%
test-suite...CFP2000/188.ammp/188.ammp.test      71.00     75.00    5.6%
test-suite...ce/Benchmarks/PAQ8p/paq8p.test     111.00    117.00    5.4%
test-suite...ce/Applications/Burg/burg.test      41.00     43.00    4.9%
test-suite...000/197.parser/197.parser.test      66.00     69.00    4.5%
test-suite...tions/lambda-0.1.3/lambda.test      23.00     24.00    4.3%
test-suite...urce/Applications/lua/lua.test     301.00    313.00    4.0%
test-suite...TimberWolfMC/timberwolfmc.test      76.00     79.00    3.9%
test-suite...lications/ClamAV/clamscan.test     991.00   1030.00    3.9%
test-suite...plications/d/make_dparser.test      53.00     55.00    3.8%
test-suite...fice-ispell/office-ispell.test      83.00     86.00    3.6%
test-suite...lications/obsequi/Obsequi.test      28.00     29.00    3.6%
test-suite.../Prolangs-C/bison/mybison.test      56.00     58.00    3.6%
test-suite.../CINT2000/254.gap/254.gap.test     170.00    176.00    3.5%
test-suite.../Applications/lemon/lemon.test      30.00     31.00    3.3%
test-suite.../CINT2000/176.gcc/176.gcc.test    1202.00   1240.00    3.2%
test-suite...pplications/treecc/treecc.test      79.00     81.00    2.5%
test-suite...chmarks/MallocBench/gs/gs.test     357.00    366.00    2.5%
test-suite...eeBench/analyzer/analyzer.test     103.00    105.00    1.9%
test-suite...T2006/445.gobmk/445.gobmk.test    1697.00   1724.00    1.6%
test-suite...006/453.povray/453.povray.test    1812.00   1839.00    1.5%
test-suite.../Benchmarks/Bullet/bullet.test     337.00    342.00    1.5%
test-suite.../CINT2000/252.eon/252.eon.test     426.00    432.00    1.4%
test-suite...T2000/300.twolf/300.twolf.test     214.00    217.00    1.4%
test-suite...pplications/oggenc/oggenc.test     244.00    247.00    1.2%
test-suite.../CINT2006/403.gcc/403.gcc.test    4008.00   4055.00    1.2%
test-suite...T2006/456.hmmer/456.hmmer.test     175.00    177.00    1.1%
test-suite...nal/skidmarks10/skidmarks.test     430.00    434.00    0.9%
test-suite.../Applications/sgefa/sgefa.test     115.00    116.00    0.9%
test-suite...006/447.dealII/447.dealII.test    1082.00   1091.00    0.8%
test-suite...6/482.sphinx3/482.sphinx3.test     141.00    142.00    0.7%
test-suite...ocBench/espresso/espresso.test     152.00    153.00    0.7%
test-suite...3.xalancbmk/483.xalancbmk.test    4003.00   4025.00    0.5%
test-suite...lications/sqlite3/sqlite3.test     548.00    551.00    0.5%
test-suite...marks/7zip/7zip-benchmark.test    5522.00   5551.00    0.5%
test-suite...nsumer-lame/consumer-lame.test     208.00    209.00    0.5%
test-suite...:: External/Povray/povray.test    1556.00   1563.00    0.4%
test-suite...000/186.crafty/186.crafty.test     298.00    299.00    0.3%
test-suite.../Applications/SPASS/SPASS.test    2019.00   2025.00    0.3%
test-suite...ications/JM/lencod/lencod.test    8427.00   8449.00    0.3%
test-suite...6/464.h264ref/464.h264ref.test    6797.00   6813.00    0.2%
test-suite...6/471.omnetpp/471.omnetpp.test     431.00    430.00   -0.2%
test-suite...006/450.soplex/450.soplex.test     446.00    447.00    0.2%
test-suite...0.perlbench/400.perlbench.test    1729.00   1727.00   -0.1%
test-suite...000/255.vortex/255.vortex.test    3815.00   3819.00    0.1%

Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79036	2020-05-29 11:59:17 +01:00
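Note on 01f999ae88: the message describes how the per-value extension limit is chosen, namely one step per active incoming value plus one for PHIs and a fixed 10 elsewhere. The sketch below only restates that rule in C++. The names `DefaultMaxWidenSteps` and `maxWidenStepsForPhi`, and the use of a `std::vector<bool>` to mark which incoming edges are active, are assumptions made for illustration and are not taken from the patch.

```cpp
#include <vector>

// Hypothetical constant: the "arbitrary limit of 10" that the commit message
// mentions for stores and call edges.
constexpr unsigned DefaultMaxWidenSteps = 10;

// Hypothetical helper: the limit for a PHI is one extension per active
// (feasible) incoming value, plus one.
unsigned maxWidenStepsForPhi(const std::vector<bool> &IncomingEdgeActive) {
  unsigned Active = 0;
  for (bool IsActive : IncomingEdgeActive)
    if (IsActive)
      ++Active;
  return Active + 1;
}
```

Under these assumptions, a PHI with three incoming edges of which two are active would be allowed three extensions before its value is widened.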
Florian Hahn	935685f420	[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC). There's no need to duplicate the logic to push to the different work-lists.	2020-05-04 10:19:39 +01:00
Florian Hahn	d911c17596	[SCCP] Get a copy of the state of CopyOf once. This fixes potential reference invalidations, when no lattice value is assigned for CopyOf. As the state of CopyOf won't change while in handleCallResult, we can get a copy once and use that. Should fix PR45749.	2020-05-01 14:46:35 +01:00
Florian Hahn	7d57d22baa	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Florian Hahn	352b612a71	[SCCP] Drop unnecessary early exit for ExtractValueInst. visitExtractValueInst uses mergeInValue, so it already can handle constant ranges. Initially the early exit was using isOverdefined to keep things as NFC during the initial move to ValueLatticeElement. As the function already supports constant ranges, it can just use ValueState[&I].isOverdefined. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78393	2020-04-22 22:07:59 +01:00
Craig Topper	53ee8fbc23	[CallSite removal][SCCP] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78470	2020-04-20 00:16:09 -07:00
Florian Hahn	6ba0695c60	[ValueLattice] Add struct for merge options. This makes it easier to extend the merge options in the future and also reduces the risk of accidentally setting a wrong option. Reviewers: efriedma, nikic, reames, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78368	2020-04-19 09:03:16 +01:00
Florian Hahn	46853b95ca	[SCCP] Drop unused early exit from visitStoreInst (NFC). There are no lattice values associated with store instructions directly. They will never get marked as overdefined.	2020-04-18 19:44:54 +01:00
Florian Hahn	034e8d58a8	[SCCP] Drop unused early exit from visitReturnInst (NFC). There are no lattice values associated with return instructions directly. They will never get marked as overdefined.	2020-04-18 13:52:41 +01:00
Florian Hahn	c245d3e033	[ValueLattice] Steal bits from Tag to track range extensions (NFC). Users of ValueLatticeElement currently have to ensure constant ranges are not extended indefinitely. For example, in SCCP, mergeIn goes to overdefined if a constantrange value is repeatedly merged with larger constantranges. This is a simple form of widening. In some cases, this leads to an unnecessary loss of information and things can be improved by allowing a small number of extensions in the hope that a fixed point is reached after a small number of steps. To make better decisions about widening, it is helpful to keep track of the number of range extensions. That state is tied directly to a concrete ValueLatticeElement and some unused bits in the class can be used. The current patch preserves the existing behavior by default: CheckWiden defaults to false and if CheckWiden is true, a single change to the range is allowed. Follow-up patches will slightly increase the threshold for widening. Reviewers: efriedma, davide, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78145	2020-04-17 15:38:23 +01:00
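Note on c245d3e033: the message describes counting range extensions in otherwise unused bits next to the tag, and going to overdefined once too many extensions have happened. The following is a minimal sketch of that idea, not LLVM's `ValueLatticeElement`. The type `RangeElement`, its 2-bit/6-bit split, the inclusive `Lo`/`Hi` bounds and the `MaxSteps` parameter are all assumptions made for illustration; only the `CheckWiden` flag and the default of a single allowed change come from the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified lattice element; not LLVM's ValueLatticeElement.
struct RangeElement {
  enum Kind : uint8_t { Unknown = 0, Range = 1, Overdefined = 2 };

  // Three kinds fit in 2 bits, so the remaining 6 bits of the same byte can
  // count how often the range has been extended.
  uint8_t Tag : 2;
  uint8_t NumExtensions : 6;
  int64_t Lo, Hi; // inclusive bounds, meaningful only when Tag == Range

  RangeElement() : Tag(Unknown), NumExtensions(0), Lo(0), Hi(0) {}

  static RangeElement range(int64_t L, int64_t H) {
    assert(L <= H && "empty ranges are not modelled in this sketch");
    RangeElement E;
    E.Tag = Range;
    E.Lo = L;
    E.Hi = H;
    return E;
  }

  bool isRange() const { return Tag == Range; }
  bool isOverdefined() const { return Tag == Overdefined; }

  // Merge Other into *this and return true if *this changed. With CheckWiden
  // set, at most MaxSteps extensions are allowed before the element is
  // widened to overdefined.
  bool mergeIn(const RangeElement &Other, bool CheckWiden,
               unsigned MaxSteps = 1) {
    assert(MaxSteps < 63 && "counter has only 6 bits in this sketch");
    if (isOverdefined() || Other.Tag == Unknown)
      return false;
    if (Other.isOverdefined()) {
      Tag = Overdefined;
      return true;
    }
    if (Tag == Unknown) {
      Tag = Range;
      Lo = Other.Lo;
      Hi = Other.Hi;
      return true;
    }
    int64_t NewLo = std::min(Lo, Other.Lo);
    int64_t NewHi = std::max(Hi, Other.Hi);
    if (NewLo == Lo && NewHi == Hi)
      return false; // Other is already covered, so nothing was extended.
    if (CheckWiden) {
      ++NumExtensions;
      if (static_cast<unsigned>(NumExtensions) > MaxSteps) {
        Tag = Overdefined; // Too many extensions: give up on the range.
        return true;
      }
    }
    Lo = NewLo;
    Hi = NewHi;
    return true;
  }
};
```

In this sketch, merging `[0, 2]` into `[0, 1]` with `CheckWiden` set yields `[0, 2]`, and a further merge with `[0, 3]` exceeds the default single step and marks the element overdefined.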
Aaron Puchert	e833e58300	[ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC Reviewed By: fhahn, echristo Differential Revision: https://reviews.llvm.org/D78061	2020-04-14 13:32:53 +02:00

586 Commits