llvm-project

Commit Graph

Author	SHA1	Message	Date
Amir Ayupov	556efdba85	[BOLT][NFC] Extend debug logging in analyzeJumpTable Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131918	2022-08-15 20:34:40 -07:00
Kazu Hirata	2febc32c9c	Use llvm::erase_if (NFC)	2022-08-13 12:55:48 -07:00
Fangrui Song	53113515cd	[BOLT] Use Optional::emplace to avoid move assignment. NFC	2022-08-12 12:51:50 -07:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Thorsten Schütt	0c9258612b	[bolt] silence unused variables warnings	2022-08-06 20:52:45 +02:00
Rafael Auler	19eb908e61	[BOLT] Remove always true if statement Got a warning from GCC when building this. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D131092	2022-08-03 13:11:33 -07:00
Nicolai Hähnle	f7872cdce1	CommandLine: add and use cl::SubCommand::get{All,TopLevel} Prefer using these accessors to access the special sub-commands corresponding to the top-level (no subcommand) and all sub-commands. This is a preparatory step towards removing the use of ManagedStatic: with a subsequent change, these global instances will be moved to be regular function-scope statics. It is split up to give downstream projects a (albeit short) window in which they can switch to using the accessors in a forward-compatible way. Differential Revision: https://reviews.llvm.org/D129118	2022-08-02 23:49:16 +02:00
David Blaikie	7651522b78	Fold assert-used variable into assert Fixes #56724	2022-08-01 21:57:11 +00:00
Alexander Yermolovich	dd29b3c542	[BOLT][DWARF] Fix handling of multiple DW_OP_addrx in an expression We were not handling correclty multiple DW_OP_addrx in the location expression. This was exposed by clang-15 build in release mode with debug information. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130812	2022-08-01 14:38:47 -07:00
Kazu Hirata	bf6021709a	Use drop_begin (NFC)	2022-07-31 15:17:09 -07:00
Kazu Hirata	ce3b687b88	[BOLT] Remove redundaunt string initialization (NFC) Identified with readability-redundant-string-init.	2022-07-31 15:17:05 -07:00
Kazu Hirata	1bf531a5d0	[BOLT] Use boolean literals (NFC) Identified with modernize-use-bool-literals.	2022-07-31 15:17:02 -07:00
Amir Ayupov	468d4f6d18	Revert "[BOLT] Ignore functions accessing false positive jump tables" This diff uncovers an ASAN leak in getOrCreateJumpTable: ``` Indirect leak of 264 byte(s) in 1 object(s) allocated from: #1 0x4f6e48c in llvm::bolt::BinaryContext::getOrCreateJumpTable ... ``` The removal of an assertion needs to be accompanied by proper deallocation of a `JumpTable` object for which `analyzeJumpTable` was unsuccessful. This reverts commit `52cd00cabf`.	2022-07-30 10:39:46 -07:00
Kazu Hirata	12b29900a1	Use any_of (NFC)	2022-07-30 10:35:56 -07:00
Kazu Hirata	f081ec20b5	[bolt] Remove redundaunt virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-30 10:35:51 -07:00
Kazu Hirata	b498a8991e	[bolt] Remove redundaunt control-flow statements (NFC) Identified with readability-redundant-control-flow.	2022-07-30 10:35:49 -07:00
Kazu Hirata	60db8d9b4e	Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2022-07-30 10:35:48 -07:00
Rafael Auler	fc0ced73dc	Add BAT testing framework This patch refactors BAT to be testable as a library, so we can have open-source tests on it. This further fixes an issue with basic blocks that lack a valid input offset, making BAT omit those when writing translation tables. Test Plan: new testcases added, new testing tool added (llvm-bat-dump) Differential Revision: https://reviews.llvm.org/D129382	2022-07-29 14:55:04 -07:00
Fangrui Song	7430894a65	Replace Optional::hasValue with has_value or operator bool. NFC	2022-07-29 10:57:25 -07:00
Fangrui Song	999514bb9a	[bolt] Replace Optional::getValue with value or operator*. NFC	2022-07-29 01:15:24 -07:00
Huan Nguyen	52cd00cabf	[BOLT] Ignore functions accessing false positive jump tables Disassembly and branch target analysis are not decoupled, so any analysis that depends on disassembly may not operate properly. In specific, analyzeJumpTable uses instruction bounds check property. A jump table was analyzed twice: (a) during disassembly, and (b) after disassembly, so there are potentially some mismatched results. In this update, functions that access JTs which fail the second check will be marked as ignored. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130431	2022-07-28 23:22:17 -07:00
Huan Nguyen	ccabbfff86	[BOLT] Remove --allow-stripped option AllowStripped has not been used in BOLT. This option is replaced by actively detecting stripped binary. Test Plan: Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130036	2022-07-28 23:15:53 -07:00
Huan Nguyen	986362d4a3	[BOLT] Add BinaryContext::IsStripped Determine stripped status of a binary based on .symtab Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130034	2022-07-28 23:11:03 -07:00
Amir Ayupov	77c1977384	[BOLT] Support files with no symbols `LastSymbol` handling in `discoverFileObjects` assumes a non-zero number of symbols in an object file. It's not the case for broken_dynsym.test added in D130073, and potentially other stripped binaries. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130544	2022-07-26 00:07:59 -07:00
Fabian Parzefall	83882606db	[BOLT] Process each block only once in fixCFGForPIC Rather than iterating over the whole function from the start until no internal calls are found, process each block only once and continue processing after splitting. This version of the function also does not seemingly invalidate iterators from within the loop. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130436	2022-07-25 15:06:24 -07:00
Huan Nguyen	8eb68d92d4	[BOLT] Handle broken .dynsym in stripped binaries Strip tools cause a few symbols in .dynsym to have bad section index. This update safely keeps such broken symbols intact. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130073	2022-07-22 11:24:09 -07:00
Maksim Panchenko	661577b5f4	[BOLT] Add support for the latest perf tool The latest perf tool can return non-empty buffer when executing buildid-list command, even when perf.data was recorded with -B flag. Some binaries will be listed without the ID, while others may have a recorded ID. Allow invalid entires on the input, while checking the valid ones for the match. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130223	2022-07-22 07:56:15 -07:00
Sriraman Tallam	116ee23f4c	[bolt] std::atomic_uint64_t to std::atomic<uint64_t> Differential Revision: https://reviews.llvm.org/D129903	2022-07-19 16:09:11 -07:00
Fabian Parzefall	8477bc6761	[BOLT] Add function layout class This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split). Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129518	2022-07-16 17:23:24 -07:00
Huan Nguyen	ae563c9146	[BOLT] Support split landing pad We previously support split jump table, where some jump table entries target different fragments of same function. In this fix, we provide support for another type of intra-indirect transfer: landing pad. When C++ exception handling is used, compiler emits .gcc_except_table that describes the location of catch block (landing pad) for specific range that potentially invokes a throw(). Normally landing pads reside in the function, but with -fsplit-machine-functions, landing pads can be moved to another fragment. The intuition is, landing pads are rarely executed, so compiler can move them to .cold section. This update will mark all fragments that have landing pad to another fragment as non-simple, and later propagate non-simple to all related fragments. This update also includes one manual test case: split-landing-pad.s Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128561	2022-07-14 18:10:22 -07:00
Fabian Parzefall	d55dfeaf32	[BOLT] Replace uses of layout with basic block list As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of basic blocks in the final layout, should iterate over binary functions directly, rather than the layout. Eventually, all loops using the layout list should either iterate over the function, or be aware of multiple layouts. This patch replaces references to binary function's block layout with the binary function itself where only little code changes are necessary. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129585	2022-07-14 13:07:05 -07:00
Huan Nguyen	05523dc32d	[BOLT] Support multiple parents for split jump table There are two assumptions regarding jump table: (a) It is accessed by only one fragment, say, Parent (b) All entries target instructions in Parent For (a), BOLT stores jump table entries as relative offset to Parent. For (b), BOLT treats jump table entries target somewhere out of Parent as INVALID_OFFSET, including fragment of same split function. In this update, we extend (a) and (b) to include fragment of same split functinon. For (a), we store jump table entries in absolute offset instead. In addition, jump table will store all fragments that access it. A fragment uses this information to only create label for jump table entries that target to that fragment. For (b), using absolute offset allows jump table entries to target fragments of same split function, i.e., extend support for split jump table. This can be done using relocation (fragment start/size) and fragment detection heuristics (e.g., using symbol name pattern for non-stripped binaries). For jump table targets that can only be reached by one fragment, we mark them as local label; otherwise, they would be the secondary function entry to the target fragment. Test Plan ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128474	2022-07-13 23:37:31 -07:00
Vladislav Khmelevsky	35efe1d806	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D129260	2022-07-13 14:47:22 +03:00
Denis Revunov	7564167885	[BOLT][AArch64] Use all supported CPU features on AArch64 Since we now have +all feature for AArch64 disassembler, we can use it in BOLT and allow it to disassemble all ARM instructions supported by LLVM. Reviewed by: rafauler Differential Revision: https://reviews.llvm.org/D129139	2022-07-12 03:56:04 -04:00
Rafael Auler	a3cfdd746e	[BOLT] Increase coverage of shrink wrapping [5/5] Add -experimental-shrink-wrapping flag to control when we want to move callee-saved registers even when addresses of the stack frame are captured and used in pointer arithmetic, making it more challenging to do alias analysis to prove that we do not access optimized stack positions. This alias analysis is not yet implemented, hence, it is experimental. In practice, though, no compiler would emit code to do pointer arithmetic to access a saved callee-saved register unless there is a memory bug or we are failing to identify a callee-saved reg, so I'm not sure how useful it would be to formally prove that. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126115	2022-07-11 17:30:13 -07:00
Rafael Auler	3e5f67f356	[BOLT] Increase coverage of shrink wrapping [4/5] Change shrink-wrapping to try a priority list of save positions, instead of trying the best one and giving up if it doesn't work. This also increases coverage. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126114	2022-07-11 17:30:05 -07:00
Rafael Auler	3332904ad6	[BOLT] Increase coverage of shrink wrapping [3/5] Add the option to run -equalize-bb-counts before shrink wrapping to avoid unnecessarily optimizing some CFGs where profile is inaccurate but we can prove two blocks have the same frequency. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126113	2022-07-11 17:30:00 -07:00
Rafael Auler	3508ced6ea	[BOLT] Increase coverage of shrink wrapping [2/5] Refactor isStackAccess() to reflect updates by D126116. Now we only handle simple stack accesses and delegate the rest of the cases to getMemDataSize. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126112	2022-07-11 17:29:54 -07:00
Rafael Auler	42465efd17	[BOLT] Increase coverage of shrink wrapping [1/5] Change how function score is calculated and provide more detailed statistics when reporting back frame optimizer and shrink wrapping results. In this new statistics, we provide dynamic coverage numbers. The main metric for shrink wrapping is the number of executed stores that were saved because of shrink wrapping (push instructions that were either entirely moved away from the hot block or converted to a stack adjustment instruction). There is still a number of reduced load instructions (pop) that we are not counting at the moment. Also update alloc combiner to report dynamic numbers, as well as frame optimizer. For debugging purposes, we also include a list of top 10 functions optimized by shrink wrapping. These changes are aimed at better understanding the impact of shrink wrapping in a given binary. We also remove an assertion in dataflow analysis to do not choke on empty functions (which makes no sense). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126111	2022-07-11 17:29:22 -07:00
spupyrev	228970f612	Revert "Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"" This reverts commit `76029cc53e`.	2022-07-11 09:50:47 -07:00
spupyrev	eecd41aa09	Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type" This reverts commit `6d0528636a`.	2022-07-11 09:50:47 -07:00
spupyrev	7228371054	[BOLT] Do not merge cold and hot chains of basic blocks There is a post-processing in ext-tsp block reordering that merges some blocks into chains. This allows to maintain the original block order in the absense of profile data and can be beneficial for code size (when fallthroughs are merged). In the earlier version we could merge hot and cold (with zero execution count) chains, that later were split by SplitFunction.cpp (when split-all-cold=1). The diff eliminates the redundant merging. It is unlikely the change will affect the performance of a binary in a measurable way, as it is mostly operates with cold basic blocks. However, after the diff the impact of split-all-cold is almost negligible and we can avoid the extra function splitting. Measuring on the clang binary (negative is good, positive is a regression): clang12 benchmark1: `0.0253` benchmark2: `-0.1843` benchmark3: `0.3234` benchmark4: `0.0333` clang10 benchmark1 `-0.2517` benchmark2 `-0.3703` benchmark3 `-0.1186` benchmark4 `-0.3822` clang7 benchmark1 `0.2526` benchmark2 `0.0500` benchmark3 `0.3024` benchmark4 `-0.0489` Overall: `-0.0671 ± 0.1172` (insignificant) Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129397	2022-07-11 09:31:52 -07:00
Maksim Panchenko	76029cc53e	Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations" Summary: This reverts commit `729d29e167`. Needed as a workaround for T112872562. Manual rebase conflict history: https://phabricator.intern.facebook.com/D35230076 https://phabricator.intern.facebook.com/D35681740 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: spupyrev Differential Revision: https://phabricator.intern.facebook.com/D37098481	2022-07-11 09:31:52 -07:00
Rafael Auler	6d0528636a	Rebase: [Facebook] [MC] Introduce NeverAlign fragment type Summary: Introduce NeverAlign fragment type. The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary. In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion). This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place. The use case is different from BoundaryAlign and instruction bundling: - BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638. - Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle. LLVM: https://reviews.llvm.org/D97982 Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: phabricatorlinter Differential Revision: https://phabricator.intern.facebook.com/D31361547	2022-07-11 09:31:52 -07:00
Alexander Yermolovich	e159abdb04	[BOLT][DWARF] Support mix mode DWARF Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy and DWARF5 split dwarf. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D128232	2022-06-30 16:53:15 -07:00
Amir Ayupov	66b01a8934	[BOLT] Fix getDynoStats to handle BCs with no functions Address fuzzer crash Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D120696	2022-06-30 01:18:45 -07:00
Amir Ayupov	cb75faf40c	[X86][BOLT] Use getOperandType to determine memory access size Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc. This diff adds support for instructions that were previously reported as having memory access size 0. It replaces the heuristic of looking at instruction register width to determine memory access width by instead checking the memory operand type using tablegen-provided tables. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D126116	2022-06-30 00:25:32 -07:00
Amir Ayupov	798e92c6c4	[BOLT] Respect shouldPrint in dump-dot-all Don't dump dot CFG graph for functions that should not be printed. Reviewed By: rafauler, maksfb Differential Revision: https://reviews.llvm.org/D128699	2022-06-29 17:01:17 -07:00
Maksim Panchenko	ed74304506	[BOLT] Fix EH trampoline backout code When SplitFunctions pass adds a trampoline code for exception landing pads (limited to shared objects), it may increase the size of the hot fragment making it larger than the whole function pre-split. When this happens, the pass reverts the splitting action by restoring the original block order and marking all blocks hot. However, if createEHTrampolines() added new blocks to the CFG and modified invoke instructions, simply restoring the original block layout will not suffice as the new CFG has more blocks. For proper backout of the split, modify the original layout by merging in trampoline blocks immediately before their matching targets. As a result, the number of blocks increases, but the number of instructions and the function size remains the same as pre-split. Add an assertion for the number of blocks when updating a function layout. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128696	2022-06-29 14:35:57 -07:00
Fabian Parzefall	e341e9f094	[BOLT] Add option to randomize function split point For test purposes, we want to split functions at a random split point to be able to test different layouts without relying on the profile. This patch introduces an option, that randomly chooses a split point to partition blocks of a function into hot and cold regions. Reviewed By: Amir, yota9 Differential Revision: https://reviews.llvm.org/D128773	2022-06-29 13:02:05 -07:00
Rafael Auler	fc2d96c334	Revert "[BOLT][AArch64] Handle gold linker veneers" This reverts commit `425dda76e9`. This commit is currently causing BOLT to crash in one of our binaries and needs a bit more checking to make sure it is safe to land.	2022-06-28 19:23:28 -07:00
Vladislav Khmelevsky	425dda76e9	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we to handle it specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D128082	2022-06-28 16:14:05 +03:00
Amir Ayupov	d58b5a0614	[BOLT] Restrict icp-inline to callsites ICP peel for inline mode only makes sense for calls, not jump tables. Plus, add a check that the Target BinaryFunction is found. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128404	2022-06-27 11:08:55 -07:00
Amir Ayupov	0d477f63b0	[BOLT][NFC] Add aliases for ICP flags - `indirect-call-promotion` -> `icp` - `indirect-call-promotion-mispredict-threshold` -> `icp-mp-threshold` - `indirect-call-promotion-use-mispredicts` -> `icp-use-mp` - `indirect-call-promotion-topn` -> `icp-topn` - `indirect-call-promotion-calls-topn` -> `icp-calls-topn` - `indirect-call-promotion-jump-tables-topn` -> `icp-jt-topn` - `icp-jump-table-targets` -> `icp-jt-targets` This also fixes an inconsistency in ICP flag names that some start with `indirect-call-promotion` while others start with `icp`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128375	2022-06-27 10:29:26 -07:00
Amir Ayupov	c4302e4fc2	[BOLT][NFC] Use llvm::less_first Follow the case of https://reviews.llvm.org/D126068 and simplify call sites with `llvm::less_first`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128242	2022-06-27 10:27:17 -07:00
Fabian Parzefall	96f6ec5090	[BOLT] Mark option values of --split-functions deprecated The SplitFunctions pass does not distinguish between various splitting modes anymore. This change updates the command line interface to reflect this behavior by deprecating values passed to the --split-function option. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128558	2022-06-24 17:01:13 -07:00
Alexander Yermolovich	11a8dd65ec	[BOLT][DWARF] Add support for DW_AT_call_pc/DW_AT_call_return_pc DWARF 5 added two new attributes DW_AT_call_pc and DW_AT_call_return_pc. Adding support for them. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D128526	2022-06-24 12:37:58 -07:00
Amir Ayupov	d2c8769936	[BOLT][NFC] Use range-based STL wrappers Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128154	2022-06-23 22:16:27 -07:00
Maksim Panchenko	f263a66ba0	[BOLT] Split functions with exceptions in shared objects and PIEs Add functionality to allow splitting code with C++ exceptions in shared libraries and PIEs. To overcome a limitation in exception ranges format, for functions with fragments spanning multiple sections, add trampoline landing pads in the same section as the corresponding throwing range. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D127936	2022-06-19 16:48:48 -07:00
Amir Ayupov	445bc88501	[BOLT] Use 32-bit MOV to zero 64-bit register in instrumentation code Instead of `movabsq $0x0, %rax` emit shorter equivalent `movl $0x0, %eax`. Intel SDM, 3.4.1.1 General-Purpose Registers in 64-Bit Mode: >32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in > the destination general-purpose register. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D127045	2022-06-19 11:34:32 -07:00
Huan Nguyen	543f13c99b	[BOLT] Allow function entry to be a cold fragment Allow cold fragment to get new address. Our previous assumption is that a fragment (.cold) is only reached through the main fragment of same function. In addition, .cold fragment must be reached through either (a) direct transfer, or (b) split jump table. For (a), we perform a simple fix-up. For (b), we currently mark all relevant fragments as non-simple. Therefore, there is no need to get new address for .cold fragment. This is not always the case, as function entry can be rarely executed, and is placed in .text.cold segment. Essentially we cannot tell which the source-level function entry is based on hot and cold segments, so we must treat each fragment a function on its own. Therfore, we remove the assertion that a function entry cannot be cold fragment. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128111	2022-06-18 11:39:51 -07:00
Huan Nguyen	28b1dcb122	[BOLT] Allow function fragments to point to one jump table Resolve a crash related to split functions Due to split function optimization, a function can be divided to two  fragments, and both fragments can access same jump table. This violates  the assumption that a jump table can only have one parent function,  which causes a crash during instrumentation. We want to support the case: different functions cannot access same jump tables, but different fragments of same function can! As all fragments are from same function, we point JT::Parent to one specific fragment. Right now it is the first disassembled fragment, but we can point it to the function's main fragment later. Functions are disassembled sequentially. Previously, at the end of processing a function, JT::OffsetEntries is cleared, so other fragment can no longer reuse JT::OffsetEntries. To extend the support for split function, we only clear JT::OffsetEntries after all functions are disassembled. Let say A.hot and A.cold access JT of three targets {X, Y, Z}, where X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is disassembled first, JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When A.cold is disassembled, it cannot reuse JT::OffsetEntries above due to different fragment start. A simple solution: A.hot = {X',Y',INVALID_OFFSET} A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET} We update the assertion to allow different fragments of same function to get the same JumpTable object. Potential improvements: A.hot = {X',Y',INVALID_OFFSET} A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'} The main issue is A.hot and A.cold have separate CFGs, thus jump table targets are still constrained within fragment bounds. Future improvements: A.hot = {X, Y, Z} A.cold = {X, Y, Z} Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D127924	2022-06-17 16:22:30 -07:00
Rafael Auler	9d5e6ccd9b	[BOLT] Fix for missing entry offset Temporary fix for missing entry offset when creating address translation tables (BAT) after D127935 landed. Will later work on assigning a more reasonable offset different than zero. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128092	2022-06-17 13:14:42 -07:00
Maksim Panchenko	8228c70358	[BOLT][NFCI] Refactor interface for adding basic blocks Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D127935	2022-06-16 11:51:57 -07:00
Amir Ayupov	02d510b416	[BOLT][NFC] Pass Function to BC.printInstructions in BinaryBasicBlock::dump BC::printInstruction(s) has many uses of Function ptr if it's available: # printing CFI instructions (unconditional) # printing debug line information (-print-debug-info) # printing instruction relocations (-print-relocations) Enable these uses by passing Function ptr from the primary printing entry point: BinaryBasicBlock::dump. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D126916	2022-06-13 14:26:51 -07:00
Amir Ayupov	a2c4d6d332	[BOLT][NFC] Forward declare ReorderBlocks for MSVC19 Fix bolt-x86_64-wine-msvc builder: https://lab.llvm.org/buildbot/#/builders/222/builds/1154 Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D127612	2022-06-13 10:26:58 -07:00
Vladislav Khmelevsky	6e26ffa064	[BOLT][AARCH64] Skip R_AARCH64_LD_PREL_LO19 relocation Supress failed to analyze relocations warning for R_AARCH64_LD_PREL_LO19 relocation. This relocation is mostly used to get value stored in CI and we don't process it since we are caluclating target address using the instruction value in evaluateMemOperandTarget(). Differential Revision: https://reviews.llvm.org/D127413	2022-06-13 15:40:06 +03:00
Amir Ayupov	7dee646b28	[BOLT][NFC] Move printDebugInfo out of BC::printInstruction Simplify `BinaryContext::printInstruction`. Reviewed By: ayermolo Differential Revision: https://reviews.llvm.org/D127561	2022-06-11 11:58:36 -07:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Huan Nguyen	82095bd5ed	[BOLT] Mark fragments related to split jump table as non-simple Mark fragments related to split jump table as non-simple. A function could be splitted into hot and cold fragments. A split jump table is challenging for correctly reconstructing control flow graphs, so it was marked as ignored. This update marks those fragments as non-simple, allowing them to be printed and partial control flow graph construction. Test Plan: ``` llvm-lit -a tools/bolt/test/X86/split-func-icf.s ``` This test has two functions (main, main2), each has a jump table target to the same cold portion main2.cold.1(*2). We try to print out only this cold portion. If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We verify that it can be printed. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D127464	2022-06-10 15:49:32 -07:00
Denis Revunov	0b7e8baf83	[BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG. This patch adds getFirstInstructionOffset method for BinaryFunction which is used to properly handle cases where data is at zero offset in a function. The main change is that we add basic block at first instruction offset when disassembling, which prevents assertion failures in buildCFG. Reviewed By: yota9, rafauler Differential Revision: https://reviews.llvm.org/D127111	2022-06-09 15:26:32 -07:00
Maksim Panchenko	1817642684	[BOLT] Add support for GOTPCRELX relocations The linker can convert instructions with GOTPCRELX relocations into a form that uses an absolute addressing with an immediate. BOLT needs to recognize such conversions and symbolize the immediates. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126747	2022-06-09 13:37:04 -07:00
Alexander Yermolovich	1c6dc43de9	[BOLT]DWARF] Eagerly write out loclists Taking advantage of us being able to re-write .debug_info to reduce memory footprint loclists. Writing out loc-list as they are added, similar to how we handle ranges. Collected on clang-14 trunk 4:41.20 real, 389.50 user, 59.50 sys, 0 amem, 38412532 mmem 4:30.08 real, 376.10 user, 63.75 sys, 0 amem, 38477844 mmem 4:25.58 real, 373.76 user, 54.71 sys, 0 amem, 38439660 mmem diff 4:34.66 real, 392.83 user, 57.73 sys, 0 amem, 38382560 mmem 4:35.96 real, 377.70 user, 58.62 sys, 0 amem, 38255840 mmem 4:27.61 real, 390.18 user, 57.02 sys, 0 amem, 38223224 mmem Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D126999	2022-06-08 16:52:59 -07:00
Vladislav Khmelevsky	fd9604952d	[BOLT] Set valid index for functions with profiles Some of the passes that calculates tentative layout like LongJmp and Golang are expecting that only functions with valid index will be located in hot text section. But currently functions with valid profiles and not set index are breaking this logic, to fix this we can move the hasValidProfile() condition from AssignSections pass to ReorderFunctions. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D127223	2022-06-08 14:13:12 +03:00
Fangrui Song	15d82c62dc	[MC] De-capitalize MCStreamer functions Follow-up to `c031378ce0` . The class is mostly consistent now.	2022-06-07 00:31:02 -07:00
Fangrui Song	b92436efcb	[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options	2022-06-05 13:29:49 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Fangrui Song	72f9c69421	[Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Similar to `557efc9a8b`	2022-06-03 22:04:57 -07:00
Huan Nguyen	5ac26156fe	[BOLT][NFC] Warning for deprecated option '-reorder-blocks=cache+' Emit warning when using deprecated option '-reorder-blocks=cache+'. Auto switch to option '-reorder-blocks=ext-tsp'. Test Plan: ``` ninja check-bolt ``` Added a new test cache+-deprecated.test. Run and verify that the upstream tests are passed. Reviewed By: rafauler, Amir, maksfb Differential Revision: https://reviews.llvm.org/D126722	2022-06-03 14:16:55 -07:00
spupyrev	5904836b8a	[BOLT] Cache-Aware Tail Duplication A new "cache-aware" strategy for tail duplication. Differential Revision: https://reviews.llvm.org/D123050	2022-06-03 09:08:45 -07:00
Amir Ayupov	e2142ff47c	[BOLT][NFC] Make ICP::verifyProfile static Follow LLVM style guide suggestion to avoid function definitions in anonymous namespaces: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124896	2022-06-02 19:09:29 -07:00
Maksim Panchenko	986e5dedf2	[BOLT][NFC] Fix braces in BinaryEmitter Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126844	2022-06-02 12:45:25 -07:00
Amir Ayupov	6333e5dde9	[BOLT][NFC] Use colors in CFG dumps Use color coding to distinguish nodes: - Entry nodes have bold border - Scalar (non-loopy) code is milk white - Outer loops are light yellow - Innermost loops are light blue `-print-loops` needs to be enabled to provide BinaryLoopInfo. Examples: {F23170673} {F23170680} Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126248	2022-06-02 00:27:12 -07:00
Amir Ayupov	cc23c64ff1	[BOLT][NFC] Print block instructions in dumpGraph as part of node label Reuse the option `-dot-tooltip-code` to put block instructions into the label. This way, the instructions are displayed by default when used with dot viewer. When the .dot file is used with dot2html, instructions are hidden by default, and are shown by clicking on a node. {F23169510} Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126237	2022-06-01 23:41:41 -07:00
Alexander Yermolovich	ab9a175990	[BOLT][DWARF] Fix TU Index handling for DWARF4/5 When we generate split dwarf with -fdebug-types-section we will have .debug_types.dwo sections. These go into TU Index when we run llvm-dwp. BOLT was not handling DWP input correctly with this section. Added support for handling DWP with TU Index as an input and output for DWARF4. Added support for handling DWP with TU Index as an input for DWARF5 Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D126087	2022-06-01 18:16:12 -07:00
Maksim Panchenko	0426100ff4	[BOLT][NFC] Remove unused variable Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126808	2022-06-01 13:43:10 -07:00
Maksim Panchenko	e290133c76	[BOLT] Add new class for symbolizing X86 instructions Summary: While disassembling instructions, we need to replace certain immediate operands with symbols. This symbolizing process relies on reading relocations against instructions. However, some X86 instructions can have multiple immediate operands and up to two relocations against them. Thus, correctly matching a relocation to an operand is not always possible without knowing the operand offset within the instruction. Luckily, LLVM provides an interface for passing the required info from the disassembler via a virtual MCSymbolizer class. Creating a target-specific version allows a precise matching of relocations to operands. This diff adds X86MCSymbolizer class that performs X86-specific symbolizing (currently limited to non-branch instructions). Reviewers: yota9, Amir, ayermolo, rafauler, zr33 Differential Revision: https://reviews.llvm.org/D120928	2022-05-31 17:48:19 -07:00
Denis Revunov	8579db96e8	[BOLT] [AArch64] Handle constant islands spanning multiple functions Fix BOLT's constant island mapping when a constant island marked by $d spans multiple functions. Currently, because BOLT only marks the constant island in the first function where $d is located, if the next function contains data at its start, BOLT will miss the data and try to disassemble it. This patch adds code to explicitly go through all symbols between $d and $x markers and mark their respective offsets as data, which stops BOLT from trying to disassemble data. It also adds MarkerType enum and refactors related functions. Reviewed By: yota9, rafauler Differential Revision: https://reviews.llvm.org/D126177	2022-05-31 13:51:35 -07:00
Balazs Benics	a73b50ad06	Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable" This reverts commit `3988bd1398`. Did not build on this bot: https://lab.llvm.org/buildbot#builders/215/builds/6372 /usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to ‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock>&, const std::pair<long unsigned int, std::nullptr_t>&)’ 177 \| { return bool(_M_comp(__it, __val)); }	2022-05-27 11:19:18 +02:00
Balazs Benics	3988bd1398	[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but not quite the same, such as it might have an assertion in the lambda or other constructs. Thus, I've not touched any of those, as it might change the behavior in some way. As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal Chris Lattner > LLVM intentionally has a “yes, you can apply common sense judgement to > things” policy when it comes to code review. If you are doing mechanical > patches (e.g. adopting less_first) that apply to the entire monorepo, > then you don’t need everyone in the monorepo to sign off on it. Having > some +1 validation from someone is useful, but you don’t need everyone > whose code you touch to weigh in. Differential Revision: https://reviews.llvm.org/D126068	2022-05-27 11:15:23 +02:00
Rafael Auler	c09cd64e5c	[BOLT] Fix AND evaluation bug in shrink wrapping Fix a bug where shrink-wrapping would use wrong stack offsets because the stack was being aligned with an AND instruction, hence, making its true offsets only available during runtime (we can't statically determine where are the stack elements and we must give up on this case). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126110	2022-05-26 14:59:28 -07:00
Amir Ayupov	f7581a3969	[BOLT][NFC] Use ListSeparator in BinaryFunction print methods Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126243	2022-05-24 18:29:24 -07:00
Amir Ayupov	69f87b6c29	[BOLT][NFC] Customize endline character for printInstruction(s) This would be used in `BF::dumpGraph` to dump left-justified text. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126232	2022-05-24 18:26:12 -07:00
Amir Ayupov	5d8247d4c7	[BOLT][NFC] Use for_each to simplify printLoopInfo Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D126242	2022-05-24 18:05:43 -07:00
Amir Ayupov	c907d6e0e9	[BOLT][NFC] Suppress unused variable warnings Addresses the warnings emitted by Apple Clang 13.1.6 (Xcode 13.3.1). Tip @tschuett issue #55404. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125733	2022-05-17 14:30:23 -07:00
Amir Ayupov	a7b69dbdd1	[BOLT][NFC] Move BinaryDominatorTree out of BinaryLoop header Split up the BinaryLoop header and move BinaryDominatorTree into its own header, preparing it for a standalone use. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125664	2022-05-17 14:20:11 -07:00
Amir Ayupov	bdba3d091c	[BOLT][CMAKE] Fix DYLIB build Move BOLT libraries out of `LLVM_LINK_COMPONENTS` to `target_link_libraries`. Addresses issue #55432. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125568	2022-05-13 13:27:21 -07:00
Amir Ayupov	253b8f0abd	[BOLT][NFC] Use refs for loop variables to avoid copies Addresses warnings when built with Apple Clang. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D125483	2022-05-13 20:18:29 +01:00
Amir Ayupov	139744ac53	[BOLT][NFC] Suppress unused variable warnings Address warnings in Release build without assertions. Tip @tschuett for reporting the issue #55404. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125475	2022-05-13 20:10:19 +01:00
Amir Ayupov	d63c5a38fe	[BOLT][NFC] Use BitVector::set_bits Refactor and use `set_bits` BitVector interface. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125374	2022-05-11 16:23:44 -07:00
Amir Ayupov	8cb7a873ab	[BOLT][NFC] Add MCPlus::primeOperands iterator_range Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D125397	2022-05-11 09:34:51 -07:00
Amir Ayupov	c2d40f1dfb	[BOLT] Add icp-inline option Add an option to only peel ICP targets that can be subsequently inlined. Yet there's no guarantee that they will be inlined. The mode is independent from the heuristic used to choose ICP targets: by exec count, mispredictions, or memory profile. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124900	2022-05-11 03:21:24 -07:00
Alexander Yermolovich	3abb68a626	[BOLT][DWARF] Fix assert for split dwarf. Fixing a small bug where it would assert if CU does not modify .debug_addr section. Differential Revision: https://reviews.llvm.org/D125181	2022-05-08 19:18:17 -07:00
Alexander Yermolovich	ba1ac98c62	[BOLT][DWARF] Add version 5 split dwarf support Added support for DWARF5 Split Dwarf. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D122988	2022-05-05 14:59:05 -07:00
Rahman Lavaee	733dc3e50b	[BOLT] Report per-section hotness in bolt-heatmap. This patch adds a new feature to bolt heatmap to print the hotness of each section in terms of the percentage of samples within that section. Sample output generated for the clang binary: Section Name, Begin Address, End Address, Percentage Hotness .text, 0x1a7b9b0, 0x20a2cc0, 1.4709 .init, 0x20a2cc0, 0x20a2ce1, 0.0001 .fini, 0x20a2ce4, 0x20a2cf2, 0.0000 .text.unlikely, 0x20a2d00, 0x431990c, 0.3061 .text.hot, 0x4319910, 0x4bc6927, 97.2197 .text.startup, 0x4bc6930, 0x4c10c89, 0.0058 .plt, 0x4c10c90, 0x4c12010, 0.9974 Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124412	2022-05-05 11:37:46 -07:00
Amir Ayupov	f8d2d8b587	[BOLT][NFC] Move getInliningInfo out of Inliner class `getInliningInfo` is useful in other passes that need to check inlining eligibility for some function. Move the declaration and InliningInfo definition out of Inliner class. Prepare for subsequent use in ICP. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124899	2022-05-04 14:08:06 -07:00
Amir Ayupov	2ad1c7540e	[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite Minor refactoring. NFC. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124898	2022-05-04 14:06:53 -07:00
Amir Ayupov	68c7299f16	[BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior Caching behavior of `getAliases` causes a failure in unit tests where two MCPlusBuilder objects are created corresponding to AArch64 and X86: the alias cache is created for AArch64 but then used for X86. https://lab.llvm.org/staging/#/builders/211/builds/126 The issue only affects unit tests as we only construct one MCPlusBuilder for ELF binary. Resolve the issue by moving alias bitvectors to MCPlusBuilder object. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D124942	2022-05-04 12:53:26 -07:00
Amir Ayupov	60957a5a08	[BOLT] Fix ICPJumpTablesTopN option use Fix non-sensical `opts::ICPJumpTablesTopN != 0 ? opts::ICPTopN : opts::ICPTopN`. Refactor/simplify another similar assignment. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124880	2022-05-03 19:34:10 -07:00
Amir Ayupov	c3d5372093	[BOLT][NFC] Make ICP options naming uniform Rename `opts::IndirectCallPromotion` to `opts::ICP`, making option naming uniform and easier to follow. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124879	2022-05-03 19:32:45 -07:00
Amir Ayupov	d0b1c98c96	[BOLT][NFC] ICP: simplify findTargetsIndex Unnest lambda and use `llvm::is_contained`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124877	2022-05-03 19:31:20 -07:00
Amir Ayupov	ec02227bf7	[BOLT][NFC] Refactor ICP::findCallTargetSymbols Reduce nesting making it easier to read. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124876	2022-05-03 19:29:22 -07:00
Paul Kirth	625e0e611b	[BOLT] [NFC] Remove unused variable This patch fixes a warning from -Wunused-but-set-variable MismatchedBranches are counted, but are never reported. Since evaluateProfileData() should already identify and report these cases, we can safely remove the unused variable. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124588	2022-05-03 15:15:56 +00:00
Amir Ayupov	64421e191b	[BOLT][NFC] Reduce Target/{AArch64,X86} dependencies We don't actually depend on entire X86/AArch64 components that pull in CodeGen, SelectionDAG etc., just the Desc part with opcode and other definitions. Note that it doesn't decouple BOLT from these components - we still pull in X86 and AArch64 from top-level llvm-bolt dependencies as we use assembler and disassembler. It's difficult to reduce these as this requires non-trivial changes to X86/AArch64 components themselves (e.g. moving out AsmPrinter). Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124206	2022-04-29 20:37:53 -07:00
Paul Kirth	a0b8ab1ba3	[BOLT][NFC] Fix warning for unqualified call to std::move Fixes warning from RetpolineInsertion.cpp:171:44: warning: unqualified call to std::move [-Wunqualified-std-cast-call] Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D124482	2022-04-26 23:18:20 +00:00
Rahman Lavaee	e59e580116	[BOLT] Refactor DataAggregator::printLBRHeatMap. This also fixes some logs that were impacted by D123067. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D124281	2022-04-25 11:39:44 -07:00
Alexander Yermolovich	014cd37f51	[BOLT][DWARF] Implement monolithic DWARF5 Added implementation to support DWARF5 in monolithic mode. Next step DWARF5 split dwarf support. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D121876	2022-04-21 16:02:23 -07:00
Alexey Moksyakov	48e894a536	[BOLT] Add R_AARCH64_PREL16/32/64 relocations support Reviewed By: yota9, rafauler Differential Revision: https://reviews.llvm.org/D122294	2022-04-21 13:52:47 +03:00
Vladislav Khmelevsky	63686af1e1	[BOLT] Fix build with GCC 7.3.0 The gcc 7.3.0 version raises "could not covert" error without std::move used explicitly. Differential Revision: https://reviews.llvm.org/D124009	2022-04-21 13:47:58 +03:00
Maksim Panchenko	76981fbcf6	[BOLT] Add fuzzy function name matching for LLVM LTO LLVM with LTO can generate function names in the form func.llvm.<number>, where <number> could vary based on the compilation environment. As a result, if a profiled binary originated from a different build than a corresponding binary used for BOLT optimization, then profiles for such LTO functions will be ignored. To fix the problem, use "fuzzy" matching with "func.llvm.*" form. Reviewed By: yota9, Amir Differential Revision: https://reviews.llvm.org/D124117	2022-04-20 17:00:21 -07:00
Alexander Yermolovich	7d6716786f	[BOLT][DWARF] Handle Error returned by visitLocationList Looks like implementation in llvm changed, and now we need to process error being returned. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D124133	2022-04-20 16:40:46 -07:00
Amir Ayupov	4f277f28ab	[BOLT] Check if LLVM_REVISION is defined Handle the case where LLVM_REVISION is undefined (due to LLVM_APPEND_VC_REV=OFF or otherwise) by setting "<unknown>" value as before D123549. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D123852	2022-04-15 06:33:14 -07:00
Amir Ayupov	2a9386726b	[BOLT][NFC] Use LLVM_REVISION instead of BOLT_VERSION_STRING Remove duplicate version string identification Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D123549	2022-04-14 19:16:35 -07:00
Maksim Panchenko	77b75ca53f	[BOLT][perf2bolt] Fix base address calculation for shared objects When processing profile data for shared object or PIE, perf2bolt needs to calculate base address of the binary based on the map info reported by the perf tool. When the mapping data provided is for the second (or any other than the first) segment and the segment's file offset does not match its memory offset, perf2bolt uses wrong assumption about the binary base address. Add a function to calculate binary base address using the reported memory mapping and use the returned base for further address adjustments. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D123755	2022-04-14 10:29:53 -07:00
Vladislav Khmelevsky	2f98c5febc	[BOLT] Update skipRelocation for aarch64 The ld might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP, add the new case to the skipRelocation for aarch64. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D123334	2022-04-13 22:54:06 +03:00
Maksim Panchenko	36cb736665	[BOLT] Ignore PC-relative relocations from data to data BOLT expects PC-relative relocations in data sections to reference code and the relocated data to form a jump table. However, there are cases where PC-relative addressing is used for data-to-data references (e.g. clang-15 can generate such code). BOLT should recognize and ignore such relocations. Otherwise, they will be considered relocations not claimed by any jump table and cause a failure in the strict mode. Reviewed By: yota9, Amir Differential Revision: https://reviews.llvm.org/D123650	2022-04-13 11:13:51 -07:00
Amir Ayupov	bad3798113	[BOLT] Fix data race in shortenInstructions Address ThreadSanitizer warning Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D121338	2022-04-13 11:10:36 -07:00
Rahman Lavaee	0c13d97e2b	Allow building heatmaps from basic sampled events with `-nl`. I find that this is useful for finding event hotspots. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D123067	2022-04-11 15:04:44 -07:00
Amir Ayupov	9b02dc631d	[BOLT] Check MCContext errors Abort on emission errors to prevent a malformed binary being written. Example: ``` <unknown>:0: error: Undefined temporary symbol .Ltmp26310 <unknown>:0: error: Undefined temporary symbol .Ltmp26311 <unknown>:0: error: Undefined temporary symbol .Ltmp26312 <unknown>:0: error: Undefined temporary symbol .Ltmp26313 <unknown>:0: error: Undefined temporary symbol .Ltmp26314 <unknown>:0: error: Undefined temporary symbol .Ltmp26315 BOLT-ERROR: Emission failed. ``` Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D123263	2022-04-08 21:08:39 -07:00
Argyrios Kyrtzidis	330268ba34	[Support/Hash functions] Change the `final()` and `result()` of the hashing functions to return an array of bytes Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`: * When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural * Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef` As part of this patch also: * Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes. * Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API. Differential Revision: https://reviews.llvm.org/D123100	2022-04-05 21:38:06 -07:00
Amir Ayupov	f99398fe0e	[BOLT][NFC] Move isADD64rr and isADDri out of MCPlusBuilder class Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D123077	2022-04-05 14:32:07 -07:00
Vladislav Khmelevsky	2e51a32219	[BOLT] Check for !isTailCall in isUnconditionalBranch Add !isTailCall in isUnconditionalBranch check in order to sync the x86 and aarch64 and fix the fixDoubleJumps pass on aarch64. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D122929	2022-04-05 23:39:34 +03:00
Vladislav Khmelevsky	4956e0e197	[BOLT] Fix plt relocations symbol match The bfd linker adds the symbol versioning string to the symbol name in symtab. Skip the versioning part in order to find the registered PLT function. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D122039	2022-04-05 15:57:26 +03:00
Amir Ayupov	686406a006	[BOLT][NFC] Use X86 mnemonic checks Remove switches in X86MCPlusBuilder.cpp, use mnemonic checks instead Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D122853	2022-04-04 14:05:46 -07:00
Vladislav Khmelevsky	3b1314f4de	[BOLT] AArch64: Read all static relocations Read static relocs on the same address, as dynamic in order to update constant island data address properly. Differential Revision: https://reviews.llvm.org/D122100	2022-04-03 19:03:35 +03:00
Vladislav Khmelevsky	4c14519ecb	[BOLT] LongJmp: Check for shouldEmit Check that the function will be emitted in the final binary. Preserving old function address is needed in case it is PLT trampiline, that is currently not moved by the BOLT. Differential Revision: https://reviews.llvm.org/D122098	2022-03-31 22:33:09 +03:00
Vladislav Khmelevsky	fed958c6cc	[BOLT] AArch64: Emit text objects BOLT treats aarch64 objects located in text as empty functions with contant islands. Emit them with at least 8-byte alignment to the new text section. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D122097	2022-03-31 22:28:50 +03:00
Amir Ayupov	c31af7cfe3	[MC][BOLT] Add setter for AllowAtInName Use the setter in BOLT to allow printing names with variant kind in the name (e.g. "func@PLT"). Fixes BOLT buildbot tests that broke after D122516: https://lab.llvm.org/buildbot/#/builders/215/builds/3595 Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D122694	2022-03-30 13:04:28 -07:00
Vladislav Khmelevsky	af9bdcfc46	[BOLT] Align constant islands to 8 bytes AArch64 requires CI to be aligned to 8 bytes due to access instructions restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes. Differential Revision: https://reviews.llvm.org/D122065	2022-03-27 22:30:42 +03:00
spupyrev	4609f60ebc	[BOLT] Avoid pointless loop rotation It seems the earlier implementation does not follow the description in LoopRotationPass.h: It rotates loops even if they are already laid out correctly. The diff adjusts the behaviour. Given that the impact of LoopInversionPass is minor, this change won't yield significant perf differences. Tested on clang-10: there seems to be a 0.1%-0.3% cpu win and a small reduction of branch misses. Before: BOLT-INFO: 120 Functions were reordered by LoopInversionPass After: BOLT-INFO: 79 Functions were reordered by LoopInversionPass Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D121921	2022-03-22 12:42:42 -07:00
Vladislav Khmelevsky	5be5d0f56e	[BOLT] LongJmp speedup refactoring Run tentativeLayoutRelocMode twice only if UseOldText option was passed. Refactor BF loop to break on condtition met. Differential Revision: https://reviews.llvm.org/D121825	2022-03-18 16:16:47 +03:00
Amir Ayupov	42e8e00189	[BOLT][NFC] Use X86 mnemonic tables Remove tables from X86MCPlusBuilder, make use of llvm::X86 mnemonic tables. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D121573	2022-03-18 01:52:11 -07:00
Amir Ayupov	dc1cf838a5	[BOLT] Strip redundant AdSize override prefix Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it in BOLT explicitly (-x86-strip-redundant-adsize, on by default). Test Plan: `bin/llvm-lit -a bolt/test/X86/addr32.s` Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D120975	2022-03-16 09:38:17 -07:00
Amir Ayupov	698127df51	[BOLT][NFC] Move isMOVSX64rm32 out of MCPlusBuilder Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D121669	2022-03-16 08:18:56 -07:00
Vladislav Khmelevsky	62a289d85c	[BOLT] LongJmp: Fix hot text section alignment The BinaryEmitter uses opts::AlignText value to align the hot text section. Also check that the opts::AlignText is at least equal opts::AlignFunctions for the same reason, as described in D121392. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D121728	2022-03-16 15:57:46 +03:00
Maksim Panchenko	57f03db195	[BOLT][NFC] Remove unused function Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D121729	2022-03-15 12:39:14 -07:00
Vladislav Khmelevsky	8ab69baad5	[BOLT] Set cold sections alignment explicitly The cold text section alignment is set using the maximum alignment value passed to the emitCodeAlignment. In order to calculate tentetive layout right we will set the minimum alignment of such sections to the maximum possible function alignment explicitly. Differential Revision: https://reviews.llvm.org/D121392	2022-03-15 22:12:17 +03:00
Amir Ayupov	5790441c45	[BOLT][NFC] Use getShortOpcodeArith in X86MCPlusBuilder Unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in X86MCPlusBuilder.cpp. Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D121404	2022-03-12 09:07:28 -08:00
Elvina Yakubova	db65429db5	[BOLT] Divide RegularPageSize for X86 and AArch64 cases For AArch64 in some cases/some distributions ld uses 64K alignment of LOAD segments by default. Reviewed By: yota9, maksfb Differential Revision: https://reviews.llvm.org/D119267	2022-03-10 23:09:50 +03:00
Vladislav Khmelevsky	04b87cf0e7	[BOLT] LongJmp: Use per-function alignment values The per-function alignment values must be used in order to create tentative layout. Differential Revision: https://reviews.llvm.org/D121298	2022-03-10 19:48:48 +03:00

1 2 3 4 5 ...

356 Commits