llvm-project

Commit Graph

Author	SHA1	Message	Date
Guillaume Chatelet	e647b4f519	[reland][Alignment][NFC] Use the Align type in MCSection Differential Revision: https://reviews.llvm.org/D138653	2022-11-24 13:19:18 +00:00
Guillaume Chatelet	3467f9c7d6	Revert D138653 [Alignment][NFC] Use the Align type in MCSection" This breaks the bolt project. This reverts commit `409f0dc4a4`.	2022-11-24 12:42:30 +00:00
Guillaume Chatelet	409f0dc4a4	[Alignment][NFC] Use the Align type in MCSection Differential Revision: https://reviews.llvm.org/D138653	2022-11-24 12:32:58 +00:00
Jez Ng	d4bcb45db7	[MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible This reverts commit `d941d59783`. Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 17:24:19 -04:00
Jez Ng	d941d59783	Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible" This reverts commit `ef501bf85d`.	2022-06-12 10:47:08 -04:00
Jez Ng	ef501bf85d	[MC] Omit DWARF unwind info if compact unwind is present where eligible Previously, omitting unnecessary DWARF unwinds was only done in two cases: * For Darwin + aarch64, if no DWARF unwind info is needed for all the functions in a TU, then the `__eh_frame` section would be omitted entirely. If any one function needed DWARF unwind, then MC would emit DWARF unwind entries for all the functions in the TU. * For watchOS, MC would omit DWARF unwind on a per-function basis, as long as compact unwind was available for that function. This diff makes it so that we omit DWARF unwind on a per-function basis for Darwin + aarch64 as well. In addition, we introduce the flag `--emit-dwarf-unwind=` which can toggle between `always`, `no-compact-unwind` (only emit DWARF when CU cannot be emitted for a given function), and the target platform `default`. `no-compact-unwind` is particularly useful for newer x86_64 platforms: we don't want to omit DWARF unwind for x86_64 in general due to possible backwards compat issues, but we should make it possible for people to opt into this behavior if they are only targeting newer platforms. Motivation: I'm working on adding support for `__eh_frame` to LLD, but I'm concerned that we would suffer a perf hit. Processing compact unwind is already expensive, and that's a simpler format than EH frames. Given that MC currently produces one EH frame entry for every compact unwind entry, I don't think processing them will be cheap. I tried to do something clever on LLD's end to drop the unnecessary EH frames at parse time, but this made the code significantly more complex. So I'm looking at fixing this at the MC level instead. Addendum: It turns out that there was a latent bug in the X86 backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is not too surprising given that this combination has not been heretofore used. For functions that have unwind info that cannot be encoded with CU, MC would end up dropping both the compact unwind entry (OK; existing behavior) as well as the DWARF entries (not OK). This diff fixes things so that we emit the DWARF entry, as well as a CU entry with encoding `UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry is necessary, this was the simplest fix. ld64 seems to be able to handle both the absence and presence of this CU entry. Ultimately ld64 (and LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there is no impact to the final binary size. Reviewed By: davide, lhames Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 10:03:56 -04:00
Dávid Bolvanský	44572be295	[NFCI] Fix set-but-unused warning in X86AsmBackend.cpp	2022-03-24 08:13:28 +01:00
Craig Topper	6cfe41dcc8	[X86] Rename more target feature related things consistency. NFC -Rename ModeBit to IsBit to match X86Subtarget. -Rename FeatureLAHFSAHF to FeatureLAFHSAFH64 to match X86Subtarget. -Use consistent capitalization Reviewed By: skan Differential Revision: https://reviews.llvm.org/D121975	2022-03-17 22:27:17 -07:00
Amir Ayupov	999fa9f687	[X86][NFC] Move table from getRelaxedOpcodeArith into its own class Move out the table and prepare the code to reuse it for the reverse mapping. Follows the example of memory folding/unfolding tables in X86InstrFoldTables.cpp Preparation step to unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in BOLT X86MCPlusBuilder.cpp. Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html Reviewed By: skan, MaskRay Differential Revision: https://reviews.llvm.org/D121402	2022-03-12 09:06:17 -08:00
Kazu Hirata	fd7d40640d	[llvm] Use range-based for loops (NFC)	2021-11-28 18:14:49 -08:00
Quinn Pham	b11c66accf	[NFC] Inclusive language: rename master flag to main flag [NFC] As part of using inclusive language within the llvm project, this patch renames master flag to main flag in these comments. Reviewed By: ZarkoCA Differential Revision: https://reviews.llvm.org/D114090	2021-11-25 15:16:11 -06:00
Kazu Hirata	d000431fb2	[X86] Remove X86ELFObjectWriter in X86AsmBackend.cpp (NFC) Note that the identically named class is defined in an anonymous namespace in X86ELFObjectWriter.cpp.	2021-11-01 08:31:54 -07:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h\|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Peter Smith	e63455d5e0	[MC] Use local MCSubtargetInfo in writeNops On some architectures such as Arm and X86 the encoding for a nop may change depending on the subtarget in operation at the time of encoding. This change replaces the per module MCSubtargetInfo retained by the targets AsmBackend in favour of passing through the local MCSubtargetInfo in operation at the time. On Arm using the architectural NOP instruction can have a performance benefit on some implementations. For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to limit the chances of this causing problems in the future. I've not done this for other targets such as X86 as there is more frequent use of the MCSubtargetInfo and it looks to be for stable properties that we would not expect to vary per function. This change required threading STI through MCNopsFragment and MCBoundaryAlignFragment. I've attempted to take into account the in tree experimental backends. Differential Revision: https://reviews.llvm.org/D45962	2021-09-07 15:46:19 +01:00
Simon Pilgrim	e78bf49a58	[X86] Rename Subtarget Tuning Feature Flag Prefix. NFC. As suggested on D107370, this patch renames the tuning feature flags to start with 'Tuning' instead of 'Feature'. Differential Revision: https://reviews.llvm.org/D107459	2021-08-05 13:09:23 +01:00
Harald van Dijk	75521bd9d8	[X32] Add Triple::isX32(), use it. So far, support for x86_64-linux-gnux32 has been handled by explicit comparisons of Triple.getEnvironment() to GNUX32. This worked as long as x86_64-linux-gnux32 was the only X32 environment to worry about, but we now have x86_64-linux-muslx32 as well. To support this, this change adds an isX32() function and uses it. It replaces all checks for GNUX32 or MuslX32 by isX32(), except for the following: - Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat GNUX32 and MuslX32 differently. - computeTargetTriple() needs to be able to transform triples to add or remove X32 from the environment and needs to map GNU to GNUX32, and Musl to MuslX32. - getMultiarchTriple() completely lacks any Musl support and retains the explicit check for GNUX32 as it can only return x86_64-linux-gnux32. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D103777	2021-06-07 20:48:39 +01:00
Tim Northover	ba1509da7b	Recommit X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86). Recommited with a fix for unwind info when i386 pc-rel thunks are adjacent to a prologue.	2021-05-18 15:19:05 +01:00
Mitch Phillips	6791a6b309	Revert "X86: support Swift Async context" This reverts commit `747e5cfb9f`. Reason: New frame layout broke the sanitizer unwinder. Not clear why, but seems like some of the changes aren't always guarded by Swyft checks. See https://reviews.llvm.org/rG747e5cfb9f5d944b47fe014925b0d5dc2fda74d7 for more information.	2021-05-17 12:44:57 -07:00
Tim Northover	747e5cfb9f	X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86).	2021-05-17 11:56:16 +01:00
Fangrui Song	4f7562d52f	[MC][X86] Support .reloc , BFD_RELOC_{NONE,8,16,32,64}, The names are unfortunate, but BFD_RELOC_NONE provides a generic way indicating a dependency between two sections, which is useful for ld --gc-sections. See https://sourceware.org/bugzilla/show_bug.cgi?id=27530	2021-03-05 21:31:05 -08:00
Bill Wendling	a9f9ceb35f	[X86] Use correct padding when in 16-bit mode In 16-bit mode, some of the nop patterns used in 32-bit mode can end up mangling other instructions. For instance, an aligned "movz" instruction may have the 0x66 and 0x67 prefixes omitted, because the nop that's used messes things up. xorl %ebx, %ebx .p2align 4, 0x90 movzbl (%esi,%ebx), %ecx Use instead nop patterns we know 16-bit mode can handle. Differential Revision: https://reviews.llvm.org/D97268	2021-02-25 20:05:45 -08:00
Fangrui Song	a048ce13e3	[X86] Default to -x86-pad-for-align=false to drop assembler difference with or w/o -g Fix PR48742: the D75203 assembler optimization locates MCRelaxableFragment's within two MCSymbol's and relaxes some MCRelaxableFragment's to reduce the size of a MCAlignFragment. A -g build has more MCSymbol's and therefore may have different assembler output (e.g. a MCRelaxableFragment (jmp) may have 5 bytes with -O1 while 2 bytes with -O1 -g). `.p2align 4, 0x90` is common due to loops. For a larger program, with a lot of temporary labels, the assembly output difference is somewhat destined. The cost seems to overweigh the benefits so we default to -x86-pad-for-align=false until the heuristic is improved. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D94542	2021-01-16 16:39:54 -08:00
Simon Pilgrim	af71298648	[X86] Cleanup/add namespace closure comments. NFCI. Fixes some clang-tidy llvm-namespace-comment warnings.	2020-09-22 15:06:58 +01:00
Jian Cai	c6334db577	[X86] support .nops directive Add support of .nops on X86. This addresses llvm.org/PR45788. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D82826	2020-08-03 11:50:56 -07:00
Craig Topper	0aad82943a	[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment. The default CPU used by llvm-mc doesn't have the NOPL feature, but if we know we're compiling in 64-bit mode we should be able to use nopl.	2020-07-01 23:59:01 -07:00
Craig Topper	c420762172	Revert "[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment." Looks like lld tests need updates too This reverts commit `3367e9dac5`.	2020-07-01 15:20:53 -07:00
Craig Topper	3367e9dac5	[X86] Enable multibyte NOPs in 64-bit mode for padding/alignment. The default CPU used by llvm-mc doesn't have the NOPL feature, but if we know we're compiling in 64-bit mode we should be able to use nopl.	2020-07-01 10:57:24 -07:00
Fangrui Song	0f6bd9cda6	[MC] Drop unneeded std::abs for DW_def_cfa_offset in DarwinX86AsmBackend::generateCompactUnwindEncoding This clean-up is available after double negation bugs are fixed.	2020-05-22 21:12:47 -07:00
Shengchen Kan	99ac9ce701	[NFC] Clean up in MCObjectStreamer and X86AsmBackend	2020-05-09 12:50:44 +08:00
Shengchen Kan	8bb059ab63	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the thrid argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Shengchen Kan	0d3149f431	[MC][X86] Disable branch align in non-text section Summary: The instruction in non-text section can not be executed, so they will not affect performance. In addition, their encoding values are treated as data, so we should not touch them. Reviewers: MaskRay, reames, LuoYuanke, jyknight Reviewed By: MaskRay Subscribers: annita.zhang, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77971	2020-04-18 14:41:25 +08:00
Shengchen Kan	2477cec2ac	[NFC][X86] Refine code in X86AsmBackend Summary: Move code to a better place, rename function, etc Tags: #llvm Differential Revision: https://reviews.llvm.org/D77778	2020-04-09 21:31:52 +08:00
Shengchen Kan	916044d819	[X86][MC] Support enhanced relaxation for branch align Summary: Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization. The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable". Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76286	2020-04-08 19:08:19 +08:00
Shengchen Kan	9f92d4612f	Revert "[NFC][X86] Refine code in X86AsmBackend" This reverts commit `a157cde0ac`.	2020-04-02 15:57:06 +08:00
Shengchen Kan	a157cde0ac	[NFC][X86] Refine code in X86AsmBackend Replace pattern getContents().size with universe function call	2020-04-02 15:41:10 +08:00
Shengchen Kan	d0efd7bfcf	[X86][MC] Disable Prefix padding after hardcode/prefix Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight, eli.friedman Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76475	2020-04-01 09:49:52 +08:00
Fangrui Song	152d14da64	[MC][X86] Make .reloc support arbitrary relocation types Generalizes D62014 (R_386_NONE/R_X86_64_NONE). Unlike ARM (D76746) and AArch64 (D76754), we cannot delete FK_NONE from getFixupKindSize because FK_NONE is still used by R_386_TLS_DESC_CALL/R_X86_64_TLSDESC_CALL.	2020-03-27 13:33:15 -07:00
Shengchen Kan	1fb4f99a21	[X86][MC] Fix the bug for prefix padding support Summary: There is a tiny logic error of D75300, making branch is not correctly aligned with option -x86-pad-max-prefix-size Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76285	2020-03-27 14:16:09 +08:00
Shengchen Kan	39bcc76a92	[X86] Disable nop padding before instruction following hardcode Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: LuoYuanke Subscribers: annita.zhang, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D76176	2020-03-17 09:45:12 +08:00
Shengchen Kan	b1a7a245ec	[NFC][MC] Rename alignBranches* to emitInstruction* alignBranches is X86 specific, change the name in a more general one since other target can do some state chang before and after emitting the instruction.	2020-03-16 17:13:14 +08:00
Philip Reames	a79863f2f7	Support prefix padding for alignment purposes (Relaxable instructions only) Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straight forward. For the moment, this patch adds the functional support and some basic testing there of, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not. Differential Revision: https://reviews.llvm.org/D75300	2020-03-15 19:53:41 -07:00
Shengchen Kan	e6f1dd40bd	[X86] Disable nop padding before instruction following a prefix Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: LuoYuanke Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76052	2020-03-14 13:15:30 +08:00
Simon Pilgrim	1e686d2689	[X86] Add FeatureFast7ByteNOP flag Lets us remove another SLM proc family flag usage. This is NFC, but we should probably check whether atom/glm/knl? should be using this flag as well...	2020-03-12 13:06:43 +00:00
Shengchen Kan	3a503ce663	[X86] Reduce the number of emitted fragments due to branch align Summary: Currently, a BoundaryAlign fragment may be inserted after the branch that needs to be aligned to truncate the current fragment, this fragment is unused at most of time. To avoid that, we can insert a new empty Data fragment instead. Non-relaxable instruction is usually emitted into Data fragment, so the inserted empty Data fragment will be reused at a high possibility. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames, LuoYuanke Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75438	2020-03-12 15:37:35 +08:00
Philip Reames	c93f1046fc	[X86/MC] Factor out common code [NFC]	2020-03-05 09:43:41 -08:00
David Blaikie	7a6878a72e	X86AsmBackend.cpp: #ifndef NDEBUG some only-used-in-asserts variables to fix the -Werror non-asserts build	2020-03-04 22:36:24 -08:00
Philip Reames	c94a4133bb	Consistently capitalize a variable [NFC] One instance in a copy paste was pointed out in a review, fix all instances at once.	2020-03-04 20:00:08 -08:00
Shengchen Kan	b3722dea3b	[X86] Add a private member function determinePaddingPrefix for X86AsmBackend Summary: X86 can reduce the bytes of NOP by padding instructions with prefixes to get a better peformance in some cases. So a private member function `determinePaddingPrefix` is added to determine which prefix is the most suitable. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75357	2020-03-05 09:26:33 +08:00
Philip Reames	f708c823f0	[X86] Relax existing instructions to reduce the number of nops needed for alignment purposes If we have an explicit align directive, we currently default to emitting nops to fill the space. As discussed in the context of the prefix padding work for branch alignment (D72225), we're allowed to play other tricks such as extending the size of previous instructions instead. This patch will convert near jumps to far jumps if doing so decreases the number of bytes of nops needed for a following align. It does so as a post-pass after relaxation is complete. It intentionally works without moving any labels or doing anything which might require another round of relaxation. The point of this patch is mainly to mock out the approach. The optimization implemented is real, and possibly useful, but the main point is to demonstrate an approach for implementing such "pad previous instruction" approaches. The key notion in this patch is to treat padding previous instructions as an optional optimization, not as a core part of relaxation. The benefit to this is that we avoid the potential concern about increasing the distance between two labels and thus causing further potentially non-local code grown due to relaxation. The downside is that we may miss some opportunities to avoid nops. For the moment, this patch only implements a small set of existing relaxations.. Assuming the approach is satisfactory, I plan to extend this to a broader set of instructions where there are obvious "relaxations" which are roughly performance equivalent. Note that this patch doesn't change which instructions are relaxable. We may wish to explore that separately to increase optimization opportunity, but I figured that deserved it's own separate discussion. There are possible downsides to this optimization (and all "pad previous instruction" variants). The major two are potentially increasing instruction fetch and perturbing uop caching. (i.e. the usual alignment risks) Specifically: * If we pad an instruction such that it crosses a fetch window (16 bytes on modern X86-64), we may cause the decoder to have to trigger a fetch it wouldn't have otherwise. This can effect both decode speed, and icache pressure. * Intel's uop caching have particular restrictions on instruction combinations which can fit in a particular way. By moving around instructions, we can both cause misses an change misses into hits. Many of the most painful cases are around branch density, so I don't expect this to be too bad on the whole. On the whole, I expect to see small swings (i.e. the typical alignment change problem), but nothing major or systematic in either direction. Differential Revision: https://reviews.llvm.org/D75203	2020-03-04 16:52:35 -08:00
Shengchen Kan	af57b139a0	Temporarily Revert [X86] Not track size of the boudaryalign fragment during the layout Summary: This reverts commit `2ac19feb15`. This commit causes some test cases to run fail when branch is aligned.	2020-03-03 11:15:56 +08:00

1 2 3 4 5

201 Commits