llvm-project

Commit Graph

Author	SHA1	Message	Date
ziliangzl	5b1f7d5875	[VENTUS][fix]Fix double add and mul operation Passed test_geometrics in CTS.	2024-07-24 10:32:18 +08:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit `17db0de330`. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Fangrui Song	4e62072ca1	[Passes] llvm::Optional => std::optional	2022-12-04 20:44:52 +00:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Kazu Hirata	998960ee1f	[CodeGen] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:08 -08:00
Jan Svoboda	abf0c6c0c0	Use CTAD on llvm::SaveAndRestore Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D139229	2022-12-02 15:36:12 -08:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit `6d12599fd4`.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameInde(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Freddy Ye	89f36dd8f3	[X86] Add ExpandLargeFpConvert Pass and enable for X86 As stated in https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528, this implementation is very similar to ExpandLargeDivRem, which expands ‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions with a bitwidth above a threshold into auto-generated functions. This is useful for targets like x86_64 that cannot lower fp convertions with more than 128 bits. The expanded nodes are referring from the IR generated by `compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`, and etc. Corner cases: 1. For fp16: as there is no related builtins added in compliler-rt. So I mainly utilized the fp32 <-> fp16 lib calls to implement. 2. For fp80: as this pass is soft fp emulation and no fp80 instructions can help in this problem. I recommend users to deprecate this usage. For now, the implementation uses fp128 as the temporary conversion type and inserts fptrunc/ext at top/end of the function. 3. For bf16: as clang FE currently doesn't support bf16 algorithm operations (convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for now. 4. For unsigned FPToI: since both default hardware behaviors and libgcc are ignoring "returns 0 for negative input" spec. This pass follows this old way to ignore unsigned FPToI. See this example: https://gcc.godbolt.org/z/bnv3jqW1M The end-to-end tests are uploaded at https://reviews.llvm.org/D138261 Reviewed By: LuoYuanke, mgehre-amd Differential Revision: https://reviews.llvm.org/D137241	2022-12-01 13:47:43 +08:00
Marco Elver	b95646fe70	Revert "Use-after-return sanitizer binary metadata" This reverts commit `d3c851d3fc`. Some bots broke: - https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8796062278266465473/overview - https://lab.llvm.org/buildbot/#/builders/124/builds/5759/steps/7/logs/stdio	2022-11-30 23:35:50 +01:00
Dmitry Vyukov	d3c851d3fc	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-11-30 14:50:22 +01:00
Dmitry Vyukov	0aedf9d714	Revert "Use-after-return sanitizer binary metadata" This reverts commit `e6aea4a5db`. Broke tests: https://lab.llvm.org/buildbot/#/builders/16/builds/38992	2022-11-30 09:38:56 +01:00
Dmitry Vyukov	e6aea4a5db	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-11-30 09:14:19 +01:00
Kazu Hirata	dbb1130966	Revert "Use-after-return sanitizer binary metadata" This reverts commit `a1255dc467`. This patch results in: llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member named 'size' in 'llvm::MDTuple'	2022-11-29 09:04:00 -08:00
Dmitry Vyukov	a1255dc467	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-11-29 17:37:36 +01:00
Kazu Hirata	d6f0ab47a7	[CodeGen] Use std::optional in TargetPassConfig.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-26 15:12:11 -08:00
Matthias Gehre	2090e85fee	[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64 This adds the ExpandLargeDivRem to the default pass pipeline. The limit at which it expands div/rem instructions is configured via a new TargetTransformInfo hook (default: no expansion) X86, Arm and AArch64 backends implement this hook to expand div/rem instructions with more than 128 bits. Differential Revision: https://reviews.llvm.org/D130076	2022-09-06 15:32:04 +01:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Luo, Yuanke	5cb0979870	[X86][AMX] Split greedy RA for tile register When we fill the shape to tile configure memory, the shape is gotten from AMX pseudo instruction. However the register for the shape may be split or spilled by greedy RA. That cause we fill the shape to config memory after ldtilecfg is executed, so that the shape configuration would be wrong. This patch is to split the tile register allocation from greedy register allocation, so that after tile registers are allocated the shape registers are still virtual register. The shape register only may be redefined or multi-defined by phi elimination pass, two address pass. That doesn't affect tile register configuration. Differential Revision: https://reviews.llvm.org/D128584	2022-06-29 10:35:43 +08:00
Fangrui Song	95a134254a	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 01:07:51 -07:00
Rahman Lavaee	08cc058518	Reland "[Propeller] Promote functions with propeller profiles to .text.hot." This relands commit `4d8d2580c5`. The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests. Differential Revision: https://reviews.llvm.org/D126518	2022-05-26 19:53:14 -07:00
Rahman Lavaee	3aa249329f	Revert "[Propeller] Promote functions with propeller profiles to .text.hot." This reverts commit `4d8d2580c5`.	2022-05-26 18:45:40 -07:00
Rahman Lavaee	4d8d2580c5	[Propeller] Promote functions with propeller profiles to .text.hot. Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely). This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`. The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`. Differential Revision: https://reviews.llvm.org/D122930	2022-05-26 16:23:21 -07:00
Sotiris Apostolakis	ca7c307d18	[SelectOpti][1/5] Setup new select-optimize pass This is the first commit for the cmov-vs-branch optimization pass. The goal is to develop a new profile-guided and target-independent cost/benefit analysis for selecting conditional moves over branches when optimizing for performance. Initially, this new pass is expected to be enabled only for instrumentation-based PGO. RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D120230	2022-05-19 16:31:10 +00:00
Momchil Velikov	b4ad28da19	[CodeGen] Async unwind - add a pass to fix CFI information This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CGA aware) nature of the unwind tables. Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers. This pass takes advantage of the constraints taht LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`): * there is a single basic block, containing the function prologue * possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks. * prologue and epilogue blocks are outside of any loops Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states: - "has a call frame", if the function has executed the prologue, or has not executed any epilogue - "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue These properties can be computed for each basic block by a single RPO traversal. From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order. Where these states differ, we insert compensating CFI instructions, which come in two flavours: - CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers. - CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue. Reviewed By: MaskRay, danielkiss, chill Differential Revision: https://reviews.llvm.org/D114545	2022-04-11 13:27:26 +01:00
Muhammad Omair Javaid	0320115c16	Revert "[CodeGen] Async unwind - add a pass to fix CFI information" This reverts commit `980c3e6dd2`. This commit had failing tests with clang crashing across various AArch64/Linux buildots. https://lab.llvm.org/buildbot/#/builders/179/builds/3346 Differential Revision: https://reviews.llvm.org/D114545	2022-04-05 13:12:30 +05:00
Momchil Velikov	980c3e6dd2	[CodeGen] Async unwind - add a pass to fix CFI information This pass inserts the necessary CFI instructions to compensate for the inconsistency of the call-frame information caused by linear (non-CFG aware) nature of the unwind tables. Unlike the `CFIInstrInserer` pass, this one almost always emits only `.cfi_remember_state`/`.cfi_restore_state`, which results in smaller unwind tables and also transparently handles custom unwind info extensions like CFA offset adjustement and save locations of SVE registers. This pass takes advantage of the constraints that LLVM imposes on the placement of save/restore points (cf. `ShrinkWrap.cpp`): * there is a single basic block, containing the function prologue * possibly multiple epilogue blocks, where each epilogue block is complete and self-contained, i.e. CSR restore instructions (and the corresponding CFI instructions are not split across two or more blocks. * prologue and epilogue blocks are outside of any loops Thus, during execution, at the beginning and at the end of each basic block the function can be in one of two states: - "has a call frame", if the function has executed the prologue, or has not executed any epilogue - "does not have a call frame", if the function has not executed the prologue, or has executed an epilogue These properties can be computed for each basic block by a single RPO traversal. In order to accommodate backends which do not generate unwind info in epilogues we compute an additional property "strong no call frame on entry" which is set for the entry point of the function and for every block reachable from the entry along a path that does not execute the prologue. If this property holds, it takes precedence over the "has a call frame" property. From the point of view of the unwind tables, the "has/does not have call frame" state at beginning of each block is determined by the state at the end of the previous block, in layout order. Where these states differ, we insert compensating CFI instructions, which come in two flavours: - CFI instructions, which reset the unwind table state to the initial one. This is done by a target specific hook and is expected to be trivial to implement, for example it could be: ``` .cfi_def_cfa <sp>, 0 .cfi_same_value <rN> .cfi_same_value <rN-1> ... ``` where `<rN>` are the callee-saved registers. - CFI instructions, which reset the unwind table state to the one created by the function prologue. These are the sequence: ``` .cfi_restore_state .cfi_remember_state ``` In this case we also insert a `.cfi_remember_state` after the last CFI instruction in the function prologue. Reviewed By: MaskRay, danielkiss, chill Differential Revision: https://reviews.llvm.org/D114545	2022-04-04 14:38:22 +01:00
Julian Lettner	64902d335c	Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-23 18:36:55 -07:00
Zequan Wu	581dc3c729	Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" This reverts commit `22570bac69`.	2022-03-23 16:11:54 -07:00
Julian Lettner	22570bac69	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-17 10:47:13 -07:00
Shengchen Kan	ac64d0d230	[NFC][CodeGen] Remove redundant if clause in TargetPassConfig::addPass	2022-03-16 22:14:23 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Simon Pilgrim	7262eacd41	Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'	2022-03-15 13:01:35 +00:00
Julian Lettner	9c542a5a4e	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121327	2022-03-14 17:51:18 -07:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Xiang1 Zhang	c31014322c	TLS loads opimization (hoist) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120000	2022-03-10 09:29:06 +08:00
Xiang1 Zhang	65588a0776	Revert "TLS loads opimization (hoist)" Revert for more reviews This reverts commit `30e612ebdf`.	2022-03-02 14:10:11 +08:00
Xiang1 Zhang	30e612ebdf	TLS loads opimization (hoist) Reviewed By: Wang Pheobe, Topper Craig Differential Revision: https://reviews.llvm.org/D120000	2022-03-02 10:37:24 +08:00
Rong Xu	52d981a4c1	[SampleFDO] Enable FSAFDO loading passes if --enable-fs-discriminator is enabled FSAFDO profile loader is currently disabled even --enable-fs-discriminator is enabled. They need to be turned on by options which makes it cumbersome for experiments. This patch changes the FSAFDO profile loader enabled by default. Since they are guarded by EnableFSDiscriminator, they will only be turned on if --enable-fs-discriminator is enabled. Note that --enable-fs-discriminator is still disabled by default. Differential Revision: https://reviews.llvm.org/D119033	2022-02-05 22:37:09 -08:00
Mircea Trofin	e67430cca4	[MLGO] ML Regalloc Eviction Advisor The bulk of the implementation is common between 'release' mode (==AOT-ed model) and 'development' mode (for training), the main difference is that in development mode, we may also log features (for training logs), inject scoring information (currently after the Virtual Register Rewriter) and then produce the log file. This patch also introduces the score injection pass, 'Register Allocation Pass Scoring', which is trivially just logging the score in development mode. Differential Revision: https://reviews.llvm.org/D117147	2022-01-19 11:00:32 -08:00
Kazu Hirata	d09a284dfb	[CodeGen] Drop unnecessary const from return types (NFC) Identified with readability-const-return-type.	2021-12-28 00:38:11 -08:00
Jay Foad	012248b0bc	Remove the verifyAfter mechanism that was replaced by D111397 Differential Revision: https://reviews.llvm.org/D111872	2021-10-18 10:26:46 +01:00
Jay Foad	b84d9d299e	[TargetPassConfig] Remove an obsolete FIXME comment The "coloring with register" functionality it refers to was removed ten years ago in svn r144481 "Remove the -color-ss-with-regs option".	2021-10-08 07:34:25 +01:00
Jay Foad	097339b1ca	[TargetPassConfig] Enable machine verification after miscellaneous passes In a couple of places machine verification was disabled for no apparent reason, probably just because an "addPass(..., false)" line was cut and pasted from elsewhere. After this patch the only remaining place where machine verification is disabled in the generic TargetPassConfig code, is after addPreEmitPass.	2021-10-07 21:24:50 +01:00
Jay Foad	27c57e791a	[TwoAddressInstruction] Enable machine verification after this pass Differential Revision: https://reviews.llvm.org/D111007	2021-10-07 20:04:51 +01:00
Jay Foad	3ff0a5747d	[PHIElimination] Enable machine verification after this pass Differential Revision: https://reviews.llvm.org/D111006	2021-10-07 20:04:51 +01:00
Jay Foad	0bd4365445	[LiveIntervals] Fix verification of early-clobbered segments Enable verification of live intervals immediately after computing them (when -early-live-intervals is used) and fix a problem that that provokes: currently the verifier insists that a segment that ends at an early-clobber slot must be followed by another segment starting at the same slot. But before TwoAddressInstruction runs, the equivalent condition is: a segment that ends at an early-clobber slot must have its last use tied to an early-clobber def. That condition is harder to check here, so for now just disable this check until tied operands have been rewritten. Differential Revision: https://reviews.llvm.org/D111065	2021-10-05 08:17:56 +01:00
Jay Foad	31c92d515d	[MachineLoopInfo] Enable machine verification after this pass Enabling this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110703	2021-10-01 18:15:57 +01:00
Jay Foad	04787239c9	[LiveVariables] Skip verification of kills inside bundles LiveVariables does not examine the contents of bundles, so MachineVerifier should not expect it to know about kill flags on operands of instructions inside a bundle. With this fix we can enable machine verification after running the LiveVariables analysis. Doing this does not show any problems in check-llvm in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Differential Revision: https://reviews.llvm.org/D110700	2021-10-01 18:15:57 +01:00

1 2 3 4 5

244 Commits