Commit Graph

23058 Commits

Author SHA1 Message Date
Fangrui Song 4b1b9e22b3 Remove unused #include "llvm/ADT/Optional.h" 2022-12-05 04:21:08 +00:00
Fangrui Song 89fae41ef1 [IR] llvm::Optional => std::optional
Many llvm/IR/* files have been migrated by other contributors.
This migrates most remaining files.
2022-12-05 04:13:11 +00:00
Fangrui Song b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Fangrui Song f4c16c4473 [MC] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 21:36:08 +00:00
Krzysztof Parzyszek 0ca43d4488 DebugInfoMetadata: convert Optional to std::optional 2022-12-04 11:52:02 -06:00
Fangrui Song bac974278c CodeGen/CommandFlags: Convert Optional to std::optional 2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek 8c7c20f033 Convert Optional<CodeModel> to std::optional<CodeModel> 2022-12-03 12:08:47 -06:00
Kazu Hirata 20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
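The migration described in the entries above is largely a mechanical renaming. A minimal sketch of the `std::optional` side, using only the standard library; `findRegisterWidth` and `widthOrDefault` are hypothetical helpers invented for illustration and do not come from the LLVM tree:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup helper, not taken from the LLVM sources.
// An empty result is spelled std::nullopt, where llvm::Optional code used None.
std::optional<unsigned> findRegisterWidth(const std::string &Name) {
  if (Name == "eax")
    return 32u;
  if (Name == "rax")
    return 64u;
  return std::nullopt;
}

// Accessor spellings change as follows:
//   X.hasValue()    -> X.has_value()
//   X.getValue()    -> X.value() or *X
//   X.getValueOr(D) -> X.value_or(D)
unsigned widthOrDefault(const std::string &Name, unsigned Default) {
  return findRegisterWidth(Name).value_or(Default);
}
```

For example, `widthOrDefault("rax", 8)` yields the stored width, while an unknown name falls back to the supplied default.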
Krzysztof Parzyszek 86fe4dfdb6 TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
2022-12-02 11:42:15 -08:00
Krzysztof Parzyszek 4e12d1836a Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248c.

Some buildbots are failing.
2022-12-02 11:34:04 -08:00
Krzysztof Parzyszek b83711248c TargetTransformInfo: convert Optional to std::optional 2022-12-02 11:27:12 -08:00
Krzysztof Parzyszek 864aaa21b4 TargetLowering: convert Optional to std::optional 2022-12-01 16:19:10 -08:00
Fangrui Song c8508fa6dc [X86][MC] Remove "in directive" from diagnostics 2022-12-01 22:15:41 +00:00
Phoebe Wang 54ebf1c4a1 [X86][FP16] Do not combine fminnum/fmaxnum for FP16 emulation
Under the emulation situation, we lack native fmin/fmax instruction support.

Fixes #59258

Reviewed By: skan, spatel

Differential Revision: https://reviews.llvm.org/D139078
2022-12-01 23:24:40 +08:00
Simon Pilgrim 2ab7c7e50a [X86] Remove unnecessary RDRAND overrides from znver1/znver2 model
Reported by D138359 - the overrides matched the base class schedule definition (it's been flagged as WriteMicrocoded instead of WriteSystem, but the models define both the same)
2022-12-01 13:41:43 +00:00
Simon Pilgrim 4d98eb2196 [X86] Remove unnecessary INTO overrides from znver1/znver2 model
Reported by D138359 - the overrides matched the base class schedule definition (it's been flagged as WriteMicrocoded instead of WriteSystem, but the models define both the same)
2022-12-01 12:30:40 +00:00
Simon Pilgrim 19d1e4cd44 [X86] Remove unnecessary VPERMPS/VPERMDrr overrides from znver3 model
Reported by D138359 - the overrides matched the base class schedule definition (in the case of VPERMDYrr it was entirely replacing uses of WriteVarShuffle256, so that could be adjusted directly)
2022-12-01 12:30:40 +00:00
Simon Pilgrim 74c0f57d0b [X86] Remove unnecessary XADD*rr overrides from bdver2 model
Reported by D138359 - the overrides matched the base class schedule definition
2022-12-01 12:30:39 +00:00
Freddy Ye 89f36dd8f3 [X86] Add ExpandLargeFpConvert Pass and enable for X86
As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp conversions with more
than 128 bits. The expanded nodes are derived from the IR generated by
`compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`,
etc.

Corner cases:
1. For fp16: no related builtins have been added to compiler-rt, so I
mainly used the fp32 <-> fp16 lib calls to implement it.
2. For fp80: this pass is soft-fp emulation and no fp80 instructions can
help with this problem, so I recommend that users deprecate this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at the top/end of the function.
3. For bf16: as the clang FE currently doesn't support bf16 arithmetic operations
(convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for
now.
4. For unsigned FPToI: since both the default hardware behavior and libgcc
ignore the "returns 0 for negative input" spec, this pass follows the old way
and ignores unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M

The end-to-end tests are uploaded at https://reviews.llvm.org/D138261

Reviewed By: LuoYuanke, mgehre-amd

Differential Revision: https://reviews.llvm.org/D137241
2022-12-01 13:47:43 +08:00
Xiang1 Zhang 94c5df8a76 [AMX] Support AMX-FP16 new intrinsic interface
AMX-FP16 ISA support was added in https://reviews.llvm.org/D135941.
The old intrinsic interface requires tile registers to be written manually,
so this patch adds the new intrinsic interface, which allows register allocation.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D138987
2022-12-01 09:47:53 +08:00
Simon Pilgrim b0468e3e22 [X86] Add missing PFM port mappings for Core2/Nehalem
This is an old patch from when I was trying to improve pre-AVX scheduler support as part of D103695. We were missing port mappings entirely for these targets, although to be honest they don't map well to the SandyBridge model that they currently use.
2022-11-30 12:31:49 +00:00
Sylvain Audi 3f3438a596 [CodeGen][X86] Crash fixes for "patchable-function" pass
This patch fixes crashes related to how PatchableFunction selects the instruction to make patchable:
- Ensure PatchableFunction skips all instructions that don't generate actual machine instructions.
- Handle the case where the first MachineBasicBlock is empty.
- Remove support for 16-bit x86 architectures.

Note: another issue related to PatchableFunction remains, in the lowering part.
See https://github.com/llvm/llvm-project/issues/59039

Differential Revision: https://reviews.llvm.org/D137642
2022-11-30 07:29:54 -05:00
Tim Northover b32280baf9 X86: relax EFLAGS liveness check when generating stack probes.
The probes are all inserted at the iterator passed into the functions, so
that's where any EFLAGS clobbering will happen and where we need it to be dead.

Fixes: https://github.com/llvm/llvm-project/issues/59121
2022-11-30 11:44:39 +00:00
Simon Pilgrim f51170bffd [X86] Fix SLM ldmxcsr/stmxcsr schedule classes
Fix a long-standing FIXME comment using a mixture of llvm-exegesis and Agner numbers
2022-11-28 17:43:17 +00:00
Simon Pilgrim c65d5d4aec [X86] Remove unnecessary (V)?PBLENDW(Y)?rm overrides
The znver1/znver2 overrides shouldn't need 2uops for the xmm case (but znver1 should double-pump for the ymm case).

Found with the help of D138359
2022-11-28 16:32:55 +00:00
Guillaume Chatelet 702126aec5 [NFC] Add helper method to ensure min alignment on MCSection
Follow up on D138653.

Differential Revision: https://reviews.llvm.org/D138686
2022-11-28 10:00:34 +00:00
Simon Pilgrim 026df9514e [X86] Remove unnecessary VBLENDWYrr overrides
The znver2 override already matched the WriteBlendY class exactly, and the znver1 override wasn't accounting for ymm double-pumping.

Found with the help of D138359
2022-11-27 16:54:47 +00:00
Simon Pilgrim 2285ba9acc [X86] Fix uops counts for SLM extract/extract-store instructions
Matches Intel AoM + Agner
2022-11-27 16:16:36 +00:00
Kazu Hirata 589725f6e8 [llvm] Use std::size (NFC)
std::size, introduced in C++17, allows us to directly obtain the
number of elements of an array.
2022-11-26 13:47:32 -08:00
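A short illustration of the idiom named in the entry above, assuming a C++17 compiler. The `Opcodes` table and `sumOpcodes` function are hypothetical and exist only to show `std::size` replacing the older `sizeof(Array) / sizeof(Array[0])` pattern:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical table, for illustration only.
static const int Opcodes[] = {3, 5, 8, 13};

// std::size deduces the element count from the array type at compile time.
constexpr std::size_t NumOpcodes = std::size(Opcodes);
static_assert(NumOpcodes == 4, "std::size returns the number of elements");

// Iterate with the deduced bound instead of a hand-written division.
int sumOpcodes() {
  int Sum = 0;
  for (std::size_t I = 0; I != std::size(Opcodes); ++I)
    Sum += Opcodes[I];
  return Sum;
}
```

Unlike the `sizeof` division, `std::size` fails to compile when handed a pointer rather than an array, which removes a common source of silent bugs.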
Kazu Hirata 3583f4ff4b [X86] Use std::optional in X86SpeculativeLoadHardening.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 23:13:23 -08:00
Simon Pilgrim c757780c62 [X86] lowerShuffleAsDecomposedShuffleMerge - try to match unpck(permute(x),permute(y)) for v4i32/v2i64 shuffles
We're using lowerShuffleAsPermuteAndUnpack, which can probably be improved to handle 256/512-bit types pretty easily.

First step towards trying to address the poor vector-shuffle-sse4a.ll pre-SSSE3 codegen mentioned on D127115
2022-11-25 16:24:56 +00:00
Simon Pilgrim 38275ab1b3 [X86] Move lowerShuffleAsPermuteAndUnpack earlier in the source next to similar helpers. NFC.
I'm currently investigating using this inside lowerShuffleAsDecomposedShuffleMerge
2022-11-25 14:56:38 +00:00
Simon Pilgrim 6fd0ae39be [X86] combineScalarAndWithMaskSetcc - handle (concat_vectors (and (vYi1 setcc, vYi1 x), undef)) patterns
If one of the AND operands is a setcc then we're implicitly zeroing the upper mask bits

Similar pattern to regressions identified in D127115 (masked comparisons)
2022-11-25 11:16:24 +00:00
Simon Pilgrim dbe2f44316 [X86] combineScalarAndWithMaskSetcc - optionally peek through (oneuse) any_extend node
Extend pass to handle: (and (any_extend (bitcast (vXi1 (concat_vectors (vYi1 setcc), undef,)))), C)

Fixes several regressions identified in D127115
2022-11-24 16:26:35 +00:00
Guillaume Chatelet 6c09ea3fdd [Alignment][NFC] Use Align in MCStreamer::emitValueToAlignment
Differential Revision: https://reviews.llvm.org/D138674
2022-11-24 16:09:44 +00:00
Guillaume Chatelet 4f17734175 [Alignment][NFC] Use Align in MCStreamer::emitCodeAlignment
This patch makes code less readable but it will clean itself after all functions are converted.

Differential Revision: https://reviews.llvm.org/D138665
2022-11-24 14:51:46 +00:00
Simon Pilgrim 25ea6fa484 [X86] Replace InstRW instregex single matches with instrs entries
This reduces diffs between znver1/znver2 and should marginally speed up tblgen build time (Issue #35303)

Found by adding a temp check inside InstRegexOp::apply inside single matches
2022-11-24 14:08:40 +00:00
Guillaume Chatelet e647b4f519 [reland][Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
2022-11-24 13:19:18 +00:00
Guillaume Chatelet 3467f9c7d6 Revert D138653 "[Alignment][NFC] Use the Align type in MCSection"
This breaks the bolt project.
This reverts commit 409f0dc4a4.
2022-11-24 12:42:30 +00:00
Guillaume Chatelet 409f0dc4a4 [Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
2022-11-24 12:32:58 +00:00
Haohai Wen 1215e86a0e [CostModel][X86] Fix permute latency cost
The AVX512 permute latency should be 3 instead of 1.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138427
2022-11-23 19:17:16 +08:00
Alex Richardson 88218d5c52 [SelectionDAG] Remove deprecated MemSDNode->getAlignment()
I noticed an assertion error when building MIPS code that loaded from
NULL. Loading from NULL ends up being a load with maximum alignment, and
due to integer truncation the maximum value was interpreted as 0 and the
assertion in MipsDAGToDAGISel::Select() failed. This previously happened
to work, but the maximum alignment was increased in
df84c1fe78, so it no longer fits into a 32
bit integer.
Instead of just fixing the one MIPS case, this patch removes all uses of
the deprecated getAlignment() call and replaces them with getAlign().

Differential Revision: https://reviews.llvm.org/D138420
2022-11-23 09:04:42 +00:00
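The truncation described in the entry above can be reproduced in isolation. This sketch assumes the maximum alignment is 2^32 bytes (an assumption consistent with the message's remark that the value no longer fits in a 32-bit integer); `MaxAlignment` and `truncatedAlignment` are hypothetical names and not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the maximum alignment is 2^32 bytes, which needs 33 bits.
constexpr std::uint64_t MaxAlignment = std::uint64_t{1} << 32;

// Mimics an accessor that returns the alignment as a 32-bit integer.
// The conversion keeps only the low 32 bits of its argument.
std::uint32_t truncatedAlignment(std::uint64_t Alignment) {
  return static_cast<std::uint32_t>(Alignment);
}
```

Here `truncatedAlignment(MaxAlignment)` evaluates to 0, while small alignments such as 16 pass through unchanged, which is why the problem only surfaced for maximally aligned loads.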
Phoebe Wang 7218103bca [X86] Use lock add/sub/or/and/xor for cases that we only care about the EFLAGS (negated cases)
This fixes #58685

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D138428
2022-11-23 09:39:04 +08:00
Davide Italiano 0c011335c9 [X86] Don't lower f16->f80 fpext to libcall on darwin.
We don't provide __extendhfxf2, and only have the soft-float
__extendhfsf2 in compiler-rt.  This only changed recently with
655ba9c8a1, so this patch reverts back to the previous behavior.

However, the f80->f16 fptrunc is not easily implementable without
the compiler-rt __truncxfhf2, but that has always been true, and
isn't an immediate regression.

Patch by Ahmed Bougacha.

rdar://102194995
2022-11-22 12:32:22 -08:00
Simon Pilgrim 8fa57a715d [X86] Cleanup WriteBlend classes to match (V)PBLENDW instruction
Minor cleanup toward fixing the unnecessary scheduler overrides warnings from D138359
2022-11-22 17:56:15 +00:00
Phoebe Wang b39b76f2ef [X86] Allow no X87 on 32-bit
This patch is an alternative to D100091. It solves the problems in `f80` type lowering.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D137946
2022-11-22 10:47:47 +08:00
Manuel Brito 1e55d5b1f2 Use poison instead of undef as placeholder for vector construction [NFC]
Differential Revision: https://reviews.llvm.org/D138450
2022-11-21 18:43:23 +00:00
Simon Pilgrim 746cf4f13f [X86] Synchronise scheduler classes of VPERM2F128/VBROADCASTF128/VEXTRACTF128/VINSERTF128 with I128 equivalents
znver1/znver2 have barely any difference in behaviour between the AVX1/2 variants of these instructions - it looks like it was a copy+paste mistake to miss the AVX2 integer domain instructions in the overrides.

Having said that, the override numbers don't appear to match the numbers in the AMD 17h SoGs very well - for instance vperm2f128/vperm2i128 might be microcoded in the AMD sense of >3 uops, but it doesn't have a 100cy latency. These will need to be further addressed.
2022-11-21 17:15:47 +00:00
Shengchen Kan 861f5dd688 [X86][NFC] Minor improvement in X86InstrInfo::optimizeCompareInstr
Before this patch, the code enumerated `getCondFromBranch`, `getCondFromSETCC` and `getCondFromCMov` to get the condition code of a `MachineInstr`, and assigned the result to variable `OldCC` when `MI || IsSwapped || ImmDelta != 0` was satisfied.

After this patch, the `if-else` structure is eliminated by using `getCondFromMI`. Since `OldCC` is only used when `MI || IsSwapped || ImmDelta != 0` is true, it is now initialized with `getCondFromMI` directly, outside the scope of the `if`.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D138349
2022-11-21 21:00:07 +08:00
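The shape of the refactoring described in the entry above can be sketched with stand-in types. Everything in this block is hypothetical (the enums, `Instr`, and all function names) and only mirrors the before/after structure from the message; it is not the code in `X86InstrInfo::optimizeCompareInstr`:

```cpp
#include <cassert>

// Every name below is hypothetical; none of this is the LLVM API.
enum class CondCode { Invalid, Equal, NotEqual, Below };
enum class Kind { Branch, SetCC, CMov, Other };

struct Instr {
  Kind K;
  CondCode CC;
};

// One extractor per instruction kind, each returning Invalid on a mismatch.
CondCode condFromBranch(const Instr &I) {
  return I.K == Kind::Branch ? I.CC : CondCode::Invalid;
}
CondCode condFromSetCC(const Instr &I) {
  return I.K == Kind::SetCC ? I.CC : CondCode::Invalid;
}
CondCode condFromCMov(const Instr &I) {
  return I.K == Kind::CMov ? I.CC : CondCode::Invalid;
}

// Combined helper that covers all three kinds in one call.
CondCode condFromInstr(const Instr &I) {
  switch (I.K) {
  case Kind::Branch:
  case Kind::SetCC:
  case Kind::CMov:
    return I.CC;
  case Kind::Other:
    break;
  }
  return CondCode::Invalid;
}

// Before: try each extractor in turn, but only when the value is needed.
CondCode oldCondBefore(const Instr &I, bool NeedsOldCC) {
  CondCode OldCC = CondCode::Invalid;
  if (NeedsOldCC) {
    OldCC = condFromBranch(I);
    if (OldCC == CondCode::Invalid)
      OldCC = condFromSetCC(I);
    if (OldCC == CondCode::Invalid)
      OldCC = condFromCMov(I);
  }
  return OldCC;
}

// After: initialize unconditionally; paths that do not need it ignore it.
CondCode oldCondAfter(const Instr &I) { return condFromInstr(I); }
```

Whenever the value is actually needed, both versions return the same condition code, so hoisting the initialization out of the `if` does not change behaviour under these assumptions.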
Simon Pilgrim 89365b159e [X86] IceLakeServer - PACKS instructions take latency 3cy
This appears to be a slowdown vs Skylake (from which the model was copied) - confirmed with uops.info / instlatx64

Noticed as D138359 was reporting that many of the PACKS overrides were redundant, but were in fact incorrect
2022-11-20 19:28:35 +00:00