This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
The big change here is that now if you have threading disabled,
`llvm::get_physical_cores()` will return -1, as if it had not been able
to work out the right info. This is due to how Threading.cpp includes
OS-specific code/headers. This seems ok, as if threading is disabled,
LLVM should not need to know the number of physical cores.
Differential Revision: https://reviews.llvm.org/D137836
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
Differential Revision: https://reviews.llvm.org/D137836
This reverts commit 9969ceb36b.
On Windows:
lld-link: error: undefined symbol: int __cdecl computeHostNumPhysicalCores(void)
>>> referenced by LLVMSupport.lib(Support.Host.obj):(int __cdecl llvm::sys::getHostNumPhysicalCores(void))
I noticed these were missing, so this adds Host identifiers for
cortex-a55, cortex-a510, cortex-a710 and cortex-x2, taken from their
respective TRMs.
Differential Revision: https://reviews.llvm.org/D138497
Make the sched_getaffinity based implementation available to all architectures
(except s390x/x86 which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.
The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.
Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
As mentioned on D128934 - we weren't including the CPUID bit handling for the RDPRU instruction
AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
While working on D118450 <https://reviews.llvm.org/D118450>, I noticed that
`sys::getHostCPUName` lacks SPARC support.
This patch implements it. The code is taken from/inspired by GCC's
`gcc/config/sparc/driver-sparc.cc`. There's one caveat: since LLVM, unlike
GCC, doesn't support the SPARC-M7, -S7, and -M8 CPUs, I map all those to
the latest supported one (UltraSparc T4/`niagara4`).
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu` by
running `savcov --version` on
- Netra SPARC S7-2 (SPARC-S7, Solaris 11.4)
- SPARC T5-2 (SPARC T5, Solaris 11.4)
- SPARC Enterprise T5220 (UltraSPARC T2, Solaris 11.3)
- SPARC T5 (UltraSPARC T5, Debian sid)
- SPARC T3 (UltraSPARC T3, Debian sid)
- SPARC Enterprise T5220 (Debian sid)
Differential Revision: https://reviews.llvm.org/D130272
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM. This patch adds support for
"z16" as an alternate architecture name for arch14.
This improves the getHostCPUName check for Apple M1 CPUs, which
previously would always be considered cyclone instead. This also enables
`-march=native` support when building on M1 CPUs which would previously
fail. This isn't as sophisticated as the X86 CPU feature checking which
consults the CPU via getHostCPUFeatures, but this is still better than
before. This CPU selection could also be invalid if this was run on an
iOS device instead, ideally we can improve those cases as they come up.
Differential Revision: https://reviews.llvm.org/D119788
The RISCV target doesn't define a "generic" cpu, only "generic-rv32" and
"generic-rv64". Define sys::getHostCPUName for RISC-V that returns the
correct cpu for the host.
Reviewed By: craig.topper, MaskRay
Differential Revision: https://reviews.llvm.org/D105274
d8faf03807 implemented general-regs-only for X86 by disabling all features
with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses
only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this
instruction and allows it to be used with general-regs-only.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D105462
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as supported -march name.
Code in getCPUNameFromS390Model currently assumes that the
numerical value of the model number always increases with
future hardware. While this has happened to be the case
with the last few machines, it is not guaranteed -- that
assumption was violated with (much) older machines, and
it can be violated again with future machines.
Fix by explicitly listing model numbers for all supported
machine models.
Fix a bug that `computeHostNumPhysicalCores` is fallback to default
unknown when building for Apple Silicon macs.
rdar://80533675
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106012
[llvm-exegesis] Disable the LBR check on AMD
https://bugs.llvm.org/show_bug.cgi?id=48918
The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW.
Theory we've got is that the lbr-checking code could be confused on AMD.
Differential Revision: https://reviews.llvm.org/D97504
New change:
- Surround usages of x86 helper in llvm-exegesis/X86/Target.cpp with ifdef
- Fix bug which caused the caller of getVendorSignature to not have a copy of EAX that it expected.