Commit Graph

339 Commits

Author SHA1 Message Date
Archibald Elliott 3c97f6cab9 [Support] Move getHostNumPhysicalCores to Threading.h
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.

The big change here is that now if you have threading disabled,
`llvm::get_physical_cores()` will return -1, as if it had not been able
to work out the right info. This is due to how Threading.cpp includes
OS-specific code/headers. This seems ok, as if threading is disabled,
LLVM should not need to know the number of physical cores.

Differential Revision: https://reviews.llvm.org/D137836
2022-11-29 13:14:13 +00:00
Florian Hahn 07ca9cc04b
Revert "[Support] Move getHostNumPhysicalCores to Threading.h"
This reverts commit 5577207d6d.

This breaks building LLVM on recent macOS. Error messages below:

llvm/lib/Support/Threading.cpp:190:3: error: use of undeclared
identifier 'sysctlbyname'
  sysctlbyname("hw.physicalcpu", &count, &len, NULL, 0);
    ^

llvm/lib/Support/Threading.cpp:193:13: error: use of undeclared
identifier 'CTL_HW'
    nm[0] = CTL_HW;
            ^

llvm/lib/Support/Threading.cpp:194:13: error: use of undeclared identifier 'HW_AVAILCPU'
    nm[1] = HW_AVAILCPU;
            ^

llvm/lib/Support/Threading.cpp:195:5: error: use of undeclared identifier 'sysctl'
    sysctl(nm, 2, &count, &len, NULL, 0);
    ^
2022-11-25 14:11:56 +00:00
Archibald Elliott 5577207d6d [Support] Move getHostNumPhysicalCores to Threading.h
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.

Differential Revision: https://reviews.llvm.org/D137836
2022-11-25 12:51:36 +00:00
Fangrui Song 875adb4007 Host: Internalize computeHostNumPhysicalCores/computeHostNumHardwareThreads
Windows computeHostNumPhysicalCores is defined by Threading.cpp. Leave it
unchanged.
2022-11-23 21:09:45 -08:00
Fangrui Song 93b553e3f2 Revert "Host: Internalize computeHostNumPhysicalCores/computeHostNumHardwareThreads"
This reverts commit 9969ceb36b.

On Windows:

lld-link: error: undefined symbol: int __cdecl computeHostNumPhysicalCores(void)
>>> referenced by LLVMSupport.lib(Support.Host.obj):(int __cdecl llvm::sys::getHostNumPhysicalCores(void))
2022-11-23 20:12:16 -08:00
Fangrui Song 9969ceb36b Host: Internalize computeHostNumPhysicalCores/computeHostNumHardwareThreads 2022-11-23 17:44:04 -08:00
David Green 7fefa99445 [AArch64] Add Host identifiers for cortex-a55, cortex-a510, cortex-a710 and cortex-x2.
I noticed these were missing, so this adds Host identifiers for
cortex-a55, cortex-a510, cortex-a710 and cortex-x2, taken from their
respective TRMs.

Differential Revision: https://reviews.llvm.org/D138497
2022-11-23 12:10:54 +00:00
Simon Pilgrim 261b3f71c0 [X86] Add missing Zen3 model subtypes
This patch adds support for detecting all current Zen/Zen3+ submodels

Based off a mixture of https://github.com/torvalds/linux/blob/master/drivers/hwmon/k10temp.c#L436 and InstLatx64 https://github.com/InstLatx64/InstLatx64/tree/master/AuthenticAMD CPUID dumps and confirmed by @GGanesh

Differential Revision: https://reviews.llvm.org/D137695
2022-11-10 10:36:09 +00:00
Victor Campos 9d1ff787e5 [AArch64] Add support for the Cortex-X3 CPU
Cortex-X3 is an Armv9-A AArch64 CPU.

This patch introduces support for Cortex-X3.

Technical Reference Manual: https://developer.arm.com/documentation/101593/latest

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D136589
2022-11-09 11:33:48 +00:00
Freddy Ye 84a18a260e [X86] Support -march=sierraforest, grandridge, graniterapids.
Reviewed By: skan, pengfei, MaskRay

Differential Revision: https://reviews.llvm.org/D137153
2022-11-09 16:56:03 +08:00
Fangrui Song 6c927f2a86 Canonicalize PowerPC detection macros to __powerpc__ 2022-11-06 17:29:45 -08:00
Freddy Ye a806fc2767 [X86] Support -march=raptorlake, meteorlake
Reviewed By: pengfei, skan, MaskRay

Differential Revision: https://reviews.llvm.org/D135937
2022-11-04 09:32:17 +08:00
Simi Pallipurath fa8aeab606 [AArch64] Add support for the Cortex-A715 CPU
Cortex-A715 is an Armv9-A AArch64 CPU.

This patch introduces support for Cortex-A715.

Technical Reference Manual: https://developer.arm.com/documentation/101590/latest.

Reviewed By: vhscampos

Differential Revision: https://reviews.llvm.org/D136957
2022-11-03 09:28:46 +00:00
Freddy Ye aee2a35ac4 [X86] Add AVX-NE-CONVERT instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D135930
2022-10-31 23:39:38 +08:00
Freddy Ye 23f02693ec [X86] Add AVX-VNNI-INT8 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135938
2022-10-28 10:39:54 +08:00
Freddy Ye 0e720e6ada [X86] Add AVX-IFMA instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135932
2022-10-28 09:42:30 +08:00
Phoebe Wang b51b90d6e2 [X86][1/2] SUPPORT RAO-INT
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Initial authored by Liu Chen (@LiuChen3)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135951
2022-10-27 17:20:07 +08:00
Freddy Ye fdac4c4e92 [X86] Add CMPCCXADD instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135933
2022-10-25 14:33:39 +08:00
Phoebe Wang 363047bef1 [X86] Fix a missing `-` from AMX-FP16 feature string
Fixes #58545
2022-10-22 23:08:07 +08:00
Xiang1 Zhang 661881d436 [X86] Add AMX-FP16 instructions.
Differential Revision: https://reviews.llvm.org/D135941
2022-10-22 08:05:22 +08:00
Phoebe Wang 62ca79102c [X86][1/2] Support PREFETCHI instructions
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136040
2022-10-20 08:46:01 +08:00
David Sherwood fbb119412f [AArch64] Add Neoverse V2 CPU support
Adds support for the Neoverse V2 CPU to the AArch64 backend.

Differential Revision: https://reviews.llvm.org/D134352
2022-09-27 07:56:08 +00:00
Fangrui Song e9b213131a [Support] computeHostNumPhysicalCores: use sched_getaffinity for all non-Android Linux with no custom implementation
Make the sched_getaffinity based implementation available to all architectures
(except s390x/x86 which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.

The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.

Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
2022-08-13 01:36:13 -07:00
Simon Pilgrim 08a880509e [X86] Add RDPRU instruction CPUID bit masks
As mentioned on D128934 - we weren't including the CPUID bit handling for the RDPRU instruction

AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
2022-08-11 16:07:36 +01:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Rainer Orth 979ddfff37 [Support] Handle SPARC in sys::getHostCPUName
While working on D118450 <https://reviews.llvm.org/D118450>, I noticed that
`sys::getHostCPUName` lacks SPARC support.

This patch implements it.  The code is taken from/inspired by GCC's
`gcc/config/sparc/driver-sparc.cc`.  There's one caveat: since LLVM, unlike
GCC, doesn't support the SPARC-M7, -S7, and -M8 CPUs, I map all those to
the latest supported one (UltraSparc T4/`niagara4`).

Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu` by
running `savcov --version` on

- Netra SPARC S7-2 (SPARC-S7, Solaris 11.4)
- SPARC T5-2 (SPARC T5, Solaris 11.4)
- SPARC Enterprise T5220 (UltraSPARC T2, Solaris 11.3)
- SPARC T5 (UltraSPARC T5, Debian sid)
- SPARC T3 (UltraSPARC T3, Debian sid)
- SPARC Enterprise T5220 (Debian sid)

Differential Revision: https://reviews.llvm.org/D130272
2022-07-27 12:21:03 +02:00
luxufan 63c81b23be [RISCV] Support getHostCpuName for sifive-u74
Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D123978
2022-05-17 14:06:59 +08:00
Philipp Tomsich 7e02bc5237 [AArch64] Add native CPU detection for Ampere1
Map the IMPLEMENTOR ID 0xc0 (Ampere Computing) and CPU ID 0xac3
(Ampere1) to ampere1.

Differential Revision: https://reviews.llvm.org/D117111
2022-05-03 16:10:02 +01:00
Ulrich Weigand 1283ccb610 Support z16 processor name
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM.  This patch adds support for
"z16" as an alternate architecture name for arch14.
2022-04-21 19:58:22 +02:00
Keith Smiley 955cff803e reland: [AArch64] Add support for -march=native for Apple M1 CPU
This reverts commit fc3cdd0b29.

The issue was imports being scoped to specific architectures for Apple
platforms.
2022-03-23 15:19:17 -07:00
Keith Smiley fc3cdd0b29 Revert "[AArch64] Add support for -march=native for Apple M1 CPU"
This reverts commit fcca10c69a.
2022-03-23 14:27:02 -07:00
Keith Smiley fcca10c69a [AArch64] Add support for -march=native for Apple M1 CPU
This improves the getHostCPUName check for Apple M1 CPUs, which
previously would always be considered cyclone instead. This also enables
`-march=native` support when building on M1 CPUs which would previously
fail. This isn't as sophisticated as the X86 CPU feature checking which
consults the CPU via getHostCPUFeatures, but this is still better than
before. This CPU selection could also be invalid if this was run on an
iOS device instead, ideally we can improve those cases as they come up.

Differential Revision: https://reviews.llvm.org/D119788
2022-03-23 14:06:59 -07:00
serge-sans-paille 739572b40b Missing include in Support/Host.cpp under __MVS__ 2022-03-16 10:19:04 +01:00
Roman Lebedev c62746ac6e
[X86] Fix AMD Znver3 model checks
While `-march=` is correctly detected as `znver3` for the cpu,
apparently the model check is incorrect:
```
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  32
  On-line CPU(s) list:   0-31
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 5950X 16-Core Processor
    CPU family:          25
    Model:               33
    Thread(s) per core:  2
    Core(s) per socket:  16
    Socket(s):           1
    Stepping:            0
    Frequency boost:     disabled
    CPU max MHz:         6017.8462
    CPU min MHz:         2200.0000
    BogoMIPS:            8050.07
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse
                         3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_p
                         state ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbn
                         oinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):
  L1d:                   512 KiB (16 instances)
  L1i:                   512 KiB (16 instances)
  L2:                    8 MiB (16 instances)
  L3:                    64 MiB (2 instances)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-31
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling
  Srbds:                 Not affected
  Tsx async abort:       Not affected
```

Model is 33 (0x21), while the code was expecting it to be `0x00 .. 0x1F`.
https://github.com/torvalds/linux/blob/v5.17-rc8/drivers/hwmon/k10temp.c#L432-L453 agrees.
I'm not sure if other ranges listed here should also be accepted.

I noticed this while implementing CPU model detection
for halide (https://github.com/halide/Halide/pull/6648)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121708
2022-03-15 20:28:02 +03:00
serge-sans-paille fbbc41f8dd Cleanup include: TableGen
This also includes a few cleanup from Support.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121331
2022-03-11 11:41:32 +01:00
Danila Malyutin ff33b6f90a [Support][AArch64] Detect a few more host CPU features on AArch64
Add detecton for lse, sve and sve2 on linux

Differential Revision: https://reviews.llvm.org/D119435
2022-03-03 09:30:02 +03:00
Ties Stuij 6b1e844b69 [ARM] Add Cortex-X1C Support for Clang and LLVM
This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and
ARM.

For more information, see:
- https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c
- https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components

The following people contributed to this patch:
- Simon Tatham
- Ties Stuij

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D117202
2022-01-31 14:23:35 +00:00
Sander de Smalen b92102a6d7 [AArch64] Add native CPU detection for Neoverse-V1.
Map Main ID part number 0xd40 to neoverse-v1, as described in the
Neoverse-V1 Technical Reference Manual:

https://developer.arm.com/documentation/101427/0101/Register-descriptions/AArch64-system-registers/MIDR-EL1--Main-ID-Register--EL1

Differential Revision: https://reviews.llvm.org/D117207
2022-01-13 12:58:54 +00:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Andreas Schwab a706a5ef22 [Support] Define sys::getHostCPUName for RISC-V
The RISCV target doesn't define a "generic" cpu, only "generic-rv32" and
"generic-rv64".  Define sys::getHostCPUName for RISC-V that returns the
correct cpu for the host.

Reviewed By: craig.topper, MaskRay

Differential Revision: https://reviews.llvm.org/D105274
2021-10-08 10:08:39 -07:00
Tianqing Wang 12fa608af4 [X86] Add CRC32 feature.
d8faf03807 implemented general-regs-only for X86 by disabling all features
with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses
only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this
instruction and allows it to be used with general-regs-only.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D105462
2021-09-06 17:24:30 +08:00
Wang, Pengfei 6f7f5b54c8 [X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105263
2021-08-10 12:46:01 +08:00
Freddy Ye d268c20070 [X86] Support auto-detect for tigerlake and alderlake
Differential Revision: https://reviews.llvm.org/D107245
2021-08-02 11:01:01 +08:00
Ulrich Weigand 8cd8120a7b [SystemZ] Add support for new cpu architecture - arch14
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10304.

Note: No currently available Z system supports the arch14
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2021-07-26 16:57:28 +02:00
Ulrich Weigand e04c05e823 [SystemZ] Fix invalid assumption in getCPUNameFromS390Model
Code in getCPUNameFromS390Model currently assumes that the
numerical value of the model number always increases with
future hardware.  While this has happened to be the case
with the last few machines, it is not guaranteed -- that
assumption was violated with (much) older machines, and
it can be violated again with future machines.

Fix by explicitly listing model numbers for all supported
machine models.
2021-07-20 13:39:22 +02:00
Steven Wu e23dce6c97 [Support] Get correct number of physical cores on Apple Silicon
Fix a bug that `computeHostNumPhysicalCores` is fallback to default
unknown when building for Apple Silicon macs.

rdar://80533675

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D106012
2021-07-14 13:29:54 -07:00
Anirudh Prasad 993f38d0a7 [SystemZ][z/OS] Implement getHostCPUName for z/OS
- Currently, the host cpu information is not easily available on z/OS as in other platforms.
- This information is stored in the Communications Vector Table (https://www.ibm.com/docs/en/zos/2.2.0?topic=information-cvt-mapping)

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D102793
2021-05-25 11:18:12 -04:00
Freddy Ye 3fc1fe8db8 [X86] Support -march=rocketlake
Reviewed By: skan, craig.topper, MaskRay

Differential Revision: https://reviews.llvm.org/D100085
2021-04-13 09:48:13 +08:00
Vy Nguyen 64d2c326b7 [llvm] Fix thinko in getVendorSignature(), where expected values of ECX and EDX were flipped for the AMD case.
Follow up to D97504

Differential Revision: https://reviews.llvm.org/D98322
2021-03-10 21:39:19 -05:00
Vy Nguyen f8b01d54c3 Reland 293e8fa13d
[llvm-exegesis] Disable the LBR check on AMD

    https://bugs.llvm.org/show_bug.cgi?id=48918

    The bug reported a hang (or very very slow runtime) on a Zen2. Unfortunately, we don't have the hardware right now to debug it and I was not able to reproduce the bug on a HSW.
    Theory we've got is that the lbr-checking code could be confused on AMD.

    Differential Revision: https://reviews.llvm.org/D97504

New change:
 - Surround usages of x86 helper in llvm-exegesis/X86/Target.cpp with ifdef
 - Fix bug which caused the caller of getVendorSignature to not have a copy of EAX that it expected.
2021-03-05 13:23:42 -05:00