This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
It may be necessary to build additional targets before running
perf-training, the typical use case would be builtins and runtimes.
This change allows users to specify those dependencies as:
set(CLANG_PERF_TRAINING_DEPS builtins runtimes CACHE STRING "")
Differential Revision: https://reviews.llvm.org/D138974
Introduce a RVVTypeCache to hold the cache instead of using a local
static variable to maintain a cache.
Also made construct of RVVType to private, make sure that could be only
created by a cache manager.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D138429
The [Improving Clang's Diagnostics RFC][1] identifies eight broad fields
for Clang to surface, two of which are text-based. Since the current
diagnostics more closely map to the diagnostic summary (or headline), we
should rename them to ensure that there's no confusion when
Diagnostic.Reason is introduced in the near future.
[1]: https://discourse.llvm.org/t/rfc-improving-clang-s-diagnostics/62584
Reviewed By: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D135820
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
TableGen executables are supposed to never be linked against libLLVM-*.so,
even when LLVM_LINK_LLVM_DYLIB=ON, presumably for cross-compilation.
It turns out that clang-tblgen *did* link against libLLVM-*.so,
indirectly so via the clangSupport.
This lead to a regression in what should have been unrelated work of
cleaning up ManagedStatics in LLVMSupport. A running clang-tblgen
process ended up with two copies of a cl::opt static global:
- one from libLLVMSupport linked statically into clang-tblgen as a
direct dependency
- one from libLLVMSupport linked into libLLVM-*.so, which clang-tblgen
linked against due to the clangSupport dependency
For a bit more context, see the discussion at
https://discourse.llvm.org/t/flang-aarch64-dylib-buildbot-need-help-understanding-a-regression-in-clang-tblgen/64871/
None of the potential solutions I could find are perfect. Presumably one
possible solution would be to remove "Support" from the explicit
dependencies of clang-tblgen. However, relying on the transitive
inclusion via clangSupport seems risky, and in any case this wouldn't
address the issue of clang-tblgen surprisingly linking against libLLVM-*.so.
This change instead creates a second version of the clangSupport library
that is explicitly linked statically, to be used by clang-tblgen.
v2:
- define an alias so that clang-tblgen can always link against
"clangSupport_tablegen"
- use add_llvm_library(... BUILDTREE_ONLY ...)
v3:
- use the object library version of clangSupport if available
Differential Revision: https://reviews.llvm.org/D134637
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.
The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
is added to the definition of the function. Along with the existing
always_inline intrinsic, this will give a compile time error if the
function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
the target feature and gives an error if the builtin is found in a
function without the required features, similar to arm_sve.h.
The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.
Differential Revision: https://reviews.llvm.org/D132034
This patch adds `CLANG_BOLT_INSTRUMENT` option that applies BOLT instrumentation
to Clang, performs a bootstrap build with the resulting Clang, merges resulting
fdata files into a single profile file, and uses it to perform BOLT optimization
on the original Clang binary.
The projects and targets used for bootstrap/profile collection are configurable via
`CLANG_BOLT_INSTRUMENT_PROJECTS` and `CLANG_BOLT_INSTRUMENT_TARGETS`.
The defaults are "llvm" and "count" respectively, which results in a profile with
~5.3B dynamically executed instructions.
The intended use of the functionality is through BOLT CMake cache file, similar
to PGO 2-stage build:
```
cmake <llvm-project>/llvm -C <llvm-project>/clang/cmake/caches/BOLT.cmake
ninja clang++-bolt # pulls clang-bolt
```
Stats with a recent checkout (clang-16), pre-built BOLT and Clang, 72vCPU/224G
| CMake configure with host Clang + BOLT.cmake | 1m6.592s
| Instrumenting Clang with BOLT | 2m50.508s
| CMake configure `llvm` with instrumented Clang | 5m46.364s (~5x slowdown)
| CMake build `not` with instrumented Clang |0m6.456s
| Merging fdata files | 0m9.439s
| Optimizing Clang with BOLT | 0m39.201s
Building Clang:
```cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=... -DCMAKE_CXX_COMPILER=...
-DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang
-DLLVM_TARGETS_TO_BUILD=Native -GNinja```
| | Release | BOLT-optimized
| cmake | 0m24.016s | 0m22.333s
| ninja clang | 5m55.692s | 4m35.122s
I know it's not rigorous, but shows a ballpark figure.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D132975
Add Call() and CallVoid() ops and use them to call functions. Only
FunctionDecls are supported for now.
Differential Revision: https://reviews.llvm.org/D132286
This patch dumps every state trait in the egraph. Also
the empty state traits are no longer dumped, instead
they are treated as null by the egraph rewriter script,
which solves reverse compatibility issues.
Differential Revision: https://reviews.llvm.org/D131187
Attempt to fix the test failures observed in CI:
* Add Option dependency, which caused BUILD_SHARED_LIBS builds to fail
* Adapt tests that accidentally depended on the host platform: platforms
that don't use an integrated assembler (e.g. AIX) get a different set
of commands from the driver. Most dependency scanner tests can use
-fsyntax-only or -E instead of -c to avoid this, and in the rare case
we want to check -c specifically, set an explicit target so the
behaviour is independent of the host.
Original commit message follows.
---
Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.
This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.
In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.
Differential Revision: https://reviews.llvm.org/D132405
Instead of trying to "fix" the original driver invocation by appending
arguments to it, split it into multiple commands, and for each -cc1
command use a CompilerInvocation to give precise control over the
invocation.
This change should make it easier to (in the future) canonicalize the
command-line (e.g. to improve hits in something like ccache), apply
optimizations, or start supporting multi-arch builds, which would
require different modules for each arch.
In the long run it may make sense to treat the TU commands as a
dependency graph, each with their own dependencies on modules or earlier
TU commands, but for now they are simply a list that is executed in
order, and the dependencies are simply duplicated. Since we currently
only support single-arch builds, there is no parallelism available in
the execution.
Differential Revision: https://reviews.llvm.org/D132405
HLSL entry function parameters must have parameter annotations. This
allows appropriate intrinsic values to be populated into parameters
during code generation.
This does not handle entry function return values, which will be
handled in a subsequent commit because we don't currently support any
annotations that are valid for function returns.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D131625
In preparation for allowing the prefer_type list in the append_args clause,
use the OMPInteropInfo in the attribute for 'declare variant'.
This requires adding a new Argument kind to the attribute code. This change
adds a specific attribute to pass an array of OMPInteropInfo. It implements
new tablegen needed to handle the interop-type part of the structure. When
prefer_type is added, more work will be needed to dump, instantiate, and
serialize the PreferTypes field in OMPInteropInfo.
Differential Revision: https://reviews.llvm.org/D132270
Adds
* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`
These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.
[1]: http://wg21.link/meta.trans
Co-authored-by: zoecarver
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D116203
This fixes some fallout from D131282. Currently, add_tablegen() will add the tablegen target to LLVM_EXPORTS and associates the install with LLVMExports. For non-standalone builds, this means that you end up with mlir-tblgen and clang-tblgen in LLVMExports.
However, these projects should instead be using MLIR_EXPORTS/MLIRTargets and CLANG_EXPORTS/ClangTargets. To fix this, add an extra EXPORT option and make use of get_target_export_arg() to create the correct export argument.
Reviewed By: ashay-github
Differential Revision: https://reviews.llvm.org/D131565
Adds
* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`
These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.
[1]: http://wg21.link/meta.trans
Co-authored-by: zoecarver
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D116203
arm_sve.h defines and uses __ai macro which needs to be undefined (as it is
already in arm_neon.h).
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D131580
This encapsulates 3 changes:
- `DotDumpVisitor` now aggregates strings instead of *bytes* for both
`python2` and `python3`. This difference caused crashes when it tried
to write out the content as *strings*, similarly described at D71746.
- `graphviz.pipe()` expects the input in *bytes* instead of unicode
strings. And it results in *bytes*. Due to string concatenations and
similar operations, I'm using unicode string as the default, and
converting to *bytes* on demand.
- `write_temp_file()` now appends the `egraph-` prefix and more
importantly, it will create the temp file in the **current working
directory** instead of in the *temp*. This change makes `Firefox` be
able to open the file even if the `security.sandbox.content.level` is
set to the (default) most restricting `4`.
See https://support.mozilla.org/si/questions/1259285
An artifact of the bad byte handling was previously in the `HTML`
produced by the script that it displayed the `b'` string at the top left
corner. Now it won't anymore :)
I've tested that the following command works on `Ubuntu 22.04`:
```
exploded-graph-rewriter my-egraph.dot
```
Both `python2` and `python3` works as expected.
PS: I'm not adding tests, as the current test infra does not support
testing HTML outputs for this script.
Check the `clang/test/Analysis/exploded-graph-rewriter/lit.local.cfg`.
We always pass the `--dump-dot-only` flag to the script.
Along with that, the default invocation will not only create this HTML
report but also try to open it. In addition to this, I'm not sure if the
buildbots have `graphviz` installed and also if this package is installed
on `pip`.
Unless we change some of these, we cannot test this change.
Given that D71746 had no tests, I'm not too worried about this either.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D131553
Prior to this patch, the `add_tablegen()` macro in
llvm/cmake/modules/TableGen.cmake added the install rule only if
`project` matched `LLVM` or `MLIR`. This patch adds an optional
`DESTINATION` argument, which, if non-empty, decides whether (and where)
to install the tablegen tool, thus eliminating the need for
project-specific overrides. This patch also updates all other
invocations of the `add_tablegen()` macro.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D131282
Instructions.
We will switch all UndefValue to PoisonValue in follow up patches.
Thanks for Kito to help on verification with their interanl testsuite.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126748
1. Add policy functions support and tests for vadd, vmv, vfmv and all load
instructions except segment load. I didn't add all combination of policy
functions in test because it seem not to make sense.
2. Rename HasUnMaskedOverloaded to SupportOverloading.
3. vmv.s.x for ta policy could not have overloaded API.
4. This patch does not support all operations, I will have other follow-up
patches support all.
[RFC] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/137
Reviewed By: kito-cheng, fakepaper56, fakepaper56
Differential Revision: https://reviews.llvm.org/D126742
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
The code relied on ManagedStatic.h being included indirectly. This is
about to change as uses of ManagedStatic are removed throughout the
codebase.
Differential Revision: https://reviews.llvm.org/D130575
Remove MaskedPrototype and add several fields in RVVIntrinsicRecord,
compute Prototype in runtime.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D126741
Leverage the method OpenCL uses that adds C intrinsics when the lookup
failed. There is no need to define C intrinsics in the header file any
more. It could help to avoid the large header file to speed up the
compilation of RVV source code. Besides that, only the C intrinsics used
by the users will be added into the declaration table.
This patch is based on https://reviews.llvm.org/D103228 and inspired by
OpenCL implementation.
### Experimental Results
#### TL;DR:
- Binary size of clang increase ~200k, which is +0.07% for debug build and +0.13% for release build.
- Single file compilation speed up ~33x for debug build and ~8.5x for release build
- Regression time reduce ~10% (`ninja check-all`, enable all targets)
#### Header size change
```
| size | LoC |
------------------------------
Before | 4,434,725 | 69,749 |
After | 6,140 | 162 |
```
#### Single File Compilation Time
Testcase:
```
#include <riscv_vector.h>
vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2, size_t vl) {
return vadd(op1, op2, vl);
}
```
##### Debug build:
Before:
```
real 0m19.352s
user 0m19.252s
sys 0m0.092s
```
After:
```
real 0m0.576s
user 0m0.552s
sys 0m0.024s
```
~33x speed up for debug build
##### Release build:
Before:
```
real 0m0.773s
user 0m0.741s
sys 0m0.032s
```
After:
```
real 0m0.092s
user 0m0.080s
sys 0m0.012s
```
~8.5x speed up for release build
#### Regression time
Note: the failed case is `tools/llvm-debuginfod-find/debuginfod.test` which is unrelated to this patch.
##### Debug build
Before:
```
Testing Time: 1358.38s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
After
```
Testing Time: 1220.29s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
##### Release build
Before:
```
Testing Time: 381.98s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
After:
```
Testing Time: 346.25s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
#### Binary size of clang
##### Debug build
Before
```
text data bss dec hex filename
335261851 12726004 552812 348540667 14c64efb bin/clang
```
After
```
text data bss dec hex filename
335442803 12798708 552940 348794451 14ca2e53 bin/clang
```
+253K, +0.07% code size
##### Release build
Before
```
text data bss dec hex filename
144123975 8374648 483140 152981763 91e5103 bin/clang
```
After
```
text data bss dec hex filename
144255762 8447296 483268 153186326 9217016 bin/clang
```
+204K, +0.13%
Authored-by: Kito Cheng <kito.cheng@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Reviewed By: khchen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D111617
This patch aims to create a webpage to document
Flang's command line options on https://flang.llvm.org/docs/
in a similar way to Clang's
https://clang.llvm.org/docs/ClangCommandLineReference.html
This is done by using clang_tablegen to generate an .rst
file from Options.td (which is current shared with Clang)
For this to work, ClangOptionDocEmitter.cpp was updated
to allow specific Flang flags to be included,
rather than bulk excluding clang flags.
Note:
Some headings in the generated documentation will incorrectly
contain references to Clang, e.g.
"Flags controlling the behaviour of Clang during compilation"
This is because Options.td (Which is shared between both Clang and Flang)
contains hard-coded DocBrief sections. I couldn't find a non-intrusive way
to make this target-dependant, as such I've left this as is, and it will need revisiting later.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D129864
This patch aims to create a webpage to document
Flang's command line options on https://flang.llvm.org/docs/
in a similar way to Clang's
https://clang.llvm.org/docs/ClangCommandLineReference.html
This is done by using clang_tablegen to generate an .rst
file from Options.td (which is current shared with Clang)
For this to work, ClangOptionDocEmitter.cpp was updated
to allow specific Flang flags to be included,
rather than bulk excluding clang flags.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D129864
Introducing the support for evaluating the constructor
of every element in an array. The idea is to record the
index of the current array member being constructed and
create a loop during the analysis. We looping over the
same CXXConstructExpr as many times as many elements
the array has.
Differential Revision: https://reviews.llvm.org/D127973
RVV C intrinsics use pointers to scalar for base address and their corresponding
IR intrinsics but use pointers to vector. It makes some vector load intrinsics
need specific ManualCodegen and MaskedManualCodegen to just add bitcast for
transforming to IR.
For simplifying riscv_vector.td, the patch make RISCVEmitter detect pointer
operands and bitcast them.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D129043