This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch moves the outlined function registration, function attribute
configuration and function ID creation to the OpenMPIRBuilder. This will later
be used by flag as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D137587
This change moves the getName function from clang and moves the separator class
members from CGOpenMPRuntime into OMPIRBuilder. Also enusre all the getters
in the config class are const.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D137725
This patch introudces the OpenMPIRBuilderConfig class which contains various
flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to
keep the flags in one place so they do not have to be passed in every time.
The flags can be set optionally since some uses cases don't rely on functions
that depend on these flags.
Reviewed By: jdoerfert, tschuett
Differential Revision: https://reviews.llvm.org/D138220
This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces it's uses with OMPBuilder::getOrCreateInternalVariable. Also refactors OMPBuilder::getOrCreateInternalVariable to change type of name from Twine to StringRef
Differential Revision: https://reviews.llvm.org/D137720
It is caused by regenerate captured var value when processing the
has_device_addr, the captured var value has been generated in
GenerateOpenMPCapturedVars and passed as Arg in generateInfoForCapture.
The fix just use Arg instead regenerated just same as is_device_ptr
same line. Cases such as those in the associated lit tests, can now be
supported.
This adds a 'Count' field to TargetRegionEntryInfo to differentiate
regions with the same source position.
The OffloadEntriesInfoManager routines are updated to maintain a count of
regions seen at a location. The registration of regions proceeds that same as
before, but now the next available count is always determined and used in the
offload entry.
Fixes: https://github.com/llvm/llvm-project/issues/52707
Differential Revision: https://reviews.llvm.org/D134816
This patch moves the createOffloadEntriesAndInfoMetadata to OpenMPIRBuilder,
the createOffloadEntry helper function. The clang specific error handling is
invoked using a callback. This code will also be used by flang in the future.
Re-apply of: 3d0e9edd8e
Reverted in: 0cb65b0a58
A function parameter was using the wrong type 'llvm::TargetRegion' instead of
'const llvm:: TargetRegion&', which caused the error in the address sanitizer.
The correct type is now used.
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.
Reviewed By: jdoerfert, mikerice
Differential Revision: https://reviews.llvm.org/D136601
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.
Reviewed By: jdoerfert, mikerice
Differential Revision: https://reviews.llvm.org/D136601
This patch adds support for the `depend` clause for the `task`
construct.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135695
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136198
This patch moves the implementation of the OffloadEntriesInfoManager
to the OMPIRbuilder. This class will later be used by flang as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135786
This patch moves the emitOffloadingArraysArgument function and
supporting data structures to OpenMPIRBuilder. This will later be used
in flang as well. The TargetDataInfo class was split up into generic
information and clang-specific data, which remain in clang. Further
migration will be done in in the future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134662
It is data mapping ordering problem.
According omp spec
If one or more map clauses are present, the list item conversions that
are performed for any use_device_ptr or use_device_addr clause occur
after all variables are mapped on entry to the region according to those
map clauses.
The change is to put mapping data for use_device_addr at end of data
mapping array.
Differential Revision: https://reviews.llvm.org/D134556
This patch add codegen support for the has_device_addr clause. It use
the same logic of is_device_ptr. But passing &var instead pointer to var
to kernal.
Differential Revision: https://reviews.llvm.org/D134268
Summary: This patch add codegen support for the has_device_addr clause. It
use the same logic of is_device_ptr.
Differential Revision: https://reviews.llvm.org/D134186
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6, ABataev
Differential Revision: https://reviews.llvm.org/D102107
Currently, the field just emit map info for this pointer variable. It is
failed at run time. For the fields, the PartialStruct is created and it
needs call to emitCombinedEntry which create the base that covers all
the pieces.
The change is to generate map info as regular fields.
Differential Revision: https://reviews.llvm.org/D129608
This completes the implementation of P1091R3 and P1381R1.
This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.
In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.
We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.
In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.
Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.
at the request of @shafik, i can confirm the correct behavior of lldb wit this change.
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122768
This completes the implementation of P1091R3 and P1381R1.
This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.
In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.
We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.
In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.
Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.
at the request of @shafik, i can confirm the correct behavior of lldb wit this change.
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122768
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
Previously we added the `push_target_tripcount` function to send the
loop tripcount to the device runtime so we knew how to configure the
teams / threads for execute the loop for a teams distribute construct.
This was implemented as a separate function mostly to avoid changing the
interface for backwards compatbility. Now that we've changed it anyway
and the new interface can take an arbitrary number of arguments via the
struct without changing the ABI, we can move this to the new interface.
This will simplify the runtime by removing unnecessary state between
calls.
Depends on D128550
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D128816
This patch changes the code we generate to enter a target region on the
device. This is in-line with the new definition in the runtime that was
added previously. Additionally we implement this in the OpenMPIRBuilder
so that this code can be shared with Flang in the future.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D128550
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6.
So, this
```
for (int i=0; i<N; i++) {
res = res+i
}
```
will become
```
#pragma omp task in_reduction(+:res) if(0)
#pragma omp target map(res)
for (int i=0; i<N; i++) {
res = res+i
}
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D125669
Refactor the code that handles the align clause of 'omp allocate' so
it can be used with globals as well as local variables.
Differential Revision: https://reviews.llvm.org/D126426
This creates an entry with address=nullptr and flag=0x80.
When an 'omp_all_memory' entry is specified any other 'out' or
'inout' entries are not needed and are not passed to the runtime.
Differential Revision: https://reviews.llvm.org/D126321
Adds support for the reserved locator 'omp_all_memory' for use
in depend clauses with 'out' or 'inout' dependence-types.
Differential Revision: https://reviews.llvm.org/D125828
This adds support for variable stride with the val, uval, and ref linear
modifiers. Previously only the no modifer type ls<argno> was supported.
val -> Ls<argno>
uval -> Us<argno>
ref -> Rs<argno>
Differential Revision: https://reviews.llvm.org/D125330
Add mangling for linear parameters specified with ref, uval, and val
for 'omp declare simd' vector functions.
Add missing stride for linear this parameters.
Differential Revision: https://reviews.llvm.org/D125269
If alignment specified with align clause is less than natural alignment for
list item type, the alignment should be set to the natural alignment.
See OMP5.1 specification, page 185, lines 7-10
Differential Revision: https://reviews.llvm.org/D124676
This patch moves the logic for generating the offloading entries to the
OpenMPIRBuilder. This makes it easier to re-use in other places, such as
for OpenMP support in Flang or using the same method for generating
offloading entires for other languages like Cuda.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D123460
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kenels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.
Depends on D122504
Fixes#54091
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122515
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D122504
Currently the device kernels all have weak linkage to prevent linkage
errors on multiple defintions. However, this prevents some optimizations
from adequately analyzing them because of the nature of weak linkage.
This patch replaces the weak linkage with weak_odr linkage so we can
statically assert that multiple declarations of the same kernel will
have the same definition.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122443
Current clang generates extra set of simd variant function attribute
with extra 'v' encoding.
For example:
_ZGVbN2v__Z5add_1Pf vs _ZGVbN2vv__Z5add_1Pf
The problem is due to declaration of ParamAttrs following:
llvm::SmallVector<ParamAttrTy, 8> ParamAttrs(ParamPositions.size());
where ParamPositions.size() is grown after following assignment:
Pos = ParamPositions[PVD];
So the PVD is not find in ParamPositions.
The problem is ParamPositions need to set for each FD decl. To fix this
Move ParamPositions's init inside while loop for each FD.
Differential Revision: https://reviews.llvm.org/D122338
Currently we create offloading entries to register device variables with
the host. When we register a variable we will look up the symbol in the
device image and map the device address to the host address. This is a
problem when the symbol is declared with hidden visibility or internal
linkage. This means the symbol is not accessible externally and we
cannot get its address. We should still allow static variables to be
declared on the device, but ew should not create an offloading entry for
them so they exist independently on the host and device.
Fixes#54309
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D122352