Commit Graph

725 Commits

Author SHA1 Message Date
Kazu Hirata bb666c6930 [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 11:13:43 -08:00
Jan Sjodin 2166d9529a [OpenMP][OMPIRBuilder] Migrate target outlined function registration to OMPIRBuilder from clang
This patch moves the outlined function registration, function attribute
configuration and function ID creation to the OpenMPIRBuilder. This will later
be used by flag as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D137587
2022-11-29 14:20:34 -05:00
Jan Sjodin 2aa338f68e [OpenMP][OMPIRBuilder] Mirgrate getName from clang to OMPIRBuilder
This change moves the getName function from clang and moves the separator class
members from CGOpenMPRuntime into OMPIRBuilder. Also enusre all the getters
in the config class are const.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D137725
2022-11-24 10:11:13 -05:00
Jan Sjodin 969d787a47 [OpenMP][OMPIRBuilder] Add a configuration class that captures flags that affect codegen
This patch introudces the OpenMPIRBuilderConfig class which contains various
flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to
keep the flags in one place so they do not have to be passed in every time.
The flags can be set optionally since some uses cases don't rely on functions
that depend on these flags.

Reviewed By: jdoerfert, tschuett

Differential Revision: https://reviews.llvm.org/D138220
2022-11-22 09:25:04 -05:00
Doru Bercea 98bfd7f976 Fix declare target implementation to support enter. 2022-11-17 17:35:53 -06:00
Akash Banerjee 87f652d31f Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder.
This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces it's uses with OMPBuilder::getOrCreateInternalVariable. Also refactors OMPBuilder::getOrCreateInternalVariable to change type of name from Twine to StringRef

Differential Revision: https://reviews.llvm.org/D137720
2022-11-14 17:18:10 +00:00
Jennifer Yu de14befa77 Remove redundant loads.
It is caused by regenerate captured var value when processing the
has_device_addr, the captured var value has been generated in
GenerateOpenMPCapturedVars and passed as Arg in generateInfoForCapture.
The fix just use Arg instead regenerated just same as is_device_ptr
2022-11-04 15:22:25 -07:00
Mike Rice c954cfeb57 Some uses of the preprocessor can result in multiple target regions on the
same line. Cases such as those in the associated lit tests, can now be
supported.

This adds a 'Count' field to TargetRegionEntryInfo to differentiate
regions with the same source position.

The OffloadEntriesInfoManager routines are updated to maintain a count of
regions seen at a location. The registration of regions proceeds that same as
before, but now the next available count is always determined and used in the
offload entry.

Fixes: https://github.com/llvm/llvm-project/issues/52707

Differential Revision: https://reviews.llvm.org/D134816
2022-11-04 12:54:22 -07:00
Jan Sjodin 9ea2b150b5 [OpenMP][OMPIRBuilder] Migrate createOffloadEntriesAndInfoMetadata from clang to OpenMPIRBuilder
This patch moves the createOffloadEntriesAndInfoMetadata to OpenMPIRBuilder,
the createOffloadEntry helper function. The clang specific error handling is
invoked using a callback. This code will also be used by flang in the future.
2022-11-03 10:27:44 -04:00
Akash Banerjee a3463a9f5c [OpenMP][OpenMPIRBuilder] Migrate loadOffloadInfoMetadata from clang to OMPIRbuilder
This patch moves the implementation of the loadOffloadInfoMetadata to the OMPIRbuilder.

Differential Revision: https://reviews.llvm.org/D136872
2022-11-02 18:54:25 +00:00
Jan Sjodin 67f8521cd4 [OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info
Re-apply of: 3d0e9edd8e
Reverted in: 0cb65b0a58

A function parameter was using the wrong type 'llvm::TargetRegion' instead of
'const llvm:: TargetRegion&', which caused the error in the address sanitizer.
The correct type is now used.

This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.

Reviewed By: jdoerfert, mikerice

Differential Revision: https://reviews.llvm.org/D136601
2022-10-31 10:49:44 -04:00
Kevin Athey 0cb65b0a58 Revert "[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info"
This reverts commit 3d0e9edd8e.

Breaking HWASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/236/builds/786

Shown by targetted builds breaking at this patch:
Built at this patch: https://lab.llvm.org/buildbot/#/builders/236/builds/803
Built at prior patch: https://lab.llvm.org/buildbot/#/builders/236/builds/804
2022-10-27 13:57:25 -07:00
Jan Sjodin 3d0e9edd8e [OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.

Reviewed By: jdoerfert, mikerice

Differential Revision: https://reviews.llvm.org/D136601
2022-10-25 11:15:36 -04:00
Prabhdeep Singh Soni 6149589127 [OMPIRBuilder] Support depend clause for task
This patch adds support for the `depend` clause for the `task`
construct.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135695
2022-10-19 13:11:43 -04:00
Joseph Huber 8c1449a84d [OpenMP] Make kernels have protected visibility
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D136198
2022-10-18 16:37:28 -05:00
Jan Sjodin dd3d8ddb5f [OpenMP][OpenMPIRBuilder] Migrate OffloadEntriesInfoManager from clang to OMPIRbuilder
This patch moves the implementation of the OffloadEntriesInfoManager
to the OMPIRbuilder. This class will later be used by flang as well.

    Reviewed By: jdoerfert

    Differential Revision: https://reviews.llvm.org/D135786
2022-10-16 08:32:40 -04:00
Jan Sjodin 4627cef113 [OpenMP][OMPIRBuilder] Migrate emitOffloadingArraysArgument from clang
This patch moves the emitOffloadingArraysArgument function and
supporting data structures to OpenMPIRBuilder. This will later be used
in flang as well. The TargetDataInfo class was split up into generic
information and clang-specific data, which remain in clang. Further
migration will be done in in the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D134662
2022-10-07 07:03:03 -05:00
Jennifer Yu 30cc712eb6 [Clang][OpenMP] Fix run time crash when use_device_addr is used.
It is data mapping ordering problem.

According omp spec
If one or more map clauses are present, the list item conversions that
are performed for any use_device_ptr or use_device_addr clause occur
after all variables are mapped on entry to the region according to those
map clauses.

The change is to put mapping data for use_device_addr at end of data
mapping array.

Differential Revision: https://reviews.llvm.org/D134556
2022-09-27 11:53:57 -07:00
Jennifer Yu 48ffd40ba2 [Clang][OpenMP] Codegen generation for has_device_addr claues.
This patch add codegen support for the has_device_addr clause. It use
the same logic of is_device_ptr. But passing &var instead pointer to var
to kernal.

Differential Revision: https://reviews.llvm.org/D134268
2022-09-20 21:12:30 -07:00
Ron Lieberman d5b5289561 revert 684f76643 [Clang][OpenMP] Codegen generation for has_device_addr claues.
breaks amdgpu buildbot
2022-09-20 01:37:27 +00:00
Jennifer Yu 684f766431 [Clang][OpenMP] Codegen generation for has_device_addr claues.
Summary: This patch add codegen support for the has_device_addr clause.  It
use the same logic of is_device_ptr.

Differential Revision: https://reviews.llvm.org/D134186
2022-09-19 16:14:57 -07:00
Dhruva Chakrabarti 839ac62c50 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 7539e9cf81.
2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis 7539e9cf81 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6, ABataev

Differential Revision: https://reviews.llvm.org/D102107
2022-09-15 00:54:05 +00:00
Vitaly Buka 36c9f5a58b [NFC][OpenMP] Simplify 2f9be69d84 2022-08-17 18:59:48 -07:00
Vitaly Buka 2f9be69d84 [OpenMP] Fix another after scope after D129608
https://lab.llvm.org/buildbot/#/builders/5/builds/26770
2022-08-13 12:13:54 -07:00
Vitaly Buka f385eaf48f [OpenMP] Fix use after scope after D129608
Broken builder https://lab.llvm.org/buildbot/#/builders/5/builds/26764
2022-08-13 09:40:51 -07:00
Jennifer Yu 2ca27206f9 [OpenMP] Fix segmentation fault when data field is used in is_device_pt
Currently, the field just emit map info for this pointer variable. It is
failed at run time. For the fields, the PartialStruct is created and it
needs call to emitCombinedEntry which create the base that covers all
the pieces.

The change is to generate map info as regular fields.

Differential Revision: https://reviews.llvm.org/D129608
2022-08-12 17:10:26 -07:00
Corentin Jabot 127bf44385 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

at the request of @shafik, i can confirm the correct behavior of lldb wit this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-04 10:12:53 +02:00
Corentin Jabot a274219600 Revert "[Clang][C++20] Support capturing structured bindings in lambdas"
This reverts commit 44f2baa380.

Breaks self builds and seems to have conformance issues.
2022-08-03 21:00:29 +02:00
Corentin Jabot 44f2baa380 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

at the request of @shafik, i can confirm the correct behavior of lldb wit this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-03 20:00:01 +02:00
Gabriel Ravier 5674a3c880 Fixed a number of typos
I went over the output of the following mess of a command:

(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
 parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
 grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
 grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Differential Revision: https://reviews.llvm.org/D130827
2022-08-01 13:13:18 -04:00
Joseph Huber 5300263c70 [OpenMP] Add loop tripcount argument to kernel launch and remove push function
Previously we added the `push_target_tripcount` function to send the
loop tripcount to the device runtime so we knew how to configure the
teams / threads for execute the loop for a teams distribute construct.
This was implemented as a separate function mostly to avoid changing the
interface for backwards compatbility. Now that we've changed it anyway
and the new interface can take an arbitrary number of arguments via the
struct without changing the ABI, we can move this to the new interface.
This will simplify the runtime by removing unnecessary state between
calls.

Depends on D128550

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D128816
2022-07-08 14:44:16 -04:00
Joseph Huber 1fff116645 [OpenMP] Change OpenMP code generation for target region entries
This patch changes the code we generate to enter a target region on the
device. This is in-line with the new definition in the runtime that was
added previously. Additionally we implement this in the OpenMPIRBuilder
so that this code can be shared with Flang in the future.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D128550
2022-07-08 14:44:11 -04:00
Ritanya B Bharadwaj 8322fe200d Adding support for target in_reduction
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6.
So, this

```
  for (int i=0; i<N; i++) {
    res = res+i
  }
```

will become

```

   #pragma omp task in_reduction(+:res) if(0)
   #pragma omp target map(res)
   for (int i=0; i<N; i++) {
     res = res+i
   }
```

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D125669
2022-06-27 10:36:46 -05:00
Kazu Hirata ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Mike Rice 0a5cfbf7b2 [OpenMP] Use the align clause value from 'omp allocate' for globals
Refactor the code that handles the align clause of 'omp allocate' so
it can be used with globals as well as local variables.

Differential Revision: https://reviews.llvm.org/D126426
2022-05-26 09:51:48 -07:00
Mike Rice 239094cdee [OpenMP] Add codegen for 'omp_all_memory' reserved locator.
This creates an entry with address=nullptr and flag=0x80.
When an 'omp_all_memory' entry is specified any other 'out' or
'inout' entries are not needed and are not passed to the runtime.

Differential Revision: https://reviews.llvm.org/D126321
2022-05-24 15:26:23 -07:00
Mike Rice 9ba937112f [OpenMP] Add parsing/sema support for omp_all_memory reserved locator
Adds support for the reserved locator 'omp_all_memory' for use
in depend clauses with 'out' or 'inout' dependence-types.

Differential Revision: https://reviews.llvm.org/D125828
2022-05-24 10:28:59 -07:00
Mike Rice 772b0c44a4 [OpenMP] Fix mangling for linear parameters with negative stride
The 'n' character is used in place of '-' in the mangled name.

Differential Revision: https://reviews.llvm.org/D125406
2022-05-11 14:02:09 -07:00
Mike Rice 0dbaef61b5 [OpenMP] Fix mangling for linear modifiers with variable stride
This adds support for variable stride with the val, uval, and ref linear
modifiers.  Previously only the no modifer type ls<argno> was supported.

  val  -> Ls<argno>
  uval -> Us<argno>
  ref  -> Rs<argno>

Differential Revision: https://reviews.llvm.org/D125330
2022-05-10 14:12:44 -07:00
Mike Rice 1a02519bc5 [OpenMP] Add mangling support for linear modifiers (ref,uval,val)
Add mangling for linear parameters specified with ref, uval, and val
for 'omp declare simd' vector functions.

Add missing stride for linear this parameters.

Differential Revision: https://reviews.llvm.org/D125269
2022-05-10 09:56:55 -07:00
David Pagan 37471cf2c3 [clang][OpenMP] Local variable alignment incorrect with align clause
If alignment specified with align clause is less than natural alignment for
list item type, the alignment should be set to the natural alignment.

See OMP5.1 specification, page 185, lines 7-10

Differential Revision: https://reviews.llvm.org/D124676
2022-05-03 13:10:01 -07:00
Joseph Huber 643c9b22ef [OpenMP] Make generating offloading entries more generic
This patch moves the logic for generating the offloading entries to the
OpenMPIRBuilder. This makes it easier to re-use in other places, such as
for OpenMP support in Flang or using the same method for generating
offloading entires for other languages like Cuda.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D123460
2022-04-29 09:14:31 -04:00
Joseph Huber 9d3550c517 [OpenMP] Add AMDGPU calling convention to ctor / dtor functions
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kenels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.

Depends on D122504

Fixes #54091

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122515
2022-03-25 22:44:20 -04:00
Joseph Huber 3c6d32ec6c [OpenMP] Make Ctor / Dtor functions have external visibility
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D122504
2022-03-25 22:44:17 -04:00
Joseph Huber b9f67d44ba [OpenMP] Replace device kernel linkage with weak_odr
Currently the device kernels all have weak linkage to prevent linkage
errors on multiple defintions. However, this prevents some optimizations
from adequately analyzing them because of the nature of weak linkage.
This patch replaces the weak linkage with weak_odr linkage so we can
statically assert that multiple declarations of the same kernel will
have the same definition.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122443
2022-03-25 11:29:15 -04:00
Jennifer Yu a6cdac48ff Eliminate extra set of simd variant function attribute.
Current clang generates extra set of simd variant function attribute
with extra 'v' encoding.
For example:
_ZGVbN2v__Z5add_1Pf vs _ZGVbN2vv__Z5add_1Pf
The problem is due to declaration of ParamAttrs following:
    llvm::SmallVector<ParamAttrTy, 8> ParamAttrs(ParamPositions.size());
where ParamPositions.size() is grown after following assignment:
    Pos = ParamPositions[PVD];
So the PVD is not find in ParamPositions.

The problem is ParamPositions need to set for each FD decl. To fix this

Move ParamPositions's init inside while loop for each FD.

Differential Revision: https://reviews.llvm.org/D122338
2022-03-24 13:27:28 -07:00
Dávid Bolvanský a683ba4ff5 [NFCI] Fix set-but-unused warning in CGOpenMPRuntime.cpp 2022-03-24 07:49:21 +01:00
Joseph Huber 0d16c23af1 [OpenMP] Do not create offloading entries for internal or hidden symbols
Currently we create offloading entries to register device variables with
the host. When we register a variable we will look up the symbol in the
device image and map the device address to the host address. This is a
problem when the symbol is declared with hidden visibility or internal
linkage. This means the symbol is not accessible externally and we
cannot get its address. We should still allow static variables to be
declared on the device, but ew should not create an offloading entry for
them so they exist independently on the host and device.

Fixes #54309

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122352
2022-03-23 18:27:16 -04:00
Nikita Popov c070d5ceff [CGOpenMPRuntime] Remove uses of deprecated Address constructor
And as these are the last remaining uses, also remove the
constructor itself.
2022-03-23 12:40:44 +01:00