Use unique_ptr to manage the lifetime of ABIInfo member inside TargetCodeGenInfo.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D79033
After speaking with Craig Topper about some recent defects, he pointed
out that _ExtInts should be passed indirectly if larger than the largest
int register, and like ints when smaller than that. This patch
implements that.
Note that this changed the way vaargs worked quite a bit, but they still
work.
Differential Revision: https://reviews.llvm.org/D78785
Summary:
Change the default ABI to be compatible with GCC. For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no change for
32-bit Linux, where Clang continues to return all structs in memory.
Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc. These options are only for PPC32; reject them on PPC64 and
other targets. The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.
To actually return a struct in registers, coerce it to an integer of the
same size. LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.
Fixes PR#40736
Patch by George Koehler!
Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, efriedma, krememek
Reviewed By: sdesmalen, efriedma
Subscribers: dexonsmith, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77257
Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
surface/texture reference support. So far, both the host and device
use the same reference type, which could be revised later when
interface/implementation is stablized.
Reviewers: yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77583
Summary:
- Even though the bindless surface/texture interfaces are promoted,
there are still code using surface/texture references. For example,
[PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
compilation issue for code using `tex2D` with texture references. For
better compatibility, this patch proposes the support of
surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
`nvcc` does use builtins for texture support. From the limited NVVM
documentation[^nvvm] and NVPTX backend texture/surface related
tests[^test], it's believed that surface/texture references are
supported by replacing their reference types, which are annotated with
`device_builtin_surface_type`/`device_builtin_texture_type`, with the
corresponding handle-like object types, `cudaSurfaceObject_t` or
`cudaTextureObject_t`, in the device-side compilation. On the host
side, that global handle variables are registered and will be
established and updated later when corresponding binding/unbinding
APIs are called[^bind]. Surface/texture references are most like
device global variables but represented in different types on the host
and device sides.
- In this patch, the following changes are proposed to support that
behavior:
+ Refine `device_builtin_surface_type` and
`device_builtin_texture_type` attributes to be applied on `Type`
decl only to check whether a variable is of the surface/texture
reference type.
+ Add hooks in code generation to replace that reference types with
the correponding object types as well as all accesses to them. In
particular, `nvvm.texsurf.handle.internal` should be used to load
object handles from global reference variables[^texsurf] as well as
metadata annotations.
+ Generate host-side registration with proper template argument
parsing.
---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later.
Reviewers: tra, rjmccall, yaxunl, a.sidorin
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76365
This has no effect on how LLVM passes the arguments, but it prevents
rewriteWithInAlloca from thinking that these parameters should be part
of the inalloca pack.
Follow-up to D72114
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D74452
This brings back 2af74e27ed and reverts
eaabaf7e04.
The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
struct alignas(uint64_t) Wrapper { uint64_t P };
void f(uint64_t p);
void f(Wrapper p);
MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
Summary:
For now, this ABI simply expands all possible aggregate arguments and
returns all possible aggregates directly. This ABI will change rapidly
as we prototype and benchmark a new ABI that takes advantage of
multivalue return and possibly other changes from the MVP ABI.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72972
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned
However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.
The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
passed indirectly
Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.
There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors
For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.
For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.
Related to D71915 and D72110
Fixes most of PR44395
Reviewed By: rjmccall, craig.topper, erichkeane
Differential Revision: https://reviews.llvm.org/D72114
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases. The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:
void __vectorcall vectorcall_indirect_vec(
double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
__m128 xmm5,
__m128 ecx,
int edx,
__m128 mem);
classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.
I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.
Reviewers: erichkeane, craig.topper
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72110
Summary:
Previously, since these aggregates are > 2*XLen, Clang would think they
were being returned indirectly and thus would decrease the number of
available GPRs available by 1. For long argument lists this could lead
to a struct argument incorrectly being passed indirectly.
Reviewers: asb, lenary
Reviewed By: asb, lenary
Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69590
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.
Also adds support for operand modifiers in Intel-style inline assembly.
Reviewers: rnk
Reviewed By: rnk
Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71677
This is equivalent to the existing `import_name` and `import_module`
attributes which control the import names in the final wasm binary
produced by lld.
This maps the existing
This attribute currently requires a string rather than using the
symbol name for a couple of reasons:
1. Avoid confusion with static and dynamic linking which is
based on symbol name. Exporting a function from a wasm module using
this directive is orthogonal to both static and dynamic linking.
2. Avoids name mangling.
Differential Revision: https://reviews.llvm.org/D70520
AggValueSlot
This reapplies 8a5b7c3570 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.
Original commit message:
This is needed for the pointer authentication work we plan to do in the
near future.
a63a81bd99/clang/docs/PointerAuthentication.rst
Summary:
- As variadic parameters have the lowest rank in overload resolution,
without real usage of `va_arg`, they are commonly used as the
catch-all fallbacks in SFINAE. As the front-end still reports errors
on calls to `va_arg`, the declaration of functions with variadic
arguments should be allowed in general.
Reviewers: jlebar, tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69389
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly and if not assert will fire for us.
llvm-svn: 373918
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<RecordType> directly and if not assert will fire for us.
llvm-svn: 373584
As far as I can tell, gcc passes 256/512 bit vectors __int128 in memory. And passes a vector of 1 _int128 in an xmm register. The backend considers <X x i128> as an illegal type and will scalarize any arguments with that type. So we need to coerce the argument types in the frontend to match to avoid the illegal type.
I'm restricting this to change to Linux and NetBSD based on the
how similar ABI changes have been handled in the past.
PS4, FreeBSD, and Darwin are unaffected. I've also added a
new -fclang-abi-compat version to restore the old behavior.
This issue was identified in PR42607. Though even with the types changed, we still seem to be doing some unnecessary stack realignment.
llvm-svn: 371169
Summary:
r337347 added support for the Signal Processing Engine (SPE) to LLVM.
This follows that up with the clang side.
This adds -mspe and -mno-spe, to match GCC.
Subscribers: nemanjai, kbarton, cfe-commits
Differential Revision: https://reviews.llvm.org/D49754
llvm-svn: 371066
Summary:
D66168 passes size 0 structs indirectly, while the wasm backend expects it to
be passed directly. This causes subsequent variadic arguments to be read
incorrectly.
This diff changes it so that size 0 structs are passed directly.
Reviewers: dschuff, tlively, sbc100
Reviewed By: dschuff
Subscribers: jgravelle-google, aheejin, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66255
llvm-svn: 369042
Summary:
In the WebAssembly backend, when lowering variadic function calls, non-single
member aggregate type arguments are always passed by pointer.
However, when emitting va_arg code in clang, the arguments are instead read as
if they are passed directly. This results in the pointer being read as the
actual structure.
Fixes https://github.com/emscripten-core/emscripten/issues/9042.
Reviewers: tlively, sbc100, kripken, aheejin, dschuff
Reviewed By: dschuff
Subscribers: dschuff, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66168
llvm-svn: 368750
The RISC-V hard float calling convention requires the frontend to:
* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s)
* Track usage of GPRs and FPRs in order to gate the above, and to
determine when signext/zeroext attributes must be added to integer
scalars
This patch attempts to do this in compliance with the documented ABI,
and uses ABIArgInfo::CoerceAndExpand in order to do this. @rjmccall, as
author of that code I've tagged you as reviewer for initial feedback on
my usage.
Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.
Re-landed after backing out 366450 due to missed hunks.
Differential Revision: https://reviews.llvm.org/D60456
llvm-svn: 366480
The RISC-V hard float calling convention requires the frontend to:
* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s) * Track usage of GPRs and FPRs in order to gate the
above, and to
determine when signext/zeroext attributes must be added to integer
scalars
This patch attempts to do this in compliance with the documented ABI,
and uses ABIArgInfo::CoerceAndExpand in order to do this. @rjmccall, as
author of that code I've tagged you as reviewer for initial feedback on
my usage.
Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.
Differential Revision: https://reviews.llvm.org/D60456
llvm-svn: 366450
To enable a new implicit kernel argument,
increased the number of argument bytes from 48 to 56.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D63756
llvm-svn: 365643
This patch introduces support of hip_pinned_shadow variable for HIP.
A hip_pinned_shadow variable is a global variable with attribute hip_pinned_shadow.
It has external linkage on device side and has no initializer. It has internal
linkage on host side and has initializer or static constructor. It can be accessed
in both device code and host code.
This allows HIP runtime to implement support of HIP texture reference.
Differential Revision: https://reviews.llvm.org/D62738
llvm-svn: 364381
This introduced MMX instructions in code that wasn't previously using
them, breaking programs using 64-bit vectors and x87 floating-point in
the same application. See discussion on the code review for more
details.
> According to System V i386 ABI: the __m64 type paramater and return
> value are passed by MMX registers. But current implementation treats
> __m64 as i64 which results in parameter passing by stack and returning
> by EDX and EAX.
>
> This patch fixes the bug (https://bugs.llvm.org/show_bug.cgi?id=41029)
> for Linux and NetBSD.
>
> Patch by Wei Xiao (wxiao3)
>
> Differential Revision: https://reviews.llvm.org/D59744
llvm-svn: 363790
If the host uses 128 bit long doubles, the compiler should generate correct code for NVPTX devices. If the return type has 128 bit long doubles, in LLVM IR this type must be coerced to int array instead.
llvm-svn: 363720
Summary:
When a function argument or return type is a homogeneous aggregate
which contains an FP16 vector but the target does not support FP16
operations natively, the type must be converted into an array of
integer vectors by then front end (otherwise LLVM will handle FP16
vectors incorrectly by scalarizing them and promoting FP16 to float,
see https://reviews.llvm.org/D50507).
Currently the logic for checking whether or not a given homogeneous
aggregate contains FP16 vectors is incorrect: it only looks at the
type of the first vector.
This patch fixes the issue by adding a new method
ARMABIInfo::containsAnyFP16Vectors and using it. The traversal logic
of this method is largely the same as in
ABIInfo::isHomogeneousAggregate.
Reviewers: eli.friedman, olista01, ostannard
Reviewed By: ostannard
Subscribers: ostannard, john.brawn, javed.absar, kristof.beyls, pbarrio, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63437
llvm-svn: 363687
Enable 48-bytes of implicit arguments for HIP as well. Earlier it was enabled for OpenCL. This code is specific to AMDGPU target.
Differential Revision: https://reviews.llvm.org/D62244
llvm-svn: 363414
According to System V i386 ABI: the __m64 type paramater and return
value are passed by MMX registers. But current implementation treats
__m64 as i64 which results in parameter passing by stack and returning
by EDX and EAX.
This patch fixes the bug (https://bugs.llvm.org/show_bug.cgi?id=41029)
for Linux and NetBSD.
Patch by Wei Xiao (wxiao3)
Differential Revision: https://reviews.llvm.org/D59744
llvm-svn: 363116
According to i386 System V ABI 2.1: Structures and unions assume the
alignment of their most strictly aligned component. But current
implementation always takes them as 4-byte aligned which will result
in incorrect code, e.g:
1 #include <immintrin.h>
2 typedef union {
3 int d[4];
4 __m128 m;
5 } M128;
6 extern void foo(int, ...);
7 void test(void)
8 {
9 M128 a;
10 foo(1, a);
11 foo(1, a.m);
12 }
The first call (line 10) takes the second arg as 4-byte aligned while
the second call (line 11) takes the second arg as 16-byte aligned.
There is oxymoron for the alignment of the 2 calls because they should
be the same.
This patch fixes the bug by following i386 System V ABI and apply it to
Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't
want to spend any effort dealing with the ramifications of ABI breaks
at present.
Patch by Wei Xiao (wxiao3)
Differential Revision: https://reviews.llvm.org/D60748
llvm-svn: 361934
Overaligned and underaligned types (i.e. types where the alignment has been
increased or decreased using the aligned and packed attributes) weren't being
correctly handled in all cases, as the unadjusted alignment should be used.
This patch also adjusts getTypeUnadjustedAlign to correctly handle typedefs of
non-aggregate types, which it appears it never had to handle before.
Differential Revision: https://reviews.llvm.org/D62152
llvm-svn: 361372
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.
Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.
The design goals were to provide:
- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
environments (MSVC in particular).
Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.
In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:
1. There are no competing defined symbols in a given set of libraries, or
if they exist, the program owner doesn't care which is linked to their
program.
2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:
.section ".deplibs","MS",@llvm_dependent_libraries,1
.asciz "foo"
For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.
LLD processes the dependent library specifiers in the following way:
1. Dependent libraries which are found from the specifiers in .deplibs sections
of relocatable object files are added when the linker decides to include that
file (which could itself be in a library) in the link. Dependent libraries
behave as if they were appended to the command line after all other options. As
a consequence the set of dependent libraries are searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
strings in a .deplibs section by; first, handling the string as if it was
specified on the command line; second, by looking for the string in each of the
library search paths in turn; third, by looking for a lib<string>.a or
lib<string>.so (depending on the current mode of the linker) in each of the
library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
dependent libraries.
Rationale for the above points:
1. Adding the dependent libraries last makes the process simple to understand
from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
will affect all of the dependent libraries. There is a potential problem of
surprise for developers, who might not realize that these options would apply
to these "invisible" input files; however, despite the potential for surprise,
this is easy for developers to reason about and gives developers the control
that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
find input files. The different search methods are tried by the linker in most
obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
is not necessary: if finer control is required developers can fall back to using
the command line directly.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.
Differential Revision: https://reviews.llvm.org/D60274
llvm-svn: 360984
Summary:
- `__constant__` variables should not be `hidden` as the linker may turn
them into `LOCAL` symbols.
Reviewers: yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61194
llvm-svn: 359344
with notail on x86-64.
On x86-64, the epilogue code inserted before the tail jump blocks the
autoreleased return optimization.
rdar://problem/38675807
Differential Revision: https://reviews.llvm.org/D59656
llvm-svn: 356705
This allows the global visibility controls to be restrictive while still
populating the dynamic symbol table where required.
Differential Revision: https://reviews.llvm.org/D56871
llvm-svn: 353870
The various EltSize, Offset, DataLayout, and StructLayout arguments
are all computable from the Address's element type and the DataLayout
which the CGBuilder already has access to.
After having previously asserted that the computed values are the same
as those passed in, now remove the redundant arguments from
CGBuilder's Create*GEP functions.
Differential Revision: https://reviews.llvm.org/D57767
llvm-svn: 353629
Some of these functions take some extraneous arguments, e.g. EltSize,
Offset, which are computable from the Type and DataLayout.
Add some asserts to ensure that the computed values are consistent
with the passed-in values, in preparation for eliminating the
extraneous arguments. This also asserts that the Type is an Array for
the calls named "Array" and a Struct for the calls named "Struct".
Then, correct a couple of errors:
1. Using CreateStructGEP on an array type. (this causes the majority
of the test differences, as struct GEPs are created with i32
indices, while array GEPs are created with i64 indices)
2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on
x86-64 NACL (which uses 32-bit pointers).
Differential Revision: https://reviews.llvm.org/D57766
llvm-svn: 353529
This is similar to import_module, but sets the import field name
instead.
By default, the import field name is the same as the C/asm/.o symbol
name. However, there are situations where it's useful to have it be
different. For example, suppose I have a wasm API with a module named
"pwsix" and a field named "read". There's no risk of namespace
collisions with user code at the wasm level because the generic name
"read" is qualified by the module name "pwsix". However in the C/asm/.o
namespaces, the module name is not used, so if I have a global function
named "read", it is intruding on the user's namespace.
With the import_field module, I can declare my function (in libc) to be
"__read", and then set the wasm import module to be "pwsix" and the wasm
import field to be "read". So at the C/asm/.o levels, my symbol is
outside the user namespace.
Differential Revision: https://reviews.llvm.org/D57602
llvm-svn: 352930
This adds a C/C++ attribute which corresponds to the LLVM IR wasm-import-module
attribute. It allows code to specify an explicit import module.
Differential Revision: https://reviews.llvm.org/D57160
llvm-svn: 352106
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
* Accept as an argument constants in range 0..63 (aligned with TI headers and linker scripts provided with TI GCC toolchain).
* Emit function attribute 'interrupt'='xx' instead of aliases (used in the backend to create a section for particular interrupt vector).
* Add more diagnostics.
Patch by Kristina Bessonova!
Differential Revision: https://reviews.llvm.org/D56663
llvm-svn: 351344
Summary:
- This adopts SwiftABIInfo as the base class for WebAssemblyABIInfo, which is in keeping with what is done for other targets for which Swift is supported.
- This is a minimal patch to unblock exploration of WASM support for Swift (https://bugs.swift.org/browse/SR-9307)
Reviewers: rjmccall, sunfish
Reviewed By: rjmccall
Subscribers: ahti, dschuff, sbc100, jgravelle-google, aheejin, cfe-commits
Differential Revision: https://reviews.llvm.org/D56188
llvm-svn: 350372
For arguments, pass it indirectly, since the ABI doc says pretty clearly
that arguments larger than 8 bytes are passed indirectly. This makes
va_list handling easier, anyway.
When returning, GCC returns in XMM0, and we match them.
Fixes PR39492.
llvm-svn: 345676
Adds support for -mno-stack-arg-probe and -mstack-probe-size.
(Not really happy copy-pasting code, but that's what we do for all the
other Windows targets.)
Differential Revision: https://reviews.llvm.org/D53617
llvm-svn: 345354
Summary:
On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:
void foo(int, ...);
typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
void bar(float16x4_t x) {
foo(42, x);
}
According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.
Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.
One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).
Reviewers: eli.friedman, olista01, SjoerdMeijer, javed.absar, efriedma
Reviewed By: javed.absar, efriedma
Subscribers: efriedma, kristof.beyls, cfe-commits, chrib
Differential Revision: https://reviews.llvm.org/D50507
llvm-svn: 342034
- Add a command line options -msign-return-address to enable return address
signing
- Armv8.3a added instructions to sign the return address to help mitigate
against ROP attacks
- This patch adds command line options to generate function attributes that
signal to the back whether return address signing instructions should be
added
Differential revision: https://reviews.llvm.org/D49793
llvm-svn: 340019
The "Procedure Call Procedure Call Standard for the ARM® Architecture"
(https://static.docs.arm.com/ihi0042/f/IHI0042F_aapcs.pdf), specifies that
composite types are passed according to their "natural alignment", i.e. the
alignment before alignment adjustment on the entire composite is applied.
The same applies for AArch64 ABI.
Clang, however, used the adjusted alignment.
GCC already implements the ABI correctly. With this patch Clang becomes
compatible with GCC and passes such arguments in accordance with AAPCS.
Differential Revision: https://reviews.llvm.org/D46013
llvm-svn: 338279
Summary:
Clang supports the GNU style ``__attribute__((interrupt))`` attribute on RISCV targets.
Permissible values for this parameter are user, supervisor, and machine.
If there is no parameter, then it defaults to machine.
Reference: https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html
Based on initial patch by Zhaoshi Zheng.
Reviewers: asb, aaron.ballman
Reviewed By: asb, aaron.ballman
Subscribers: rkruppe, the_o, aaron.ballman, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, cfe-commits
Differential Revision: https://reviews.llvm.org/D48412
llvm-svn: 338045
Update clang to treat fp128 as a valid base type for homogeneous aggregate
passing and returning.
Differential Revision: https://reviews.llvm.org/D48044
llvm-svn: 336308
The WebAssembly backend in particular benefits from being
able to distinguish between varargs functions (...) and prototype-less
C functions.
Differential Revision: https://reviews.llvm.org/D48443
llvm-svn: 335510
Currently clang set kernel calling convention for CUDA/HIP after
arranging function, which causes incorrect kernel function type since
it depends on calling convention.
This patch moves setting kernel convention before arranging
function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447
The force_align_arg_pointer attribute was using a hardcoded 16-byte
alignment value which in combination with -mstack-alignment=32 (or
larger) would produce a misaligned stack which could result in crashes
when accessing stack buffers using aligned AVX load/store instructions.
Fix the issue by using the "stackrealign" function attribute instead
of using a hardcoded 16-byte alignment.
Patch By: Gramner
Differential Revision: https://reviews.llvm.org/D45812
llvm-svn: 330331
This reverts r328795 which introduced an issue with referencing __global__
function templates. More details in the original review D44747.
llvm-svn: 329099
This patch sets target specific calling convention for CUDA kernels in IR.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44747
llvm-svn: 328795
ObjC and ObjC++ pass non-trivial structs in a way that is incompatible
with each other. For example:
typedef struct {
id f0;
__weak id f1;
} S;
// this code is compiled in c++.
extern "C" {
void foo(S s);
}
void caller() {
// the caller passes the parameter indirectly and destructs it.
foo(S());
}
// this function is compiled in c.
// 'a' is passed directly and is destructed in the callee.
void foo(S a) {
}
This patch fixes the incompatibility by passing and returning structs
with __strong or weak fields using the C ABI in C++ mode. __strong and
__weak fields in a struct do not cause the struct to be destructed in
the caller and __strong fields do not cause the struct to be passed
indirectly.
Also, this patch fixes the microsoft ABI bug mentioned here:
https://reviews.llvm.org/D41039?id=128767#inline-364710
rdar://problem/38887866
Differential Revision: https://reviews.llvm.org/D44908
llvm-svn: 328731
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting. This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.
llvm-svn: 328636
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
Differential Revision: https://reviews.llvm.org/D44696
llvm-svn: 328350
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use a function attribute to communicate to the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D43735
llvm-svn: 328347
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
This recommits r327206, which was reverted because it caused
module-enabled builders to fail. I discovered that the
CXXRecordDecl::CanPassInRegisters flag wasn't being set correctly in
some cases after I moved it to RecordDecl.
Thanks to Eric Liu for helping me investigate the bug.
rdar://problem/33599681
https://reviews.llvm.org/D44095
llvm-svn: 327870
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D44095
llvm-svn: 327206
Summary:
This patch is a fix for following issue:
https://bugs.llvm.org/show_bug.cgi?id=31362 The problem was caused by front end
lowering C calling conventions without taking into account calling conventions
enforced by attribute. In this case win64cc was no correctly lowered on targets
other than Windows.
Reviewed By: rnk (Reid Kleckner)
Differential Revision: https://reviews.llvm.org/D43016
Author: belickim <mateusz.belicki@intel.com>
llvm-svn: 324594
I found this while looking at the ppc failures caused by the dso_local
change.
The issue was that the patch would produce the wrong answer for
available_externally. Having ForDefinition_t available in places where
the code can just check the linkage is a bit of a foot gun.
This patch removes the ForDefiniton_t argument in places where the
linkage is already know.
llvm-svn: 324499
When trying to track down a different bug, we discovered
that calling __builtin_va_arg on a vec3f type caused
the SROA pass to issue a warning that there was an illegal
access.
Further research showed that the vec3f type is
alloca'ed as size '12', but the _builtin_va_arg code
on x86_64 was always loading this out of registers as
{double, double}. Thus, the 2nd store into the vec3f
was storing in bytes 12-15!
This patch alters the original implementation which always
assumed {double, double} to use the actual coerced type
instead, so the LLVM-IR generated is a load/GEP/store of
a <2 x float> and a float, rather than a double and a double.
Tests were added for all combinations I could think of that
would fit in 2 FP registers, and all work exactly as expected.
Differential Revision: https://reviews.llvm.org/D42811
llvm-svn: 324098
Pass and return _Float16 as if it were an int or float for ARM, but with the
top 16 bits unspecified, similarly like we already do for __fp16.
We will implement proper half-precision function argument lowering in the ARM
backend soon, but want to use this workaround in the mean time.
Differential Revision: https://reviews.llvm.org/D42318
llvm-svn: 323185
RISCVABIInfo is implemented in terms of XLen, supporting both RV32 and RV64.
Unfortunately we need to count argument registers in the frontend in order to
determine when to emit signext and zeroext attributes. Integer scalars are
extended according to their type up to 32-bits and then sign-extended to XLen
when passed in registers, but are anyext when passed on the stack. This patch
only implements the base integer (soft float) ABIs.
For more information on the RISC-V ABI, see [the ABI
doc](https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md),
my [golden model](https://github.com/lowRISC/riscv-calling-conv-model), and
the [LLVM RISC-V calling convention
patch](https://reviews.llvm.org/D39898#2d1595b4) (specifically the comment
documenting frontend expectations).
Differential Revision: https://reviews.llvm.org/D40023
llvm-svn: 322494
As @rjmccall suggested in D40023, we can get rid of
ABIInfo::shouldSignExtUnsignedType (used to handle cases like the Mips calling
convention where 32-bit integers are always sign extended regardless of the
sign of the type) by adding a SignExt field to ABIArgInfo. In the common case,
this new field is set automatically by ABIArgInfo::getExtend based on the sign
of the type. For targets that want greater control, they can use
ABIArgInfo::getSignExtend or ABIArgInfo::getZeroExtend when necessary. This
change also cleans up logic in CGCall.cpp.
There is no functional change intended in this patch, and all tests pass
unchanged. As noted in D40023, Mips might want to sign-extend unsigned 32-bit
integer return types. A future patch might modify
MipsABIInfo::classifyReturnType to use MipsABIInfo::extendType.
Differential Revision: https://reviews.llvm.org/D41999
llvm-svn: 322396
Darwin uses char * for the variadic list type (va_list). We use the PPC
SVR4 ABI for PPC, which uses a structure type for the va_list. When
constructing the GEP, we would fail due to the incorrect handling for
the va_list. Correct this to use the right type.
llvm-svn: 316599
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
This change will make it possible to use -fsanitize=function on Darwin and
possibly on other platforms. It fixes an issue with the way RTTI is stored into
function prologue data.
On Darwin, addresses stored in prologue data can't require run-time fixups and
must be PC-relative. Run-time fixups are undesirable because they necessitate
writable text segments, which can lead to security issues. And absolute
addresses are undesirable because they break PIE mode.
The fix is to create a private global which points to the RTTI, and then to
encode a PC-relative reference to the global into prologue data.
Differential Revision: https://reviews.llvm.org/D37597
llvm-svn: 313096
This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
llvm-svn: 311823
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643
The comment markers accepted by the assembler vary between different targets,
but '//' is always accepted, so we should use that for consistency.
Differential revision: https://reviews.llvm.org/D36666
llvm-svn: 311325
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
llvm-svn: 310704
This is an improvement over always using byval for
structs.
This will use registers until ~16 are used, and then
switch back to byval. This needs more work, since I'm
not sure it ever really makes sense to use byval. If
the register limit is exceeded, the arguments still
end up passed on the stack, but with a different ABI.
It also may make sense to base this on number of
registers used for non-struct arguments, rather than
just arguments that appear first in the argument list.
llvm-svn: 310527
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
This option when combined with -mgpopt and -membedded-data places all
uninitialized constant variables in the read-only section.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D35917
llvm-svn: 309940
The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
helper functions in 4.1.2 The floating-point helper functions. These
functions always use the base PCS (soft-fp). However helper functions
defined outside of this document such as the complex-number multiply and
divide helpers are not covered by this requirement and should use
hard-float PCS if the target is hard-float as both compiler-rt and libgcc
for a hard-float sysroot implement these functions with a hard-float PCS.
All of the floating point helper functions that are explicitly soft float
are expanded in the llvm ARM backend. This change makes clang not force the
BuiltinCC to AAPCS for AAPCS_VFP. With this change the ARM compiler-rt
tests involving _Complex pass with both hard-fp and soft-fp targets.
Differential Revision: https://reviews.llvm.org/D35538
llvm-svn: 309257
This change is part of the RegCall calling convention support for LLVM.
Existing RegCall implementation was extended to include correct handling of
Complex Long Double type. Complex long double types should be returned/passed
in memory and not register stack. This patch implements this behavior.
Patch by: eandrews
Differential Revision: https://reviews.llvm.org/D35259
llvm-svn: 308769
This patch adds support for the `long_call`, `far`, and `near` attributes
for MIPS targets. The `long_call` and `far` attributes are synonyms. All
these attributes override `-mlong-calls` / `-mno-long-calls` command
line options for particular function.
Differential revision: https://reviews.llvm.org/D35479
llvm-svn: 308667
Certain targets (e.g. amdgcn) require global variable to stay in global or constant address
space. In C or C++ global variables are emitted in the default (generic) address space.
This patch introduces virtual functions TargetCodeGenInfo::getGlobalVarAddressSpace
and TargetInfo::getConstantAddressSpace to handle this in a general approach.
It only affects IR generated for amdgcn target.
Differential Revision: https://reviews.llvm.org/D33842
llvm-svn: 307470
In running some internal vectorcall tests in 32 bit mode, we discovered that the
behavior I'd previously implemented for x64 (and applied to x32) regarding the
assignment of SSE registers was incorrect. See spec here:
https://msdn.microsoft.com/en-us/library/dn375768.aspx
My previous implementation applied register argument position from the x64
version to both. This isn't correct for x86, so this removes and refactors that
section. Additionally, it corrects the integer/int-pointer assignments. Unlike
x64, x86 permits integers to be assigned independent of position.
Finally, the code for 32 bit was cleaned up a little to clarify the intent,
as well as given a descriptive comment.
Differential Revision: https://reviews.llvm.org/D34455
llvm-svn: 305928
Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value.
Reviewers: Anastasia, yaxunl
Reviewed By: Anastasia
Subscribers: ahatanak, cfe-commits
Differential Revision: https://reviews.llvm.org/D34235
llvm-svn: 305796
Rationale: OpenCL kernels are called via an explicit runtime API
with arguments set with clSetKernelArg(), not as normal sub-functions.
Return SPIR_KERNEL by default as the kernel calling convention to ensure
the fingerprint is fixed such way that each OpenCL argument gets one
matching argument in the produced kernel function argument list to enable
feasible implementation of clSetKernelArg() with aggregates etc. In case
we would use the default C calling conv here, clSetKernelArg() might
break depending on the target-specific conventions; different targets
might split structs passed as values to multiple function arguments etc.
https://reviews.llvm.org/D33639
llvm-svn: 304389
This patch adds support for the `micromips` and `nomicromips` attributes
for MIPS targets.
Differential revision: https://reviews.llvm.org/D33363
llvm-svn: 303546
Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.
Differential Revision: https://reviews.llvm.org/D32248
llvm-svn: 303370
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than
16/64 bytes, depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
with minor changes (use regexp instead of the hardcoded values) to the test.
llvm-svn: 302670
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.
Differential revision: https://reviews.llvm.org/D32550
llvm-svn: 302572
Reverting
Modified MipsABIInfo::classifyArgumentType so that it now coerces
aggregate structures only if the size of said aggregate is less than 16/64
bytes, depending on the ABI.
as it broke clang-with-lto-ubuntu builder.
llvm-svn: 302555
Modified MipsABIInfo::classifyArgumentType so that it now coerces aggregate
structures only if the size of said aggregate is less than 16/64 bytes,
depending on the ABI.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D32900
llvm-svn: 302547
It turns out there are some sort-of-but-not-quite empty structs that break all
the rules. For example:
struct SuperEmpty { int arr[0]; };
struct SortOfEmpty { struct SuperEmpty e; };
Both of these have sizeof == 0, even in C++ mode, for GCC compatibility. The
first one also doesn't occupy a register when passed by value in GNU C++ mode,
unlike everything else.
On Darwin, we want to ignore the lot (and especially don't want to try to use
an i0 as we were).
llvm-svn: 302313
These two attributes specify the same info in a different way.
AMGPU BE only checks the latter as a target specific attribute
as opposed to language specific reqd_work_group_size.
This change produces amdgpu_flat_work_group_size out of
reqd_work_group_size if specified.
Differential Revision: https://reviews.llvm.org/D31728
llvm-svn: 299678
Use # as the comment leader for AArch64 auto-release elision marker.
This is to keep it in sync with the value used in swift. When building
libdispatch for Linux AArch64, the auto-release elision marker was
emitted. However, ELF uses # as the comment leader while MachO accepts
both ; and #. Use the common marker for it instead.
llvm-svn: 294877
Summary:
This teaches clang how to parse and lower the 'interrupt' and 'naked'
attributes.
This allows interrupt signal handlers to be written.
Reviewers: aaron.ballman
Subscribers: malcolm.parsons, cfe-commits
Differential Revision: https://reviews.llvm.org/D28451
llvm-svn: 294402
This comes up in V8, which has a Handle template class that wraps a
typed pointer, and is frequently passed by value. The pointer is stored
in the base, HandleBase. This change allows us to pass the struct as a
pointer instead of using byval. This avoids creating tons of temporary
allocas that we copy from during call lowering.
Eventually, it would be good to use FCAs here instead.
llvm-svn: 291917
Front end component (back end changes are D27392). The vectorcall
calling convention was broken subtly in two cases. First,
it didn't properly handle homogeneous vector aggregates (HVAs).
Second, the vectorcall specification requires that only the
first 6 parameters be eligible for register assignment.
This patch fixes both issues.
Differential Revision: https://reviews.llvm.org/D27529
llvm-svn: 291041
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.
Differential Revision: https://reviews.llvm.org/D26671
llvm-svn: 289647
In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space.
Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value.
Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast.
Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values.
This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way.
This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally.
Differential Revision: https://reviews.llvm.org/D26196
llvm-svn: 289252
This patch implements the register call calling convention, which ensures
as many values as possible are passed in registers. CodeGen changes
were committed in https://reviews.llvm.org/rL284108.
Differential Revision: https://reviews.llvm.org/D25204
llvm-svn: 285849
Enable soft-float support on PPC64, as the backend now supports it. Also, the
backend now uses -hard-float instead of +soft-float, so set the target features
accordingly.
Fixes PR26970.
llvm-svn: 283061
__attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size
__attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit
Differential Revision: https://reviews.llvm.org/D24513
llvm-svn: 282371
Move the definition of `getTriple()` into the header. It would just call
`getTarget().getTriple()`. Inline the definition to allow the compiler to see
the same amount of the layout as previously. Remove the more verbose
`getTarget().getTriple()` in favour of `getTriple()`.
llvm-svn: 281487
The PPC64 DWARF register-size table did not match the ABI specification (or
GCC, for that matter). Fix that, and add a regression test.
Fixes PR27931.
llvm-svn: 280053
Structs are currently handled as pointer + byval, which makes AMDGPU
LLVM backend generate incorrect code when structs are used. This patch
changes struct argument to be handled directly and without flattening,
which Clover (Mesa 3D Gallium OpenCL state tracker) will be able to
handle. Flattening would expand the struct to individual elements and
pass each as a separate argument, which Clover can not
handle. Furthermore, such expansion does not fit the OpenCL
programming model which requires to explicitely specify each argument
index, size and memory location.
Patch by Vedran Miletić
llvm-svn: 279463
We processed unnamed bitfields after our logic for non-vector field
elements in records larger than 128 bits. The vector logic would
determine that the bit-field disqualifies the record from occupying a
register despite the unnamed bit-field not participating in the record
size nor its alignment.
N.B. This behavior matches GCC and ICC.
llvm-svn: 278656
An __m512 vector type wrapped in a structure should be passed in a
vector register.
Our prior implementation was based on a draft version of the psABI.
This fixes PR28975.
N.B. The update to the ABI was made here:
https://github.com/hjl-tools/x86-psABI/commit/30f9c9
llvm-svn: 278655
Summary:
Based on a patch by Michael Mueller.
This attribute specifies that a function can be hooked or patched. This
mechanism was originally devised by Microsoft for hotpatching their
binaries (which they're constantly updating to stay ahead of crackers,
script kiddies, and other ne'er-do-wells on the Internet), but it's now
commonly abused by Windows programs that want to hook API functions. It
is for this reason that this attribute was added to GCC--hence the name,
`ms_hook_prologue`.
Depends on D19908.
Reviewers: rnk, aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D19909
llvm-svn: 278050
The size of image type is reported incorrectly as size of a pointer to address space 0, which causes error when casting image type to pointers by __builtin_astype.
The fix is to get image address space from TargetInfo then report the size accordingly.
Differential Revision: https://reviews.llvm.org/D22927
llvm-svn: 277647
Summary:
In RenderScript, the size of the argument or return value emitted in the
IR is expected to be the same as the size of corresponding qualified
type. For ARM and AArch64, the coercion performed by Clang can
change the parameter or return value to a type whose size is different
(usually larger) than the original aggregate type. Specifically, this
can happen in the following cases:
- Aggregate parameters of size <= 64 bytes and return values smaller
than 4 bytes on ARM
- Aggregate parameters and return values smaller than bytes on
AArch64
This patch coerces the cases above to an integer array that is the same
size and alignment as the original aggregate. A new field is added to
TargetInfo to detect a RenderScript target and limit this coercion just
to that case.
Tests added to test/CodeGen/renderscript.c
Reviewers: rsmith
Subscribers: aemerson, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D22822
llvm-svn: 276904
Allows AMDGCN target to generate images (such as %opencl.image2d_t) in constant address space.
Images will still be generated in global address space by default.
Added tests to existing opencl-types.cl in test\CodeGenOpenCL.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22523
llvm-svn: 276161
Added the opencl.ocl.version metadata to be emitted with amdgcn. Created a static function emitOCLVerMD which is shared between triple spir and target amdgcn.
Also added new testcases to existing test file, spir_version.cl inside test/CodeGenOpenCL.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22424
llvm-svn: 276010
Summary:
Summary:
Change Clang calling convention SpirKernel to OpenCLKernel.
Set calling convention OpenCLKernel for amdgcn as well.
Add virtual method .getOpenCLKernelCallingConv() to TargetCodeGenInfo
and use it to set target calling convention for AMDGPU and SPIR.
Update tests.
Reviewers: rsmith, tstellarAMD, Anastasia, yaxunl
Subscribers: kzhuravl, cfe-commits
Differential Revision: http://reviews.llvm.org/D21367
llvm-svn: 274220
smaller than register as argument in variadic functions on
big endian architectures.
Differential Revision: http://reviews.llvm.org/D21611
llvm-svn: 273665