Mixing LLVM and Clang address spaces can result in subtle bugs, and there
is no need for this hook to use the LLVM IR level address spaces.
Most of this change is just replacing zero with LangAS::Default,
but it also allows us to remove a few calls to getTargetAddressSpace().
This also removes a stale comment+workaround in
CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does
return the expected size for ReferenceType (and handles address spaces).
Differential Revision: https://reviews.llvm.org/D138295
A scalar which exceeds 4 bytes should be returned via a stack slot,
on an AVRTiny device.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D138125
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
short will be promoted to int in UsualUnaryConversions.
Disable it for HLSL to keep int16_t as 16bit.
Reviewed By: aaron.ballman, rjmccall
Differential Revision: https://reviews.llvm.org/D133668
PPC64 ABI pass aggregates smaller than a register into the least
significant bits of the register. In the case of variadic functions,
they will end up right-aligned in their argument slots in the argument
area on big-endian targets. Apply right-alignment for these aggregates.
Fixes#55900.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D133338
The `cl-uniform-work-group` attribute asserts that the global work-size
be a multiple of the work-group specified work group size. This should
allow optimizations. It is already present by default in the AMD
compiler and for HIP kernels so it should be safe to allow this for
OpenMP kernels by default.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135374
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped, this patch attempts to bring it line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
atributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enabled/disables the specific feature, for
backward compatibility with previous releases.
To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.
Differential Revision: https://reviews.llvm.org/D133848
When passing arguments with `__fastcall` or `__vectorcall` in 32-bit MSVC, the following arguments have chance to be passed by register if the current one failed. `__regcall` from ICC is on the contrary: https://godbolt.org/z/4MPbzhaMG
All the three calling conversions are not supported in GCC.
Fixes: #57737
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D133920
Reuse most of RISCV's implementation with several exceptions:
1. Assign signext/zeroext attribute to args passed in stack.
On RISCV, integer scalars passed in registers have signext/zeroext
when promoted, but are anyext if passed on the stack. This is defined
in early RISCV ABI specification. But after this change [1], integers
should also be signext/zeroext if passed on the stack. So I think
RISCV's ABI lowering should be updated [2].
While in LoongArch ABI spec, we can see that integer scalars narrower
than GRLEN bits are zero/sign-extended no matter passed in registers
or on the stack.
2. Zero-width bit fields are ignored.
This matches GCC's behavior but it hasn't been documented in ABI sepc.
See https://gcc.gnu.org/r12-8294.
3. `char` is signed by default.
There is another difference worth mentioning is that `char` is signed
by default on LoongArch while it is unsigned on RISCV.
This patch also adds `_BitInt` type support to LoongArch and handle it
in LoongArchABIInfo::classifyArgumentType.
[1] cec39a064e
[2] https://github.com/llvm/llvm-project/issues/57261
Differential Revision: https://reviews.llvm.org/D132285
This patch replaces clamp idioms with std::clamp where the range is
obviously valid from the source code (that is, low <= high) to avoid
introducing undefined behavior.
The hard float ABIs have a rule that if a flattened struct contains
either a single fp value, or an int+fp, or fp+fp then it may be passed
in a pair of registers (if sufficient GPRs+FPRs are available).
detectFPCCEligibleStruct and the helper it calls,
detectFPCCEligibleStructHelper examine the type of the argument/return
value to determine if it complies with the requirements for this ABI
rule.
As reported in bug #57084, this logic produces incorrect results for C++
structs that inherit from other structs. This is because only the fields
of the struct were examined, but enumerating RD->fields misses any
fields in inherited C++ structs. This patch corrects that issue by
adding appropriate logic to enumerate any included base structs.
Differential Revision: https://reviews.llvm.org/D131677
Currently, record arguments are always passed by reference by allocating
space for record values in the caller. This is less efficient for
small records which may take one or two registers. For example,
for x86_64 and aarch64, for a record size up to 16 bytes, the record
values can be passed by values directly on the registers.
This patch added BPF support of record argument with direct values
for up to 16 byte record size. If record size is 0, that record
will not take any register, which is the same behavior for x86_64
and aarch64. If the record size is greater than 16 bytes, the
record argument will be passed by reference.
Differential Revision: https://reviews.llvm.org/D132144
Currently bpf supports calling kernel functions (x86_64, arm64, etc.)
in bpf programs. Tejun discovered a problem where the x86_64 func
return value (a unsigned char type) is stored in 8-bit subregister %al
and the other 56-bits in %rax might be garbage. But based on current
bpf ABI, the bpf program assumes the whole %rax holds the correct value
as the callee is supposed to do necessary sign/zero extension.
This mismatch between bpf and x86_64 caused the incorrect results.
To resolve this problem, this patch forced caller to do needed
sign/zero extension for 8/16-bit return values as well. Note that
32-bit return values already had sign/zero extension even without
this patch.
For example, for the test case attached to this patch:
$ cat t.c
_Bool bar_bool(void);
unsigned char bar_char(void);
short bar_short(void);
int bar_int(void);
int foo_bool(void) {
if (bar_bool() != 1) return 0; else return 1;
}
int foo_char(void) {
if (bar_char() != 10) return 0; else return 1;
}
int foo_short(void) {
if (bar_short() != 10) return 0; else return 1;
}
int foo_int(void) {
if (bar_int() != 10) return 0; else return 1;
}
Without this patch, generated call insns in IR looks like:
%call = call zeroext i1 @bar_bool()
%call = call zeroext i8 @bar_char()
%call = call signext i16 @bar_short()
%call = call i32 @bar_int()
So it is assumed that zero extension has been done for return values of
bar_bool()and bar_char(). Sign extension has been done for the return
value of bar_short(). The return value of bar_int() does not have any
assumption so caller needs to do necessary shifting to get correct
32bit values.
With this patch, generated call insns in IR looks like:
%call = call i1 @bar_bool()
%call = call i8 @bar_char()
%call = call i16 @bar_short()
%call = call i32 @bar_int()
There are no assumptions for return values of the above four function calls,
so necessary shifting is necessary for all of them.
The following is the objdump file difference for function foo_char().
Without this patch:
0000000000000010 <foo_char>:
2: 85 10 00 00 ff ff ff ff call -1
3: bf 01 00 00 00 00 00 00 r1 = r0
4: b7 00 00 00 01 00 00 00 r0 = 1
5: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
6: b7 00 00 00 00 00 00 00 r0 = 0
0000000000000038 <LBB1_2>:
7: 95 00 00 00 00 00 00 00 exit
With this patch:
0000000000000018 <foo_char>:
3: 85 10 00 00 ff ff ff ff call -1
4: bf 01 00 00 00 00 00 00 r1 = r0
5: 57 01 00 00 ff 00 00 00 r1 &= 255
6: b7 00 00 00 01 00 00 00 r0 = 1
7: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
8: b7 00 00 00 00 00 00 00 r0 = 0
0000000000000048 <LBB1_2>:
9: 95 00 00 00 00 00 00 00 exit
The zero extension of the return 'char' value is done here.
Differential Revision: https://reviews.llvm.org/D131598
Swift calling conventions stands out in the way that they are lowered in
mostly target-independent manner, with very few customization points.
As such, swift-related methods of ABIInfo do not reference the rest of
ABIInfo and vice versa.
This change follows interface segregation principle; it removes
dependency of SwiftABIInfo on ABIInfo. Targets must now implement
SwiftABIInfo separately if they support Swift calling conventions.
Almost all targets implemented `shouldPassIndirectly` the same way. This
de-facto default implementation has been moved into the base class.
`isSwiftErrorInRegister` used to be virtual, now it is not. It didn't
accept any arguments which could have an effect on the returned value.
This is now a static property of the target ABI.
Reviewed By: rusyaev-roman, inclyc
Differential Revision: https://reviews.llvm.org/D130394
llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). However llvm::sort is not usable everywhere, for
example, in compiler-rt.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D130406
By both AAPCS32 and AAPCS64, the test for whether an aggregate
qualifies as homogeneous (either HFA or HVA) is based on the data
layout alone. So any logical member of the structure that does not
affect the data layout also should not affect homogeneity. In
particular, an empty bitfield ('int : 0') should make no difference.
In fact, clang considered it to make a difference in C but not in C++,
and justified that policy as compatible with gcc. But that's
considered a bug in gcc as well (at least for Arm targets), and it's
fixed in gcc 12.1.
This fix mimics gcc's: zero-sized bitfields are now ignored in all
languages for the Arm (32- and 64-bit) ABIs. But I've left the
previous behaviour unchanged in other ABIs, by means of adding an
ABIInfo::isZeroLengthBitfieldPermittedInHomogeneousAggregate query
method which the Arm subclasses override.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D127197
A struct like { float a; int :0; } should per the SystemZ ABI be passed in a
GPR, but to match a bug in GCC it has been passed in an FPR (see 759449c).
GCC has now corrected the C++ ABI for this case, and this patch for clang
follows suit.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122388
Currently, the regcall calling conversion in Clang doesn't match with
ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7
This patch tries to fix the problem to match with ICC.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D122104
(resubmit https://reviews.llvm.org/D119207 after fixing the test for
some build settings)
This patch converts CUDA pointer kernel arguments with default address
space to CrossWorkGroup address space (__global in OpenCL). This is
because Generic or Function (OpenCL's private) is not supported as
storage class for kernel pointer types.
Differential revision: https://reviews.llvm.org/D120366
This patch converts CUDA pointer kernel arguments with default address space to
CrossWorkGroup address space (__global in OpenCL). This is because Generic or
Function (OpenCL's private) is not supported as storage class for kernel pointer types.
Differential Revision: https://reviews.llvm.org/D119207
To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().
While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.
Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to an assertion failures with different types.
This patch makes sure we use memory representation for `va_arg`.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D118904
While investigating the failures of `symbolize_pc.cpp` and
`symbolize_pc_inline.cpp` on SPARC (both Solaris and Linux), I noticed that
`__builtin_extract_return_addr` is a no-op in `clang` on all targets, while
`gcc` has non-default implementations for arm, mips, s390, and sparc.
This patch provides the SPARC implementation. For background see
`SparcISelLowering.cpp` (`SparcTargetLowering::LowerReturn_32`), the SPARC
psABI p.3-12, `%i7` and p.3-16/17, and SCD 2.4.1, p.3P-10, `%i7` and
p.3P-15.
Tested (after enabling the `sanitizer_common` tests on SPARC) on
`sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D91607
AVR is baremetal environment, so the avr-libc does not support
'__cxa_atexit()'.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D118445
Branch protection in M-class is supported by
- Armv8.1-M.Main
- Armv8-M.Main
- Armv7-M
Attempting to enable this for other architectures, either by
command-line (e.g -mbranch-protection=bti) or by target attribute
in source code (e.g. __attribute__((target("branch-protection=..."))) )
will generate a warning.
In both cases function attributes related to branch protection will not
be emitted. Regardless of the warning, module level attributes related to
branch protection will be emitted when it is enabled via the command-line.
The following people also contributed to this patch:
- Victor Campos
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D115501
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
Since 2959e082e1, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
All kernels can be called from the host as per the SPIR_KERNEL calling
convention. As such, all kernels should have external linkage, but
block enqueue kernels were created with internal linkage.
Reported-by: Pedro Olsen Ferreira
Differential Revision: https://reviews.llvm.org/D115523