Correct the pseudo atomic instruction size for branch
relaxation and branch folding passes.
Inspired by D118175, D118009 and D117970.
Depends on D138481
Reviewed By: SixWeining, gonglingqin, xen0n
Differential Revision: https://reviews.llvm.org/D138469
D137316 implemented the intrinsic for the first crc check instruction
and the related diagnostics. This patch implements the intrinsics for
all remaining crc check instructions.
Differential Revision: https://reviews.llvm.org/D138418
This patch is required by OpenMP. With this patch applied, the OpenMP
regression tests pass. To keep the patch small enough to review,
atomicrmw min/max operations on LA32 will be added later.
Differential Revision: https://reviews.llvm.org/D138177
This patch also stops emitting a fence in atomic binary operations
when the AtomicOrdering is monotonic, and fixes loading from non-ptr
parameters.
Handling of the other AtomicOrdering levels will be added later.
Differential Revision: https://reviews.llvm.org/D138481
When reading or writing a register that does not conform to the size of a
hardware register, an error message is generated instead of a compiler crash.
Differential Revision: https://reviews.llvm.org/D138008
As discussed in D137541, this patch handles a __builtin_frame_address
depth greater than 0 instead of reporting an error.
Diagnosing unsafe calls is left to the '-Wframe-address' option.
Differential Revision: https://reviews.llvm.org/D138084
The `li.[wd]` pseudo instructions are used to load an immediate value
into a GPR. They are expanded directly during asm parsing. As a result,
only real MC instructions are emitted to the MCStreamer. The actual
expansion to real instructions is similar to the expansion performed by
GAS.
Note: `li.w` always treats the imm operand as a 32-bit signed value.
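A minimal sketch of the rule in the note above, assuming the operand is
truncated to its low 32 bits and then read as two's complement. The
helper name `li_w_value` is hypothetical and is not part of the patch;
how the assembler diagnoses out-of-range operands is not modeled here.

```python
def li_w_value(imm: int) -> int:
    """Read imm as a 32-bit signed value (illustrative helper only)."""
    low32 = imm & 0xFFFFFFFF  # keep only the low 32 bits
    # A set bit 31 means the value is negative in two's complement.
    return low32 - (1 << 32) if low32 & 0x80000000 else low32
```

For example, under this assumption `0xFFFFFFFF` and `-1` denote the same
value.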
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D138086
This patch makes `IAS` compatible with `GAS`. It accepts `la*` pseudo
instructions, and expands `la{,.local,.global}` into different
instructions according to different features.
```
Default:
la = la.global = la.got
la.local = la.pcrel
With feature "+la-global-with-pcrel":
la = la.global = la.pcrel
With feature "+la-global-with-abs":
la = la.global = la.abs
With feature "+la-local-with-abs":
la.local = la.abs
With features "+la-global-with-pcrel,+la-global-with-abs" (in either order):
la = la.global = la.pcrel
```
Note: To keep consistent with `GAS` behavior, the "la" can only have
one register operand.
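The mapping above can be sketched as a small lookup. This is only an
illustration of the table, not the parser code from the patch: the
function name `resolve_la` is hypothetical, and feature names are passed
without the leading "+".

```python
def resolve_la(mnemonic: str, features: frozenset = frozenset()) -> str:
    """Return the pseudo that `la`, `la.global` or `la.local` stands for."""
    if mnemonic in ("la", "la.global"):
        # pcrel is checked first, so it wins when both features are set.
        if "la-global-with-pcrel" in features:
            return "la.pcrel"
        if "la-global-with-abs" in features:
            return "la.abs"
        return "la.got"
    if mnemonic == "la.local":
        if "la-local-with-abs" in features:
            return "la.abs"
        return "la.pcrel"
    # Other la.* pseudos are not aliases and are returned unchanged.
    return mnemonic
```

Checking pcrel before abs is what makes the result independent of the
order in which the two global features are listed.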
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D138021
This patch adds tail call support to the LoongArch backend. When
appropriate, the `b` or `jr` instruction is used for tail calls (or the
`pcalau12i+jirl` instruction pair when the medium code model is used).
This patch also renames an inappropriately named operand:
simm26_bl -> simm26_symbol
This has been modeled after RISCV's tail call opt.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137889
When all registers have been allocated and a CFR needs to be saved on
the stack, an emergency spill slot is required, because spilling and
reloading a CFR requires a general purpose register as an intermediary.
The attached test case was bugpoint-reduced down from
`MultiSource/Benchmarks/mafft/Lalignmm.c` in the test-suite.
Without this patch, llc crashes and reports the following error:
```
LLVM ERROR: Error while trying to spill R4 from class GPR: Cannot scavenge register without an emergency spill slot!
```
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D138007
These instructions always output the canonical mnemonic. The GNU tools
emit the canonical mnemonic for the branch pseudo instructions as well
(e.g. "bgt" will be recognised by the assembler but never printed by
objdump).
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D138100
When expanding a PseudoCALL, the corresponding flags (e.g. nomerge)
need to be passed to the new instruction.
This patch also adds a test for the nomerge attribute.
The `nomerge` attribute was added during `LowerCall`, but was lost
when expanding PseudoCALL. This patch adds it back.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137888
When the target of an unconditional branch is out of range, an indirect
branch is used instead. If no register can be scavenged for the
indirect branch, a register has to be spilled to the stack.
Reviewed By: SixWeining, wangleiat
Differential Revision: https://reviews.llvm.org/D137821
On LoongArch, `CodeModel=Medium` only extends the PC-relative reach of
function calls, from 2^28 to 2^32.
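As a rough illustration of these two figures, the sketch below treats
the reach as a symmetric signed window around PC. That is a
simplification and an assumption: the exact reachable range of each
instruction sequence is not modeled. The names `CALL_REACH_BITS`,
`call_in_range` and the "small" key are hypothetical.

```python
# Width in bits of the PC-relative window, per the figures quoted above.
CALL_REACH_BITS = {"small": 28, "medium": 32}


def call_in_range(pc: int, target: int, code_model: str) -> bool:
    """Check whether target lies within the assumed signed window around pc."""
    bits = CALL_REACH_BITS[code_model]
    half = 1 << (bits - 1)  # half the window on each side of pc
    return -half <= target - pc < half
```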
Depends on D137393
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137394
This patch moves the expansion of the `PseudoCALL` instruction to the
`LoongArchPreRAExpandPseudo` pass. This allows it to be expanded into
different instruction sequences according to the CodeModel.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137393
When using the `llvm.returnaddress` intrinsic, special handling is
required for the spill of the `RA` register. Otherwise the verifier
fails in some cases (e.g. pr17377.c of the GCC C Torture Suite).
Specifically:
```
*** Bad machine code: Using an undefined physical register ***
- function: f
- basic block: %bb.0 entry (0xd94d18)
- instruction: ST_D killed $r1, $r22, -40 :: (store (s64) into %stack.2)
- operand 0: killed $r1
```
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137387
These hooks ensure that the LoongArch backend can serialize and parse
MIR correctly.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D137482
1, spill/reload
When a function call is made immediately after a floating point
comparison, the result of the comparison needs to be spilled before
function call and reloaded after the function returns.
2, copy
Support `GPR` to `CFR` and `CFR` to `GPR` copies. Therefore, the correct
register class can be used in the pattern template, and the hard-coded
mutual copying of `CFR` and `GPR` is eliminated, reducing redundant
comparison instructions.
Note: Since the `COPY` instruction between CFRs is not provided in
LoongArch, we only use `$fcc0` in the register allocation.
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D137004
When the branch target is out of the range that the current branch
instruction's immediate can represent, branch relaxation is required.
Branch instructions on LoongArch use three types of immediate: simm16,
simm21 and simm26. The real branch target address is
PC + sext(simmXX << 2). In addition, an indirect branch is implemented
to support more distant branch targets.
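The target formula and the range check that triggers relaxation can be
sketched as follows. The helper names (`sext`, `branch_target`,
`in_range`) are hypothetical and this is not the code of the
BranchRelaxation pass; it only evaluates PC + sext(simmXX << 2) for a
given immediate width.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc: int, simm: int, width: int) -> int:
    """Compute PC + sext(simm << 2) for a simm of `width` bits."""
    return pc + sext(simm << 2, width + 2)


def in_range(pc: int, target: int, width: int) -> bool:
    """True if target is encodable from pc with a simm of `width` bits."""
    offset = target - pc
    if offset % 4 != 0:  # the encoded offset is always a multiple of 4
        return False
    # The shifted immediate occupies width + 2 bits including the sign.
    return sext(offset, width + 2) == offset
```

With simm16, for instance, the largest forward offset that passes
`in_range` is 2^17 - 4 bytes; a target beyond that needs a wider
immediate or the indirect branch.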
The BranchRelaxation pass calls `RenumberBlocks` to renumber all of the
machine basic blocks in the function, so the machine basic block
numbers change in some test cases.
Differential Revision: https://reviews.llvm.org/D137233
This patch fixes codegen for `[su]itofp` instructions.
In LoongArch, a legal int-to-float conversion is done in two steps:
1. Move the data from `GPR` to `FPR`. (FRLen >= GRLen)
2. Conversion in `FPR`. (the data in `FPR` is treated as a signed value)
Given the above, when the type's BitWidth meets the requirements, all
`SINT_TO_FP` are legal, and all `UINT_TO_FP` are expanded and lowered
to libcalls when appropriate.
The only special case is LoongArch64 with the `+f,-d` features. There,
custom processing is required for `[SU]INT_TO_FP`. Alternatively, we
could skip the custom processing and use a libcall directly.
Differential Revision: https://reviews.llvm.org/D136916
The same error message is duplicated for three different conditions.
This patch adds condition-specific information to make each one more
informative.
Differential Revision: https://reviews.llvm.org/D136742
An emergency spill slot is created when the stack size cannot be
represented by an 11-bit signed number.
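The condition above amounts to a signed-range check. A minimal sketch,
with hypothetical helper names (`is_int` only mirrors the idea of a
signed N-bit fit test and is not the backend code):

```python
def is_int(value: int, bits: int) -> bool:
    """True if value fits in a signed two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def needs_emergency_spill_slot(stack_size: int) -> bool:
    """Apply the rule above: a slot is needed when 11 signed bits are not enough."""
    return not is_int(stack_size, 11)
```

An 11-bit signed field covers -1024 to 1023, so under this rule a stack
size of 1024 bytes or more gets a slot.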
This patch also modifies how `sp` is adjusted in the prologue.
`RegScavenger` will place the spill instruction before the prologue
if a VReg is created in the prologue. This will pollute the caller's
stack data. Therefore, until there is a better way, we just use the
`addi.w/d` instruction for stack adjustment to ensure that no VReg is
created. (RISCV has the same issue #58286)
Due to the addition of the emergency spill slot, some test cases that
rely on an exact stack size need to be updated.
Differential Revision: https://reviews.llvm.org/D135757
This patch splits the SP adjustment to reduce the number of
instructions in the prologue and epilogue. This way, the offset of each
callee-saved register fits in a single store.
Similar to D68011 (RISCV).
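A minimal sketch of the idea, modeled on the RISCV scheme referenced
above rather than taken from this patch. It assumes a 12-bit signed
store/add immediate and a 16-byte stack alignment; the function name
`split_sp_adjust` and its parameters are hypothetical.

```python
def split_sp_adjust(stack_size: int, has_callee_saves: bool,
                    stack_align: int = 16) -> list:
    """Return the SP decrements to apply in the prologue, in order."""
    fits_imm12 = -2048 <= stack_size < 2048  # assumed 12-bit signed immediate
    if not has_callee_saves or fits_imm12:
        return [stack_size]  # one adjustment is enough
    # First step: the largest aligned amount whose offsets still fit the
    # immediate, so callee-saved registers can each use a single store.
    first = 2048 - stack_align
    return [first, stack_size - first]
```

For a 4096-byte frame with callee-saved registers, this yields a first
adjustment of 2032 bytes followed by the remaining 2064 bytes.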
Differential Revision: https://reviews.llvm.org/D136222
Change the ParserMethod of the `simm26_b` operand type to
`parseImmediate`. Previously, the `simm26_b` operand type used the same
ParserMethod as `simm26_bl`. When the internal assembler processed a
blockaddress used in `asm`, the call to the `getOrCreateSymbol()`
interface generated the wrong blockaddress symbol.
Differential Revision: https://reviews.llvm.org/D136073