Commit Graph

54 Commits

Author SHA1 Message Date
Matt Arsenault 09d38dd770 AMDGPU: Fix assert when trying to overextend liverange
This was trying to add segments beyond the new and use,
so skip additional segments.

This would hit (S < E && "Cannot create empty or backwards segment").
2022-11-04 15:14:43 -07:00
Carl Ritson 547e3cba7d [AMDGPU] Improve liveness copying in si-optimize-exec-masking-pre-ra
Further improve liveness copying for CC register post optimization
by mirroring live internal splits.
The fixes a bug in register allocation when CC register liveness
is extended across a branches instead of split.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129557
2022-07-17 17:34:05 +09:00
Carl Ritson 8bc5e7ac51 [AMDGPU] Additional liveness tests for si-optimize-exec-masking-pre-ra
Merge tests and fixes from D128110 and D128315 on top of already
committed D128800.

Original author: arsenm

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D128882
2022-07-06 15:05:32 +09:00
Carl Ritson d0f6641615 [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
Follow up to D127894, new liveness update code needs to handle
the case where S_ANDN2 input must be extended through loops when
V_CNDMASK_B32 has been hoisted.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D128800
2022-06-30 15:26:50 +09:00
Matt Arsenault b03d902b61 AMDGPU: Fix invalid liveness after si-optimize-exec-masking-pre-ra
This was leaving behind a use at the deleted instruction which the
verifier would fail during allocation.
2022-06-22 20:49:03 -04:00
Sebastian Neubauer 6527b2a4d5 [AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235
2022-02-18 15:05:21 +01:00
alex-t dccf83acf9 [AMDGPU] SIOptimizeExecMaskingPreRA should check constant bus constraint when folds EXEC copy
Folding EXEC copy into it's single use may lead to constant bus constraint violation as it adds one more SGPR operand.
         This change makes it validate the user instruction with the new SGPR operand and only fold it if it is legal.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D98888
2021-03-24 14:14:13 +03:00
dfukalov 560d7e0411 [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
2021-01-20 22:22:45 +03:00
dfukalov 6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Mircea Trofin 5dc47541f9 [NFC] Use Register/MCRegister
Differential Revision: https://reviews.llvm.org/D90724
2020-11-04 12:20:17 -08:00
Carl Ritson be2afbd019 [AMDGPU] Remove fix up operand from SI_ELSE
Remove immediate operand from SI_ELSE which indicates if EXEC has
been modified.  Instead always emit code that handles EXEC and
remove unnecessary instructions during pre-RA optimisation.

This facilitates passes (i.e. SIWholeQuadMode) adding exec mask
manipulation post control flow lowering, and pre control flow
lower passes do not need to be aware of SI_ELSE handling.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D89644
2020-10-20 19:15:21 +09:00
Carl Ritson 6aabbeadae [AMDGPU][NFC] Tidy SIOptimizeExecMaskingPreRA for extensibility
Remove duplicate code and move things around to make it easier to
add additional optimisations to the pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D89619
2020-10-20 17:22:43 +09:00
Jay Foad 3497860203 [AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
2020-08-20 17:59:11 +01:00
Matt Arsenault e9b236f411 AMDGPU: Check for other defs when folding conditions into s_andn2_b64
We can't fold the masked compare value through the select if the
select condition is re-defed after the and instruction. Fixes a
verifier error and trying to use the outgoing value defined in the
block.

I'm not sure why this pass is bothering to handle physregs. It's
making this more complex and forces extra liveness computation.
2020-07-28 16:36:23 -04:00
Carl Ritson 74c14202d9 [AMDGPU] Propagate dead flag during pre-RA exec mask optimizations
Preserve SCC dead flags in SIOptimizeExecMaskingPreRA.
This helps with removing redundant s_andn2 instructions later.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D83637
2020-07-14 12:53:43 +09:00
Austin Kerbow eab9a4f119 [AMDGPU] Don't assert on partial exec copy
After Machine CSE and coalescing we can end up with copies of exec to
subregister SGPRs.

Differential Revision: https://reviews.llvm.org/D77992
2020-04-12 21:14:36 -07:00
Stanislav Mekhanoshin a73528649c [AMDGPU] Simplify exec copies
The patch removes late endcf handling and only leaves the
related portion with redundant exec mask copy elimination.

Differential Revision: https://reviews.llvm.org/D76095
2020-03-12 14:54:19 -07:00
Stanislav Mekhanoshin 9801e5469b [AMDGPU] Disable nested endcf collapse
The assumption is that conditional regions are perfectly nested
and a mask restored at the exit from the inner block will be
completely covered by a mask restored in the outer.

It turns out with our current structurizer this is not always
the case.

Disable the optimization for now, but I want to keep it around
for a while to either try after further structurizer changes or
to move it into control flow lowering where we have more info
and reuse the test.

Differential Revision: https://reviews.llvm.org/D75958
2020-03-11 11:24:20 -07:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Nicolai Haehnle df6e67697b AMDGPU: Propagate undef flag during pre-RA exec mask optimizations
Summary: Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68184

llvm-svn: 374041
2019-10-08 12:46:32 +00:00
Matt Arsenault 4b7fc85c0b Revert "AMDGPU: Fix iterator error when lowering SI_END_CF"
This reverts r367500 and r369203. This is causing various test
failures.

llvm-svn: 369417
2019-08-20 17:45:25 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Daniel Sanders 2bea69bf65 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
2019-08-01 23:27:28 +00:00
Matt Arsenault d48324ff6f Reapply "AMDGPU: Split block for si_end_cf"
This reverts commit r359363, reapplying r357634

llvm-svn: 367500
2019-08-01 01:25:27 +00:00
Stanislav Mekhanoshin 5250021672 [AMDGPU] gfx10 conditional registers handling
This is cpp source part of wave32 support, excluding overriden
getRegClass().

Differential Revision: https://reviews.llvm.org/D63351

llvm-svn: 363513
2019-06-16 17:13:09 +00:00
Mark Searles 76c5b62988 Revert "AMDGPU: Split block for si_end_cf"
This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.

We discovered some internal test failures, so reverting for now.

Differential Revision: https://reviews.llvm.org/D61213

llvm-svn: 359363
2019-04-27 00:51:18 +00:00
Stanislav Mekhanoshin c464dddccb [AMDGPU] Fixed addReg() in SIOptimizeExecMaskingPreRA.cpp
The second argument is flags, not subreg.

Differential Revision: https://reviews.llvm.org/D61031

llvm-svn: 359017
2019-04-23 17:59:26 +00:00
Matt Arsenault 70346d127b AMDGPU: Fix not checking for copy when looking at copy src
Effectively reverts r356956. The check for isFullCopy was excessive,
but there still needs to be a check that this is a copy.

llvm-svn: 358890
2019-04-22 14:54:39 +00:00
Tim Renouf 842be38162 [AMDGPU] Fixed incorrect test in vcnd/vcmp optimization
This fixes a test I introduced in change D59191 (that added src0 and
src1 modifiers to the v_cndmask instruction for disassembly purposes).

Spotted by David Binderman in bug 41488.

Differential Revision: https://reviews.llvm.org/D60652

Change-Id: I6ac95e66cd84e812ed3359ad57bcd0e13198ba0c
llvm-svn: 358392
2019-04-15 10:36:24 +00:00
Matt Arsenault 396653f8a1 AMDGPU: Split block for si_end_cf
Relying on no spill or other code being inserted before this was
precarious. It relied on code diligently checking isBasicBlockPrologue
which is likely to be forgotten.

Ideally this could be done earlier, but this doesn't work because of
phis. Any other instruction can't be placed before them, so we have to
accept the position being incorrect during SSA.

This avoids regressions in the fast register allocator rewrite from
inverting the direction.

llvm-svn: 357634
2019-04-03 20:53:20 +00:00
Matt Arsenault a353fd572a AMDGPU: Make exec mask optimzations more resistant to block splits
Also improve the check for SALU instructions to also ignore
implicit_def and other fake instructions.

llvm-svn: 357170
2019-03-28 14:01:39 +00:00
Matt Arsenault 4ab28b64b4 AMDGPU: Skip debug_instr when collapsing end_cf
Based on how these are inserted, I doubt this was causing a problem in
practice.

llvm-svn: 357090
2019-03-27 16:58:27 +00:00
Matt Arsenault 77bf2e3704 AMDGPU: Remove unnecessary check for isFullCopy
Subregister indexes are not used for physical register operands, so
isFullCopy is implied by the physical register check.

llvm-svn: 356956
2019-03-25 21:28:53 +00:00
Tim Renouf 2e94f6e584 [AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers
This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.

This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.

To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.

Differential Revision: https://reviews.llvm.org/D59191

Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
llvm-svn: 356398
2019-03-18 19:25:39 +00:00
David Stuttard 20ea21c6ed [AMDGPU] Add support for immediate operand for S_ENDPGM
Summary:
Add support for immediate operand in S_ENDPGM

Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6

Reviewers: alexshap

Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59213

llvm-svn: 355902
2019-03-12 09:52:58 +00:00
Matt Arsenault 476e26b5d3 AMDGPU: Use removeAllRegUnitsForPhysReg
llvm-svn: 354686
2019-02-22 19:03:36 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Stanislav Mekhanoshin d933c2ced7 [AMDGPU] Fix build failure, second attempt
Some compilers complain that variable is captured and some
complain when it is not. Switch to [&].

llvm-svn: 349006
2018-12-13 05:52:11 +00:00
Stanislav Mekhanoshin 5225746e03 [AMDGPU] Fix build failure
Fixed error 'lambda capture 'CondReg' is not required to be captured
for this use'.

llvm-svn: 349005
2018-12-13 05:21:25 +00:00
Stanislav Mekhanoshin 6071e1aa58 [AMDGPU] Simplify negated condition
Optimize sequence:

  %sel = V_CNDMASK_B32_e64 0, 1, %cc
  %cmp = V_CMP_NE_U32 1, %1
  $vcc = S_AND_B64 $exec, %cmp
  S_CBRANCH_VCC[N]Z
=>
  $vcc = S_ANDN2_B64 $exec, %cc
  S_CBRANCH_VCC[N]Z

It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the
rebuildSetCC().

Differential Revision: https://reviews.llvm.org/D55402

llvm-svn: 349003
2018-12-13 03:17:40 +00:00
Matt Arsenault 755f41f3a2 AMDGPU: Don't delete instructions if S_ENDPGM has implicit uses
This can leave behind the uses with the defs removed.
Since this should only really happen in tests, it's not worth the
effort of trying to handle this.

llvm-svn: 340866
2018-08-28 18:55:55 +00:00
Tom Stellard 5bfbae5cb1 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

llvm-svn: 336851
2018-07-11 20:59:01 +00:00
Tom Stellard 44b30b4537 AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930
2018-05-22 02:03:23 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Shiva Chen 801bf7ebbe [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

llvm-svn: 331844
2018-05-09 02:42:00 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Matthias Braun f842297d50 Rename LiveIntervalAnalysis.h to LiveIntervals.h
Headers/Implementation files should be named after the class they
declare/define.

Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`

llvm-svn: 320546
2017-12-13 02:51:04 +00:00
Matt Arsenault 7f0a527300 AMDGPU: Fix infinite loop with dbg_value
Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.

llvm-svn: 319808
2017-12-05 18:23:17 +00:00
Matt Arsenault 2f4df7ec41 AMDGPU: Recompute scc liveness
The various scalar bit operations set SCC,
so one is erased or moved it needs to be recomputed.
Not sure why the existing tests don't fail on this.

llvm-svn: 312819
2017-09-08 18:51:26 +00:00