Commit Graph

562 Commits

Author SHA1 Message Date
Bill Wendling f3baad3ee1 Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, and llvm_null are
now cerr, cout, and NullStream respectively.

llvm-svn: 32298
2006-12-07 01:30:32 +00:00
Reid Spencer 4ae56f3086 Update ConstantIntegral Max/Min tests for new interface.
llvm-svn: 32288
2006-12-06 20:39:57 +00:00
Chris Lattner 700b873130 Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner c209b584eb add an instcombine xform. This speeds up 462.libquantum from 9.78s to
7.48s.  This regression is due to unforeseen consequences of the cast patch.

llvm-svn: 32209
2006-12-05 01:26:29 +00:00
Reid Spencer 14fbdd5523 Update call to CastInst::getCastOpcode for its new signature.
llvm-svn: 32166
2006-12-04 02:48:01 +00:00
Chris Lattner 7a002fec1f disable transformations that are invalid for fp vectors. This fixes
Transforms/InstCombine/2006-12-01-BadFPVectorXform.ll

llvm-svn: 32112
2006-12-02 00:13:08 +00:00
Reid Spencer ad05ee9f39 Remove 4 FIXMEs to hack around cast-to-bool problems which no longer exist.
llvm-svn: 32051
2006-11-30 23:13:36 +00:00
Chris Lattner 960acb008b implement cast.ll:test35. With this, we recognize:
unsigned short swp(unsigned short a) {
       return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

as an idiom for bswap.

llvm-svn: 32011
2006-11-29 07:18:39 +00:00
Chris Lattner d747f015ff Teach instcombine to turn trunc(srl x, c) -> srl (trunc(x), c) when safe.
This implements InstCombine/cast.ll:test34.  It fires hundreds of times on
176.gcc.

llvm-svn: 32009
2006-11-29 07:04:07 +00:00
Chris Lattner a7942b7bbd Implement Regression/Transforms/InstCombine/bswap-fold.ll,
folding seteq(bswap(x), c) -> seteq(x, bswap(c)).

llvm-svn: 32006
2006-11-29 05:02:16 +00:00
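
A minimal C sketch of the source-level effect of this fold (the __builtin_bswap32 builtin and the constant are illustrative, not part of the commit): because byte swapping is invertible, the swap can move from the variable onto the constant, where it folds away at compile time.

#include <stdint.h>

/* Before the fold: the byte swap runs at run time on x. */
int eq_before(uint32_t x) {
        return __builtin_bswap32(x) == 0x11223344u;
}

/* After the fold: the swap lands on the constant and is evaluated at
 * compile time, leaving a plain compare against 0x44332211u. */
int eq_after(uint32_t x) {
        return x == __builtin_bswap32(0x11223344u);
}
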
Reid Spencer a736fdf216 Join a split line.
llvm-svn: 31996
2006-11-29 01:11:01 +00:00
Reid Spencer 116ad83aa0 Undo the last patch until 253.perlbmk passes with these changes.
llvm-svn: 31977
2006-11-28 20:23:51 +00:00
Reid Spencer 59fe2d89ae Remove 4 FIXME's from the CAST patch now that the back end is correctly
producing code for "trunc to bool". This passes all tests on Linux.

llvm-svn: 31963
2006-11-28 07:23:01 +00:00
Chris Lattner 8e9a7b73d9 Fix PR1014 and InstCombine/2006-11-27-XorBug.ll.
llvm-svn: 31941
2006-11-27 19:55:07 +00:00
Reid Spencer 6c38f0bb07 For PR950:
The long-awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr, which fails only due to a slight floating point output
difference.

llvm-svn: 31931
2006-11-27 01:05:10 +00:00
Bill Wendling 5dbf43c983 Removed #include <iostream> and replaced with llvm_* streams.
llvm-svn: 31923
2006-11-26 09:46:52 +00:00
Chris Lattner ec45a4c88c This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way.
llvm-svn: 31889
2006-11-21 17:05:13 +00:00
Chris Lattner e3a63d136d Fix a gcc 4.2 warning.
llvm-svn: 31751
2006-11-15 04:53:24 +00:00
Chris Lattner f05d69ae72 implement InstCombine/shift-simplify.ll by transforming:
(X >> Z) op (Y >> Z)  -> (X op Y) >> Z

for all shifts and all ops={and/or/xor}.

llvm-svn: 31729
2006-11-14 07:46:50 +00:00
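
A short C illustration of the identity behind this transform (function and variable names are invented): because and/or/xor act bitwise and both operands discard the same bit positions, the shift can be hoisted past the logical operation, saving one shift.

#include <stdint.h>

/* Before: two shifts feed the bitwise op. */
uint32_t shift_then_and(uint32_t x, uint32_t y, uint32_t z) {
        return (x >> z) & (y >> z);
}

/* After: the op happens first, then a single shift. */
uint32_t and_then_shift(uint32_t x, uint32_t y, uint32_t z) {
        return (x & y) >> z;
}
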
Chris Lattner d12a4bf799 implement InstCombine/and-compare.ll:test1. This compiles:
typedef struct { unsigned prefix : 4; unsigned code : 4; unsigned unsigned_p : 4; } tree_common;
int foo(tree_common *a, tree_common *b) { return a->code == b->code; }

into:

_foo:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl (%eax), %eax
        xorl (%ecx), %eax
        # TRUNCATE movb %al, %al
        shrb $4, %al
        testb %al, %al
        sete %al
        movzbl %al, %eax
        ret

instead of:

_foo:
        movl 8(%esp), %eax
        movb (%eax), %al
        shrb $4, %al
        movl 4(%esp), %ecx
        movb (%ecx), %cl
        shrb $4, %cl
        cmpb %al, %cl
        sete %al
        movzbl %al, %eax
        ret

saving one cycle by eliminating a shift.

llvm-svn: 31727
2006-11-14 06:06:06 +00:00
Chris Lattner d4dee405cb Fix InstCombine/2006-11-10-ashr-miscompile.ll, a miscompilation introduced
by the shr -> [al]shr patch.  This was reduced from 176.gcc.

llvm-svn: 31653
2006-11-10 23:38:52 +00:00
Chris Lattner 6e2c15c158 Teach ShrinkDemandedConstant how to handle X+C. This implements:
add.ll:test33, add.ll:test34, shift-sra.ll:test2

llvm-svn: 31586
2006-11-09 05:12:27 +00:00
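
A hedged C sketch of the kind of case this enables (the constants are invented for illustration): when only the low bits of an add are demanded, the high bits of the added constant cannot affect them, because carries in an addition only propagate upward, so the constant can be shrunk.

#include <stdint.h>

/* Only the low nibble of the sum is demanded, so the constant can be
 * shrunk from 0x100F to 0xF without changing the result. */
uint32_t wide_constant(uint32_t x)   { return (x + 0x100Fu) & 0xFu; }
uint32_t shrunk_constant(uint32_t x) { return (x + 0xFu) & 0xFu; }
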
Chris Lattner 4f218d56f5 reenable factoring of GEP expressions, being more precise about the
case where it is bad to do so.

llvm-svn: 31563
2006-11-08 19:42:28 +00:00
Chris Lattner cd62f11227 make this code more efficient by not creating a phi node we are just going to
delete in the first place.  This also makes it simpler.

llvm-svn: 31562
2006-11-08 19:29:23 +00:00
Chris Lattner a3acfca920 disable this factoring optzn for GEPs for now; it severely pessimizes some
loops.

llvm-svn: 31560
2006-11-08 18:49:31 +00:00
Reid Spencer fdff938a7e For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions no longer depend
on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
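
The distinction the two new instructions make explicit, sketched in C (values in the comments are invented; signed right shift is shown with its behavior on the usual two's-complement implementations): an arithmetic shift copies the sign bit into the vacated positions, a logical shift fills them with zeros.

#include <stdint.h>

/* AShr-like: the sign bit is replicated, so -8 >> 1 stays negative (-4). */
int32_t ashr_example(int32_t x) {
        return x >> 1;
}

/* LShr-like: zeros shift in, so 0xFFFFFFF8 >> 1 is 0x7FFFFFFC. */
uint32_t lshr_example(uint32_t x) {
        return x >> 1;
}
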
Andrew Lenharth 0ebb0b03e6 The wrong parameter was being tested to determine i32 vs i64.
llvm-svn: 31431
2006-11-03 22:45:50 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fallout by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Reid Spencer 7eb55b395f For PR950:
Replace the REM instruction with UREM, SREM and FREM.

llvm-svn: 31369
2006-11-02 01:53:59 +00:00
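
A brief C sketch of the three remainder flavours (example values invented): the old REM had to infer its behaviour from the operand types, while the split instructions state it directly.

#include <math.h>
#include <stdint.h>

int32_t  srem_example(void) { return -7 % 3; }          /* SREM: -1  */
uint32_t urem_example(void) { return 7u % 3u; }         /* UREM:  1  */
double   frem_example(void) { return fmod(7.5, 3.0); }  /* FREM:  1.5 */
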
Chris Lattner eebea43b48 Factor gep instructions through phi nodes.
llvm-svn: 31346
2006-11-01 07:43:41 +00:00
Chris Lattner 14f82c7dcd Turn a phi of many loads into a phi of the address and a single load of the
result.  This can significantly shrink code and exposes identities more
aggressively.

llvm-svn: 31344
2006-11-01 07:13:54 +00:00
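
Roughly the shape of the rewrite in C terms (names invented; a two-way conditional stands in for a phi with any number of predecessors): instead of loading in each predecessor and merging the loaded values, merge the addresses and load once at the join point.

/* Before: a load in each arm feeds the merge. */
int merged_loads(int cond, int *p, int *q) {
        return cond ? *p : *q;
}

/* After: merge the address, then a single load. */
int merged_address(int cond, int *p, int *q) {
        int *addr = cond ? p : q;
        return *addr;
}
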
Chris Lattner dc826fc068 Fix a bug in the previous patch
llvm-svn: 31342
2006-11-01 04:55:47 +00:00
Chris Lattner cadac0c5c3 Fold things like "phi [add(a,b), add(c,d)]" into two phis and one add.
This triggers thousands of times on multisource.

llvm-svn: 31341
2006-11-01 04:51:18 +00:00
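
The same fold expressed as a C rewrite (names invented): the add in each arm of the merge becomes a single add of two merged operands.

/* Before: one add per incoming path. */
int add_per_arm(int cond, int a, int b, int c, int d) {
        return cond ? (a + b) : (c + d);
}

/* After: two merges ("phis") and one add. */
int merged_then_add(int cond, int a, int b, int c, int d) {
        int lhs = cond ? a : c;
        int rhs = cond ? b : d;
        return lhs + rhs;
}
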
Reid Spencer 00c482b7a2 Simplify code a bit by changing instances of:
InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I)
into:
   InsertCastBefore(Val, ValTy, I)

llvm-svn: 31204
2006-10-26 19:19:06 +00:00
Reid Spencer 7e80b0b31e For PR950:
Make necessary changes to support DIV -> [SUF]Div. This changes LLVM to
have three division instructions: signed, unsigned, and floating point. The
bytecode and assembler are backwards compatible, however.

llvm-svn: 31195
2006-10-26 06:15:43 +00:00
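
A minimal C sketch of the three division behaviours the split makes explicit (values invented):

#include <stdint.h>

int32_t  sdiv_example(void) { return -7 / 2; }     /* SDiv: -3, truncating toward zero */
uint32_t udiv_example(void) { return 7u / 2u; }    /* UDiv:  3 */
double   fdiv_example(void) { return 7.0 / 2.0; }  /* FDiv:  3.5 */
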
Chris Lattner 5dee3b2526 Fix miscompilation of MallocBench/espresso which code review pointed out
but apparently didn't make it into the final patch.

llvm-svn: 31070
2006-10-20 18:20:21 +00:00
Reid Spencer e0fc4dfc22 For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Devang Patel 5d417e35bc While creating the mask, use 1ULL instead of 1.
llvm-svn: 31062
2006-10-20 01:16:56 +00:00
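
Why the suffix matters, in a short C sketch (the bit position is invented): with a plain 1 the shift is performed in 32-bit int, which is undefined for bit positions 31 and above, while 1ULL keeps the whole computation in 64 bits.

#include <stdint.h>

/* Wrong for bit >= 31: the 32-bit literal overflows before widening.
 *   uint64_t mask = 1 << bit;
 * Correct: the 64-bit literal makes the shift 64 bits wide. */
uint64_t make_mask(unsigned bit) {
        return 1ULL << bit;
}
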
Devang Patel 5d6df959e3 It is OK to remove the extra cast when the operation is EQ/NE, even though the
source and destination signs may not match, provided the other conditions are met.

llvm-svn: 31056
2006-10-19 20:59:13 +00:00
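
The underlying fact, sketched in C (types invented; this is not the exact condition the patch checks): equality and inequality compare only the bit pattern, so a truncating cast yields the same EQ/NE answer whether it is treated as signed or unsigned.

#include <stdint.h>

/* The two functions agree for every input on the usual two's-complement
 * implementations: only the retained low byte matters to ==, not how its
 * sign is interpreted. */
int eq_as_signed(int x, int y)   { return (int8_t)x == (int8_t)y; }
int eq_as_unsigned(int x, int y) { return (uint8_t)x == (uint8_t)y; }
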
Devang Patel 88afd00d1d Typo Typo.
llvm-svn: 31055
2006-10-19 19:21:36 +00:00
Devang Patel 472530d9fc Typo.
llvm-svn: 31054
2006-10-19 19:05:38 +00:00
Devang Patel b42aef4925 Fix bug in PR454 resolution. Added new test case.
This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin.

llvm-svn: 31053
2006-10-19 18:54:08 +00:00
Reid Spencer 3c514959dd Undo Chris' last patch; it caused a regression.
llvm-svn: 30991
2006-10-16 23:08:08 +00:00
Chris Lattner 9a1c7dd27a fix a buggy check that accidentally disabled this xform
llvm-svn: 30967
2006-10-15 22:42:15 +00:00
Chris Lattner 2deeaeaca7 add a new SimplifyDemandedVectorElts method, which works similarly to
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (for example):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret


This is just a first step; it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
2006-10-05 06:55:50 +00:00
Chris Lattner 7d19067c42 Fix a bug from r1.391 of this file, where we checked the size instead of
the alignment when promoting allocations.  This implements
InstCombine/cast.ll:test32

llvm-svn: 30682
2006-10-01 19:40:58 +00:00
Chris Lattner 6ab03f6a08 Eliminate ConstantBool::True and ConstantBool::False. Instead, provide
ConstantBool::getTrue() and ConstantBool::getFalse().

llvm-svn: 30665
2006-09-28 23:35:22 +00:00
Andrew Lenharth 44cb67af5c simplify
llvm-svn: 30535
2006-09-20 15:37:57 +00:00
Chris Lattner 380c7e9a59 We went through all that trouble to compute whether it was safe to transform
this comparison, but never checked it.  Whoops, no wonder we miscompiled
177.mesa!

llvm-svn: 30511
2006-09-20 04:44:59 +00:00
Evan Cheng cd3f6ff0e5 Back out Chris' last set of changes. This breaks 177.mesa and povray somehow.
llvm-svn: 30505
2006-09-20 01:39:40 +00:00