Commit Graph

14 Commits

Author SHA1 Message Date
Raphael Isemann 89ef7d9570 [analyzer] Increase minimum complexity filter of the CloneChecker.
Summary:
So far we used a value of 10 which was useful for testing but produces many false-positives in real programs. The usual suspicious clones we find seem to be at around a complexity value of 70 and for normal clone-reporting everything above 50 seems to be a valid normal clone for users, so let's just go with 50 for now and set this as the new default value.

This patch also explicitly sets the complexity value for the regression tests as they serve more of a regression testing/debugging purpose and shouldn't really be reported by default in real programs. I'll add more tests that reflect actual found bugs that then need to pass with the default setting in the future.

Reviewers: NoQ

Subscribers: cfe-commits, javed.absar, xazax.hun, v.g.vassilev

Differential Revision: https://reviews.llvm.org/D34178

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@312468 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 05:56:36 +00:00
Raphael Isemann 9811ba37bd [analyzer] Performance optimizations for the CloneChecker
Summary:
This patch  aims at optimizing the CloneChecker for larger programs. Before this
patch we took around 102 seconds to analyze sqlite3 with a complexity value of
50. After this patch we now take 2.1 seconds to analyze sqlite3.

The biggest performance optimization is that we now put the constraint for group
size before the constraint for the complexity. The group size constraint is much
faster in comparison to the complexity constraint as it only does a simple
integer comparison. The complexity constraint on the other hand actually
traverses each Stmt and even checks the macro stack, so it is obviously not able
to handle larger amounts of incoming clones. The new order filters out all the
single-clone groups that the type II constraint generates in a faster way before
passing the fewer remaining clones to the complexity constraint. This reduced
runtime by around 95%.

The other change is that we also delay the verification part of the type II
clones back in the chain of constraints. This required to split up the
constraint into two parts - a verification and a hash constraint (which is also
making it more similar to the original design of the clone detection algorithm).
The reasoning for this is the same as before: The verification constraint has to
traverse many statements and shouldn't be at the start of the constraint chain.
However, as the type II hashing has to be the first step in our algorithm, we
have no other choice but split this constrain into two different ones. Now our
group size and complexity constrains filter out a chunk of the clones before
they reach the slow verification step, which reduces the runtime by around 8%.

I also kept the full type II constraint around - that now just calls it's two
sub-constraints - in case someone doesn't care about the performance benefits
of doing this.

Reviewers: NoQ

Reviewed By: NoQ

Subscribers: klimek, v.g.vassilev, xazax.hun, cfe-commits

Differential Revision: https://reviews.llvm.org/D34182

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@312222 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 07:10:46 +00:00
Leslie Zhai a1efca534d [analyzer] Teach CloneDetection about Qt Meta-Object Compiler to filter auto generated files
Reviewers: v.g.vassilev, teemperor

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D34353


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@305774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 06:44:46 +00:00
Leslie Zhai 52b6305306 [analyzer] Teach CloneDetection about Qt Meta-Object Compiler
Reviewers: v.g.vassilev, zaks.anna, NoQ, teemperor

Reviewed By: v.g.vassilev, zaks.anna, NoQ, teemperor

Differential Revision: https://reviews.llvm.org/D31320


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@305659 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-19 01:55:50 +00:00
Artem Dergachev 59d4b1de1a [analyzer] Reland r299544 "Add a modular constraint system to the CloneDetector"
Hopefully fix crashes by unshadowing the variable.


Original commit message:

A big part of the clone detection code is functionality for filtering clones and
clone groups based on different criteria. So far this filtering process was
hardcoded into the CloneDetector class, which made it hard to understand and,
ultimately, to extend.

This patch splits the CloneDetector's logic into a sequence of reusable
constraints that are used for filtering clone groups. These constraints
can be turned on and off and reodreder at will, and new constraints are easy
to implement if necessary.

Unit tests are added for the new constraint interface.

This is a refactoring patch - no functional change intended.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23418


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@299653 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 14:34:07 +00:00
Artem Dergachev 727ea63e6e Revert "[analyzer] Add a modular constraint system to the CloneDetector"
This reverts commit r299544.

Crashes on tests on some buildbots.


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@299550 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 15:06:17 +00:00
Artem Dergachev a61d7c54b0 [analyzer] Add a modular constraint system to the CloneDetector
A big part of the clone detection code is functionality for filtering clones and
clone groups based on different criteria. So far this filtering process was
hardcoded into the CloneDetector class, which made it hard to understand and,
ultimately, to extend.

This patch splits the CloneDetector's logic into a sequence of reusable
constraints that are used for filtering clone groups. These constraints
can be turned on and off and reodreder at will, and new constraints are easy
to implement if necessary.

Unit tests are added for the new constraint interface.

This is a refactoring patch - no functional change intended.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23418


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@299544 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 14:17:36 +00:00
Artem Dergachev b1769b767f [analyzer] Re-apply r283094 "Improve CloneChecker diagnostics"
The parent commit (r283092) was reverted before and now finally landed.


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283661 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-08 10:54:30 +00:00
Vitaly Buka b01e9e0290 Revert "[analyzer] Improve CloneChecker diagnostics" as its depends on reverted r283092
This reverts commit r283094.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283182 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-04 02:40:35 +00:00
Artem Dergachev 071811a24c [analyzer] Improve CloneChecker diagnostics
Highlight code clones referenced by the warning message with the help of
the extra notes feature recently introduced in r283092.

Change warning text to more clang-ish. Remove suggestions from the copy-paste
error checker diagnostics, because currently our suggestions are strictly 50%
wrong (we do not know which of the two code clones contains the error), and
for that reason we should not sound as if we're actually suggesting this.
Hopefully a better solution would bring them back.

Make sure the suspicious clone pair structure always mentions
the correct variable for the second clone.

Differential Revision: https://reviews.llvm.org/D24916


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@283094 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 08:11:50 +00:00
Artem Dergachev 656204ffb4 [analyzer] Use faster hashing (MD5) in CloneDetector.
This replaces the old approach of fingerprinting every AST node into a string,
which avoided collisions and was simple to implement, but turned out to be
extremely ineffective with respect to both performance and memory.

The collisions are now dealt with in a separate pass, which no longer causes
performance problems because collisions are rare.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D22515


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@279378 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 17:35:53 +00:00
Artem Dergachev ac4644401f [analyzer] Teach CloneDetector to find clones that look like copy-paste errors.
The original clone checker tries to find copy-pasted code that is exactly
identical to the original code, up to minor details.

As an example, if the copy-pasted code has all references to variable 'a'
replaced with references to variable 'b', it is still considered to be
an exact clone.

The new check finds copy-pasted code in which exactly one variable seems
out of place compared to the original code, which likely indicates
a copy-paste error (a variable was forgotten to be renamed in one place).

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23314


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@279056 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-18 12:29:41 +00:00
Artem Dergachev 9f96fceac1 [analyzer] Hotfix for build failure due to declaration shadowing in r276782.
CloneDetector member variable is shadowing the class with the same name,
which causes build failures on some platforms.


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@276791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 19:05:22 +00:00
Artem Dergachev 07b4e2f407 [analyzer] Add basic capabilities to detect source code clones.
This patch adds the CloneDetector class which allows searching source code
for clones.

For every statement or group of statements within a compound statement,
CloneDetector computes a hash value, and finds clones by detecting
identical hash values.

This initial patch only provides a simple hashing mechanism
that hashes the kind of each sub-statement.

This patch also adds CloneChecker - a simple static analyzer checker
that uses CloneDetector to report copy-pasted code.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D20795


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@276782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 18:13:12 +00:00