\begin{abstract}
% Importance of pull-requests
The widespread use of pull-requests boosts the development
and evolution of many open source software projects.
% Duplicate pull-requests
However, due to the parallel and uncoordinated nature of the development process on GitHub,
duplicate pull-requests may be submitted by different contributors to solve the same problem.
% Harm caused by duplicate pull-requests
Duplicate pull-requests increase the maintenance cost of GitHub projects,
waste time on redundant code review,
and even frustrate developers' willingness to keep contributing.
% Our approach
In this paper,
we investigate using textual information
to automatically detect duplicate pull-requests on GitHub.
For a newly arrived pull-request,
we compute its textual similarity to the existing pull-requests,
and then return a candidate list of the most similar ones.
% Results
We evaluate our approach on three popular projects hosted on GitHub,
namely Rails, Elasticsearch and Angular.JS.
% TODO: mention how we constructed the test set
The evaluation shows that about 55.3\%--71.0\% of the duplicates can be found
when we use the combination of title similarity and description similarity.
\end{abstract}