\section{Conclusion}
\label{sec:Concl}
% Method
In this paper, we proposed an approach to automatically
detect duplicate pull-requests on GitHub.
Our method uses natural language text and diff information
to calculate the similarity between two pull-requests
and returns a candidate list of the pull-requests most similar to a given one.
% Results
We semi-automatically constructed a test dataset of duplicates
from three popular projects hosted on GitHub: Rails, Elasticsearch and Angular.JS.
The evaluation results show that
combining textual similarity and line-level diff similarity achieves the best performance,
finding about **\% - **\% of the duplicates,
compared to **\% - **\% using only natural language text
and **\% - **\% using only diff information.
% Future work
In the future, we plan to enrich our test dataset
and evaluate our method on datasets from more software projects.
In addition,
we plan to develop better techniques to improve
the detection effectiveness of our method.
\section*{Acknowledgment}
% We would like to thank the anonymous reviewers for their comments.
This research is supported by the National Natural Science Foundation of China
(Grant Nos. 61432020, 61303064, 61472430, 61502512)
and the National Grand R\&D Plan (Grant No. 2016YFB1000805).