some improvement according to review comments
parent
dd5849d702
commit
189e4e52be
10
1-intro.tex
|
@ -34,7 +34,7 @@ tend to lack information of others progress.
|
|||
% Harm: platform
|
||||
Duplicate PRs increase the maintenance cost of GitHub
|
||||
and result in the waste of time spent on
|
||||
the redundant effort of reviewing each of them separately.
|
||||
the redundant effort of reviewing each of them separately~\cite{Gousios:2014,Gousios:2016}.
|
||||
Moreover,
|
||||
contributors may iteratively update and improve their PRs
|
||||
in several rounds of code reviews~\cite{Yu:2015}
|
||||
|
@ -70,14 +70,18 @@ Each pair of duplicate PRs in \textit{DupPR}
|
|||
has been manually verified after automatic identification
|
||||
which guarantees the quality of this dataset.
|
||||
The dataset and the source code used to recreate it is available
|
||||
online~\footnote{\url{https://github.com/whystar/MSR2018-DupPR}}.
|
||||
online.~\footnote{\url{https://github.com/whystar/MSR2018-DupPR}}
|
||||
Based on this dataset,
|
||||
the following research would be more available.
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
% \item Analyzing how much redundant effort would be wasted by duplicate PRs.
|
||||
% This would give researchers a clear idea of the issues that duplicate PRs have introduced.
|
||||
\item Analyzing how much redundant effort would be wasted by duplicate PRs.
|
||||
This would give researchers a clear idea of the issues that duplicate PRs have introduced.
|
||||
This would give researchers a straightforward impression
|
||||
about how duplicate PRs negatively impact software development process.
|
||||
|
||||
|
||||
\item Investigating how reviewers make the choice between two duplicate PRs.
|
||||
This is necessary to build automatic tools to make more targeted comparison between duplicates
|
||||
|
|
|
@ -28,12 +28,12 @@ and the fields in these tables are defined as follows.
|
|||
|
||||
\begin{itemize}[leftmargin=0em,itemindent=2em]
|
||||
\item Table \texttt{Project} stores the basic information of studied projects.
|
||||
Field \texttt{user\_name} is the name of the user or the organization which owns the project in GitHub,
|
||||
Field \texttt{user\_name} is the name of the user owning the project in GitHub,
|
||||
and field \texttt{repo\_name} is the name of the project.
|
||||
These two fields, together with the host of GitHub (\ie ``https://github.com/''),
|
||||
These two fields, together with the domain name of GitHub,
|
||||
can be used to compose the resource locator of the project in GitHub.
|
||||
Other fields in table \texttt{Project} present some statistical characteristics of a project
|
||||
such as \texttt{fork\_count} (the number of forks) and \texttt{star\_count} (the number of stars).
|
||||
Other fields in table \texttt{Project} present some statistical characteristics of a project,
|
||||
for example \texttt{fork\_count} is the number of forks.
|
||||
|
||||
\item For each project,
|
||||
all the PRs belonged to it are stored in table \texttt{Pull-request}.
|
||||
|
|
19
4-applct.tex
|
@ -21,21 +21,12 @@ and the property \texttt{created\_at} of $idn\_cmt$ in table \texttt{Comment} is
|
|||
We calculated the detection latency of all the duplicates in our dataset
|
||||
and the statistic result is shown in Figure~\ref{fig:delay_time_bar}.
|
||||
The figure presents that
|
||||
1,474 (63\%) duplicates are identified less than one day,
|
||||
while 865 (37.0\%) duplicates are detected after longer latency which is more than one day.
|
||||
% \hl{This .....}
|
||||
% Find some notable data points and introduce them
|
||||
% Among them, 39 (1.7\%) duplicates are detected less than one minute,
|
||||
% 681 (29.1\%) between one minue and one hour,
|
||||
% and 754 (32.2\%) between one hour and one day.
|
||||
% [39, 681, 754, 379, 239, 231, 16]
|
||||
37.0\% (865) duplicates are detected after long latency which is more than one day.
|
||||
|
||||
% 1,474 (63\%) duplicates are identified less than one day,
|
||||
% while 865 (37.0\%) duplicates are detected after longer latency which is more than one day.
|
||||
% \hl{This .....}
|
||||
|
||||
% \begin{figure}[ht]
|
||||
% \centering
|
||||
% \includegraphics[width=0.5\textwidth]{figs/delay_time.png}
|
||||
% \caption{Distribution of detection latency of duplicate pull-requests}
|
||||
% \label{fig:delay_time}
|
||||
% \end{figure}
|
||||
|
||||
|
||||
\begin{figure}[ht]
|
||||
|
|
Binary file not shown.
28
main.tex
|
@ -1,4 +1,4 @@
|
|||
\documentclass[sigconf, anonymous]{acmart}
|
||||
\documentclass[sigconf]{acmart}
|
||||
|
||||
\usepackage{booktabs} % For formal tables
|
||||
\usepackage{setspace}
|
||||
|
@ -27,20 +27,16 @@
|
|||
\begin{document}
|
||||
\title{A Dataset of Duplicate Pull-requests in GitHub}
|
||||
|
||||
% \author{Zhixing Li, Yue Yu$^*$, Gang Yin, Tao Wang, Huaimin Wang}
|
||||
% \affiliation{%
|
||||
% \institution{College of Computer, National University of Defense Technology}
|
||||
% \city{Changsha, China}
|
||||
% \postcode{410073}
|
||||
% }
|
||||
% \email{{lizhixing15, yuyue, yingang, taowang2005, hmwang}@nudt.edu.cn}
|
||||
% \renewcommand{\shortauthors}{Z. Li et al.}
|
||||
\author{Yue Yu$^*$, Zhixing Li$^*$, Gang Yin, Tao Wang, Huaimin Wang}
|
||||
\affiliation{%
|
||||
\institution{College of Computer, National University of Defense Technology}
|
||||
\city{Changsha, China}
|
||||
\postcode{410073}
|
||||
}
|
||||
\email{{yuyue, lizhixing15, yingang, taowang2005, hmwang}@nudt.edu.cn}
|
||||
\renewcommand{\shortauthors}{Z. Li et al.}
|
||||
|
||||
|
||||
\author{Anonymous authors}
|
||||
\affiliation{Institutions}
|
||||
\email{Emails}
|
||||
|
||||
\input{0-abstract}
|
||||
%\begin{CCSXML}
|
||||
%<ccs2012>
|
||||
|
@ -61,11 +57,11 @@
|
|||
%\ccsdesc[500]{Software and its engineering~Collaboration in software development}
|
||||
|
||||
\keywords{duplicate pull-request, GitHub, dataset}
|
||||
|
||||
|
||||
\maketitle
|
||||
\renewcommand{\thefootnote}{}
|
||||
%\footnotetext{$^*$Corresponding author}
|
||||
\footnotetext{
|
||||
$^*$ Yue Yu and Zhixing Li are both first authors,
|
||||
and contributed equally to the work.}
|
||||
\renewcommand{\thefootnote}{\arabic{footnote}}
|
||||
|
||||
|
||||
|
|