\section{RELATED WORK}
\label{sec:RelatedW}
\subsection{Duplicate Detection}
Although very little work has studied duplicate detection for pull-requests,
many studies have investigated how to recognize duplicate bug reports.
The work of Runeson \etal~\cite{Runeson2007Detection} is one of the first such studies.
They evaluated how NLP techniques support the identification of duplicates and
found that these techniques can detect about 40\% of the marked duplicates.
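For illustration, the following is a minimal sketch of this kind of
NLP-based similarity check, using a bag-of-words model with TF-IDF weighting
and cosine similarity; the tokenization, weighting scheme, and toy data
below are simplifying assumptions on our part rather than the exact setup
of Runeson \etal:
\begin{verbatim}
# Minimal sketch (assumed setup): represent each report as a TF-IDF
# vector and rank existing reports by cosine similarity to the new one.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

existing = ["app crashes when saving a file",
            "UI freezes on startup",
            "crash on save with long file names"]
new_report = "application crash while saving files"

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(existing + [new_report])

# Similarity of the new report (last row) to each existing report.
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
ranked = sorted(zip(scores, existing), reverse=True)
print(ranked[:2])  # top candidates to flag as possible duplicates
\end{verbatim}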
Wang \etal~\cite{Wang2008} proposed an approach to detect duplicate bug reports
by comparing both the natural language information and the execution information
of a new report against those of the existing reports.
Sun \etal~\cite{Sun2010A} used discriminative models to detect duplicates, and
their evaluation on three large software bug repositories showed that their method
achieved improvements over methods based on natural language alone.
Later, Sun \etal~\cite{Sun2011Towards} proposed a retrieval function,
which fully utilizes the information available in a bug report, to measure
the similarity between two bug reports.
Nguyen \etal~\cite{Nguyen2012Duplicate} modeled each bug report as a textual document
and took advantage of both IR-based features and topic-based features
to learn the sets of different terms used to describe the same problems.
Thung \etal~\cite{Thung2014DupFinder} developed a tool that
implements the approach proposed by Runeson \etal~\cite{Runeson2007Detection}
and integrated it into existing bug tracking systems.
Lazar \etal~\cite{Lazar2014Improving} made use of a set of new textual features
and trained several binary classification models to improve detection performance.
Moreover, Zhang \etal~\cite{Zhang2015Multi} investigated how to
detect duplicate questions in Stack Overflow.
They measured the similarity of two questions by comparing both observable factors,
including the titles, descriptions, and tags of the questions,
and latent factors, namely the topic distributions learned from the questions' descriptions.
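The following is a hedged sketch of this general idea, combining similarity
scores over observable factors with a similarity over latent topic
distributions; the particular factors, distance measures, and weights are
illustrative assumptions on our part (the sketch also omits the description
text for brevity) rather than the exact model of Zhang \etal:
\begin{verbatim}
# Illustrative (assumed) combination of observable and latent factors
# for duplicate-question detection.
import math

def jaccard(a, b):
    """Overlap of two token/tag sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(p, q):
    """Cosine similarity of two topic distributions (e.g., from LDA)."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = (math.sqrt(sum(x * x for x in p))
            * math.sqrt(sum(x * x for x in q)))
    return dot / norm if norm else 0.0

def question_similarity(q1, q2, w=(0.4, 0.2, 0.4)):
    """Weighted mix of title, tag, and topic similarity (assumed weights)."""
    title_sim = jaccard(set(q1["title"].lower().split()),
                        set(q2["title"].lower().split()))
    return (w[0] * title_sim
            + w[1] * jaccard(q1["tags"], q2["tags"])
            + w[2] * cosine(q1["topics"], q2["topics"]))

q1 = {"title": "How to sort a dict by value",
      "tags": {"python", "sorting"}, "topics": [0.7, 0.2, 0.1]}
q2 = {"title": "Sorting a Python dictionary by its values",
      "tags": {"python", "dictionary"}, "topics": [0.6, 0.3, 0.1]}
print(question_similarity(q1, q2))  # higher score: more likely duplicates
\end{verbatim}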
\subsection{Pull-request \& Code Review}
Although research on pull-requests is in its early stages,
several relevant studies have been conducted.
Gousios \etal~\cite{Gousios:2014,Rigby:2014} conducted a statistical analysis
of millions of pull-requests from GitHub and analyzed the popularity of pull-requests,
the factors affecting the decision to merge or reject a pull-request,
and the time taken to merge a pull-request.
Tsay \etal~\cite{Tsay:2014b} examined how social
and technical information are used to evaluate pull-requests.
Yu \etal~\cite{Yu:2016} conducted a quantitative study of pull-request evaluation in the context of continuous integration (CI).
Moreover, Yu \etal~\cite{Yu:2015} proposed an approach that combines information retrieval
and social network analysis to recommend potential reviewers for a pull-request.
Veen \etal~\cite{Veen:2015} presented PRioritizer,
a prototype pull-request prioritization tool
that recommends the top pull-requests the project owner should focus on.

Code review is employed by many software projects to
examine changes made by others to the source code,
find potential defects,
and ensure software quality before the changes are merged~\cite{Baysal:2015,Mcintosh:***}.
Traditional code review,
also known as the code inspection proposed by Fagan~\cite{Fagan:1976},
has been performed since the 1970s.
However, its cumbersome and synchronous characteristics have hampered
its universal application in practice~\cite{Bacchelli:2013}.
With the emergence and development of version control systems and collaboration tools,
Modern Code Review (MCR)~\cite{Rigby:2011} has been adopted by many software companies and teams.
Different from formal code inspections,
MCR is a lightweight mechanism~\cite{t2015cq,Thongtanunam:2015}
that is less time-consuming and is supported by various tools.
Code review has been widely studied from several perspectives,
including the automation of review tasks~\cite{Thongtanunam:2015,jiang:2015,rahman2016correct,Lima},
the factors influencing review outcomes~\cite{Tsay:2014b,Baysal:2015,Baum:2016}, and
the problems found in code review~\cite{Bacchelli:2013,Beller:2014}.
The impact of code review on software quality~\cite{Mcintosh:***,Morales2015Do} has also been investigated by
many studies, in terms of code review coverage and participation~\cite{Mcintosh:2014}
as well as code ownership~\cite{T2016Revisiting}.
While the main motivation for code review was believed
to be finding defects to control software quality,
recent research has revealed that defect elimination is not the sole motivation.
Bacchelli \etal~\cite{Bacchelli:2013} reported additional expectations,
including knowledge transfer, increased team awareness,
and the creation of alternative solutions to problems.