minor
parent
00dca29521
commit
5f6eeaf735
|
@ -9,7 +9,7 @@ His work interests include open source software engineering, data mining, and kn
|
|||
is an associate professor in the College of Computer at National University of Defense Technology.
|
||||
He received his Ph.D. degree in Computer Science from the National University of Defense Technology (NUDT) in 2016.
|
||||
He has visited UC Davis with the support of a CSC scholarship.
|
||||
His research findings have been published on MSR, FSE, IST, ICSME APSEC and SEKE.
|
||||
His research findings have been published in MSR, FSE, IST, and ICSME.
|
||||
His current research interests lie in software engineering, including mining software repositories
|
||||
and analyzing social coding networks.
|
||||
|
||||
|
|
|
@ -60,10 +60,13 @@ based on an investigation of developers and projects at Microsoft.
|
|||
They found that although defect finding is still the main motivation for code reviews,
|
||||
defect-related comments comprise only a small proportion.
|
||||
Their study focused on commercial projects and
|
||||
the reviewers of the these projects are mainly the employees at Microsoft.
|
||||
It is very different from the projects using pull-based model in GitHub,
|
||||
which exist in a more transparent and public environment
|
||||
and the code reviewers of such projects come from the community.
|
||||
the reviewers of these projects are mainly the employees at Microsoft.
|
||||
However, projects using the pull-based model on GitHub are developed in a more transparent and open environment
|
||||
and code reviews are performed by community users.
|
||||
Community reviewers may work for different organizations, use the project for various purposes,
|
||||
and have their own considerations in the code review process.
|
||||
Therefore, different review topics may arise in the pull-based development model
|
||||
compared to those in the commercial development model.
|
||||
Tsay \etal~\cite{Tsay:2014a} explored issues raised around code contributions on GitHub.
|
||||
They reported that reviewers discuss both
|
||||
the appropriateness of the contribution proposal and the correctness of the implemented solution.
|
||||
|
|
|
@ -368,77 +368,6 @@ This binary feature refers to if a comment includes ping activity
|
|||
(occurring in the form of ``@ user-name'').
|
||||
|
||||
|
||||
\begin{table*}[ht]
|
||||
\scriptsize
|
||||
\caption{Complete taxonomy}
|
||||
\begin{tabular}{r r p{9.8cm}}
|
||||
% \toprule
|
||||
\hline
|
||||
\rowcolor[HTML]{000000}
|
||||
&&\\
|
||||
\rowcolor[HTML]{000000}
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-1 Categories}}} &
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-2 Subcategories}}} &
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Description \& Example}}}\\
|
||||
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{Code Improvement}\\(L1-1)}}
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Points out extra blank line, improper indention, inconsistent naming convention, etc. \\
|
||||
&\cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Style Checking\\(L2-1)}} &\cellcolor{lightgray}e.g., \emph{``scissors: this blank line''}\\
|
||||
|
||||
|
||||
&\multirow{3}*{\tabincell{r}{Defect Detecting\\(L2-2)}} & Figures out runtime program errors or evolvability defects and etc.\\
|
||||
& &e.g., \emph{``''he default should be `false`''} and \emph{``let's extract this into a constant. No need to initialize it on every call.''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Demands submitter to provide test case for changed codes, report test result, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Code Testing\\(L2-3)}} &\cellcolor{lightgray}e.g., \emph{``this PR will need a unit test, I'm afraid, before it can be merged.''}\\
|
||||
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{PR Decision-making}\\(L1-2)}}
|
||||
&\multirow{2}*{\tabincell{r}{Value Affirming\\(L2-4)}} &Satisfied with the pull-request and agree to merge it\\
|
||||
& &e.g., \emph{``PR looks good to me. Can you \dots''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Rejects to merge the pull-request for duplicate proposal, undesired feature, etc.\\
|
||||
& \cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Solution Disagreeing\\(L2-5)}} & \cellcolor{lightgray} e.g., \emph{``I do not think this is a feature we'd like to accept. \dots''}\\
|
||||
|
||||
&\multirow{3}*{\tabincell{r}{Further Questioning\\(L2-6)}} & Confused with the purpose of the pull-request and ask for more details or use cases\\
|
||||
& &e.g., \emph{``Can you provide a use case for this change?''}\\
|
||||
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{Project Management}\\(L1-3)}}
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray} States what type of changes a specific version is expected to merge, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Roadmap Managing\\(L2-7)}} &\cellcolor{lightgray}e.g., \emph{``Closing as 3-2-stable is security fixes only now''}\\
|
||||
|
||||
&\multirow{2}*{\tabincell{r}{Reviewer Assigning\\(L2-8)}} &Ping other one(s) to review this pull-request\\
|
||||
& &e.g., \emph{``/cc @fxn can you take a look please?''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Needs to squash or rebase the commits, formulate the message, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Convention Checking\\(L2-9)}} &\cellcolor{lightgray}e.g., \emph{``\dots Can you squash the two commits into one? \dots''}\\
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\multirow{4}*{\tabincell{r}{\textbf{Social Interaction}\\(L1-4)}}
|
||||
&\multirow{3}*{\tabincell{r}{Politely Responding\\(L2-10)}} & Thanks for what other people do, apologize for mistakes, etc.\\
|
||||
& &e.g.,\emph{``Thank you. This feature was already proposed and it was rejected. See \#xxx''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray} Agrees with others' opinion, compliments others' work, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Contribution Encouraging\\(L2-11)}} &\cellcolor{lightgray}e.g.,\emph{``:+1: nice one @cristianbica.''}\\
|
||||
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\textbf{Others} &\multicolumn{2}{l}{Short sentence without clear or exact meaning like \textit{``@xxxx will do''} and \textit{``The same here :)''}} \\
|
||||
|
||||
|
||||
\bottomrule
|
||||
\multicolumn{3}{l}{\emph{Note: word beginning and ending with colon like ``:scissors:'' is a markdown grammar for emoji in GitHub}}
|
||||
\end{tabular}
|
||||
\label{tab:taxonomy}
|
||||
\end{table*}
|
||||
|
||||
\texttt{\small{Code\_inclusion:}}
|
||||
% \textit{Code\_inclusion:}
|
||||
This binary feature denotes whether a comment includes code elements.
|
||||
|
@ -485,5 +414,5 @@ based on this automatically classified dataset.
|
|||
In total, we classified 147,367 comments of 27,339 pull-requests.
|
||||
Based on this dataset, we conducted a preliminary quantitative study.
|
||||
We first explored the distribution of review comments on each category
|
||||
and then we reported some of the interesting findings derived from the distribution.
|
||||
and then we reported some of the findings derived from the distribution.
|
||||
|
||||
|
|
|
@ -3,6 +3,77 @@
|
|||
|
||||
\subsection{RQ1: What is the taxonomy for review comments on pull-requests?}
|
||||
|
||||
\begin{table*}[ht]
|
||||
\scriptsize
|
||||
\caption{Complete taxonomy}
|
||||
\begin{tabular}{r r p{9.8cm}}
|
||||
% \toprule
|
||||
\hline
|
||||
\rowcolor[HTML]{000000}
|
||||
&&\\
|
||||
\rowcolor[HTML]{000000}
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-1 Categories}}} &
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-2 Subcategories}}} &
|
||||
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Description \& Example}}}\\
|
||||
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{Code Improvement}\\(L1-1)}}
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Points out extra blank lines, improper indentation, inconsistent naming conventions, etc. \\
|
||||
&\cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Style Checking\\(L2-1)}} &\cellcolor{lightgray}e.g., \emph{``:scissors: this blank line''}\\
|
||||
|
||||
|
||||
&\multirow{3}*{\tabincell{r}{Defect Detecting\\(L2-2)}} & Identifies runtime program errors, evolvability defects, etc.\\
|
||||
& &e.g., \emph{``the default should be `false`''} and \emph{``let's extract this into a constant. No need to initialize it on every call.''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Asks the submitter to provide test cases for the changed code, report test results, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Code Testing\\(L2-3)}} &\cellcolor{lightgray}e.g., \emph{``this PR will need a unit test, I'm afraid, before it can be merged.''}\\
|
||||
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{PR Decision-making}\\(L1-2)}}
|
||||
&\multirow{2}*{\tabincell{r}{Value Affirming\\(L2-4)}} &Is satisfied with the pull-request and agrees to merge it\\
|
||||
& &e.g., \emph{``PR looks good to me. Can you \dots''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Refuses to merge the pull-request because of a duplicate proposal, an undesired feature, etc.\\
|
||||
& \cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Solution Disagreeing\\(L2-5)}} & \cellcolor{lightgray} e.g., \emph{``I do not think this is a feature we'd like to accept. \dots''}\\
|
||||
|
||||
&\multirow{3}*{\tabincell{r}{Further Questioning\\(L2-6)}} & Is confused about the purpose of the pull-request and asks for more details or use cases\\
|
||||
& &e.g., \emph{``Can you provide a use case for this change?''}\\
|
||||
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\multirow{6}*{\tabincell{r}{\textbf{Project Management}\\(L1-3)}}
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray} States what type of changes a specific version is expected to merge, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Roadmap Managing\\(L2-7)}} &\cellcolor{lightgray}e.g., \emph{``Closing as 3-2-stable is security fixes only now''}\\
|
||||
|
||||
&\multirow{2}*{\tabincell{r}{Reviewer Assigning\\(L2-8)}} &Pings other developer(s) to review the pull-request\\
|
||||
& &e.g., \emph{``/cc @fxn can you take a look please?''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray}Asks the submitter to squash or rebase the commits, reword the commit message, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Convention Checking\\(L2-9)}} &\cellcolor{lightgray}e.g., \emph{``\dots Can you squash the two commits into one? \dots''}\\
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\multirow{4}*{\tabincell{r}{\textbf{Social Interaction}\\(L1-4)}}
|
||||
&\multirow{3}*{\tabincell{r}{Politely Responding\\(L2-10)}} & Thanks others for their work, apologizes for mistakes, etc.\\
|
||||
& &e.g.,\emph{``Thank you. This feature was already proposed and it was rejected. See \#xxx''}\\
|
||||
|
||||
&\cellcolor{lightgray} &\cellcolor{lightgray} Agrees with others' opinion, compliments others' work, etc.\\
|
||||
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Contribution Encouraging\\(L2-11)}} &\cellcolor{lightgray}e.g.,\emph{``:+1: nice one @cristianbica.''}\\
|
||||
|
||||
%%%%%%
|
||||
% \midrule
|
||||
\hline
|
||||
\textbf{Others} &\multicolumn{2}{l}{Short sentences without a clear or exact meaning, such as \textit{``@xxxx will do''} and \textit{``The same here :)''}} \\
|
||||
|
||||
|
||||
\bottomrule
|
||||
\multicolumn{3}{l}{\emph{Note: a word beginning and ending with a colon, such as ``:scissors:'', is Markdown syntax for an emoji on GitHub}}
|
||||
\end{tabular}
|
||||
\label{tab:taxonomy}
|
||||
\end{table*}
|
||||
|
||||
Table \ref{tab:taxonomy} shows the complete taxonomy.
|
||||
We identified four Level-1 categories,
|
||||
namely \textit{Code Correctness (L1-1)},
|
||||
|
@ -65,13 +136,20 @@ its large-scale sampling, thorough discussion and multiple validation.
|
|||
Therefore, we exclude them from our analysis and do not adjust our taxonomy.
|
||||
|
||||
|
||||
\begin{framed}
|
||||
\noindent
|
||||
\textbf{RQ1:} {}
|
||||
\textit{From the qualitative study, we identified a two-level taxonomy for
|
||||
\textbf{In Summary.}
|
||||
From the qualitative study, we identified a two-level taxonomy for
|
||||
review comments which consists of 4 categories in Level 1
|
||||
and 11 sub-categories in Level 2.}
|
||||
\end{framed}
|
||||
and 11 sub-categories in Level 2.
|
||||
|
||||
|
||||
% \begin{framed}
|
||||
% \noindent
|
||||
% \textbf{RQ1:} {}
|
||||
% \textit{From the qualitative study, we identified a two-level taxonomy for
|
||||
% review comments which consists of 4 categories in Level 1
|
||||
% and 11 sub-categories in Level 2.}
|
||||
% \end{framed}
|
||||
|
||||
\subsection{RQ2: Is it possible to automatically classify
|
||||
review comments according to the defined taxonomy?}
|
||||
|
@ -414,18 +492,16 @@ it is not sufficient to differentiate the two types of comments.
|
|||
We plan to address the issue by extending the manually labeled dataset
|
||||
and introducing sentiment analysis.
|
||||
|
||||
\begin{framed}
|
||||
\noindent
|
||||
\textbf{RQ2:}
|
||||
\textit{
|
||||
\textbf{In Summary.}
|
||||
Compared with the baseline,
|
||||
\TSHC achieves higher performance
|
||||
in terms of the weighted average F-measure, namely
|
||||
0.78 in Rails, 0.82 in Elasticsearch,
|
||||
and 0.75 in Angular.js,
|
||||
which indicates that \TSHC is applicable
|
||||
in practical automatic classification.}
|
||||
\end{framed}
|
||||
in practical automatic classification.
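
% Illustrative note; the notation below is assumed for this sketch and is not taken from the original text.
For reference, assuming the weighted average F-measure follows its usual definition
(per-category F-measures averaged with weights proportional to category size),
it can be written as
\[
  F_{w} \;=\; \sum_{c \in C} \frac{n_c}{N}\, F_c,
  \qquad
  F_c \;=\; \frac{2\, P_c\, R_c}{P_c + R_c},
\]
where $C$ is the set of categories, $n_c$ is the number of comments whose true category is $c$,
$N = \sum_{c \in C} n_c$ is the total number of comments,
and $P_c$ and $R_c$ are the precision and recall obtained for category $c$.
Under this reading, categories with many comments dominate the reported score,
so a high value does not by itself guarantee good performance on rare categories.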
|
||||
|
||||
|
||||
|
||||
|
||||
|
@ -478,7 +554,7 @@ which has the larger data size and therefore makes our findings more convincing.
|
|||
|
||||
|
||||
|
||||
\subsubsection{Cost-free Comments are Less frequent }
|
||||
\subsubsection{Cost-free comments are less frequent}
|
||||
It is strange that although inspecting code style (L2-1)
|
||||
and testing code (L2-3) cost reviewers very little effort,
|
||||
as they do not require much understanding of the code changes,
|
||||
|
@ -588,7 +664,7 @@ This reality justifies the necessary of manual code review
|
|||
even though an increasing number of automatic tools are emerging.
|
||||
|
||||
|
||||
\subsubsection{Convention are not always maintained}
|
||||
\subsubsection{Conventions are sometimes overlooked}
|
||||
|
||||
\begin{figure*}[htbp]
|
||||
\centering
|
||||
|
@ -601,7 +677,7 @@ even although an increasing number of automatic tools are coming into being.
|
|||
|
||||
Although most projects set specific development conventions
|
||||
that do not require much effort to maintain,
|
||||
they are, somehow, ignored by contributors which can be seen from the following examples.
|
||||
they are sometimes overlooked by contributors, as can be seen from the following examples.
|
||||
|
||||
|
||||
\textbf{Example\_1:}
|
||||
|
@ -660,14 +736,10 @@ Another alternative solution for GitHub is to provide automatic reviewing tools
|
|||
that can be configured with predefined convention rules
|
||||
and triggered when a new pull-request is submitted.
|
||||
|
||||
\begin{framed}
|
||||
\noindent
|
||||
\textbf{RQ3:}
|
||||
\textit{
|
||||
\textbf{In Summary.}
|
||||
Most comments discuss code correction and social interactions.
|
||||
External contributors are more likely to
|
||||
break project conventions in their early contributions,
|
||||
and their pull-requests might contain potential issues,
|
||||
even though they have passed the tests.
|
||||
}
|
||||
\end{framed}
|
BIN
Response.docx
Binary file not shown.