*.fls
*.fdb_latexmk
*.gz
*.aux
*.pdf
*.log
*.bbl
*.blg
*(busy)
*DS_Store
~*
\begin{abstract}
The pull-based model, widely used in distributed software development,
allows any contributor to fork a public repository, package contributions
as a pull-request, and then merge them back to the original repository.
Code review, as a well-established software quality practice,
ensures that only high-quality pull-requests are accepted,
based on the in-depth discussion among reviewers.
{\color{red}
With the evolution of collaboration tools and environments,
development activities in communities like GitHub have become more transparent and social.
Various participants take part in the review discussions, and their conversations
are not limited to improving code contributions
but also cover project evolution and social interaction.}
A comprehensive understanding of the reviewers' motivations
for joining the discussions would be useful to
better organize and optimize the socialized code review process.

In this paper, we first conducted a case study on three popular
open-source software projects hosted on GitHub
and constructed a fine-grained taxonomy covering 11 categories
(\eg error detecting, reviewer assigning, contribution encouraging, \etc)
for the review comments generated in the discussions.
Then, we manually labeled over 5,600 review comments
according to the defined taxonomy,
and proposed a Two-Stage Hybrid Classification (TSHC) algorithm
to classify review comments automatically
by combining rule-based and machine-learning techniques.
Comparative experiments with a text-based method
show a reasonable improvement on each project
(9.2\% in Rails, 5.3\% in Elasticsearch, and 7.2\% in Angular.js respectively)
in terms of the weighted average F-measure.
Moreover, in a preliminary quantitative analysis of
a large set of labeled review comments,
we found that the three projects exhibit a similar distribution of comments across categories
and that most comments discuss {\color{red}code improvement and social interactions.}
We also found that pull-requests submitted by contributors
who lack proficient development skills
tend to contain potential issues even though they have passed the tests.
Furthermore, external contributors are more likely to break project conventions
in their early contributions.
\end{abstract}
\section{Introduction}
%Pull-request
The pull-based development model is becoming increasingly popular in distributed collaboration
for open-source software (OSS) development~\cite{Barr:2012,Gousios:2014,Gousios:2014b,Gousios:2016}.
On GitHub~\footnote{\url{https://github.com/}} alone,
the largest social-coding community, nearly half of the collaborative projects
(already over 1 million~\cite{Gousios:2016} in January 2016) have adopted this model.
In pull-based development,
any contributor can freely \emph{fork} (\ie clone) an interesting public project
and modify the forked repository locally (\eg fixing bugs or adding new features)
without asking for access to the central repository.
When the changes are ready to be merged back to the master branch,
the contributor submits a \emph{pull-request},
and a rigorous code review process is then performed
before the pull-request gets accepted.

%Review
Code review is a communication channel where integrators,
who are core members of a project, can express their concerns
about a contribution~\cite{Tsay:2014a,Gousios:2014b,Marlow:2013,yu2015wait}.
If they doubt the quality of a submitted pull-request,
integrators make comments that
{\color{red}ask the contributor to improve the implementation}.
With pull-requests becoming increasingly popular,
most large OSS projects allow for crowdsourcing of pull-request reviews
to a large number of external developers~\cite{Yu:2015,yu2014reviewer}
to reduce the workload of integrators.
{\color{red} These external developers are interested in the projects and
concerned about their development.}
After receiving the review comments, the contributor usually responds positively and
updates the pull-request for another round of review.
Thereafter, the responsible integrator decides to accept or reject the pull-request
by taking all judgments and changes into consideration.

Previous studies~\cite{Tsay:2014b,Yu:2016,yu2015wait,t2015cq} have shown that
code review, as a well-established software quality practice,
is one of the most significant stages in software development.
It ensures that only high-quality code changes are accepted,
based on the in-depth discussion among reviewers.
{\color{red}With the evolution of collaboration tools and environments~\cite{Storey2014The,Zhu2016Effectiveness},
development activities in communities like GitHub have become more transparent and social.
Various participants take part in the discussion, and their conversations
are not limited to improving code contributions
but also cover project evolution and social interaction.}
A comprehensive understanding of the reviewers' motivations
for joining the discussions would be useful to
better organize and optimize code review processes,
such as reviewer recommendation~\cite{Yu:2015,Thongtanunam:2015,jiang:2015,rahman2016correct}
and pull-request prioritization~\cite{Veen:2015}.

In this paper, we first conducted a case study on three popular
open-source software projects hosted on GitHub
and constructed a fine-grained taxonomy covering 11 categories
(\eg {\color{red}defect detecting}, reviewer assigning, contribution encouraging, \etc)
for the review comments generated in the discussions.
Second, we manually labeled over 5,600 review comments
according to the defined taxonomy.
With this dataset, we propose \TSHC, a two-stage hybrid classification algorithm
that automatically classifies review comments
by combining rule-based and machine-learning (ML) techniques.
Moreover, we conducted a preliminary quantitative analysis on
a large set of {\color{red}automatically classified review comments}.

Thus, the main contributions of this study include the following:

\begin{itemize}
\item A fine-grained and multilevel taxonomy for review comments
in the pull-based development model is provided
in relation to technical, management, and social aspects.
\item A high-quality manually labeled dataset of review comments,
which contains more than 5,600 items, can be accessed via
a web page~\footnote{We will publish it after the paper has been accepted.}
and used in further studies.
\item A high-performance {\color{red}automatic classification model for
review comments of pull-requests} is proposed.
The model leads to a significant improvement in terms of the weighted average F-measure,
\ie 9.2\% in Rails, 5.3\% in Elasticsearch, and 7.2\% in Angular.js,
compared with the text-based method.
\item Typical review patterns are explored, {\color{red}which provide implications
for code reviewers and collaborative development platforms (\eg GitHub)}.
% for collaborative development platforms to improve their service
% and tools in order to better
% satisfy the practical needs in code review.
\end{itemize}
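The weighted average F-measure referred to above can be computed by weighting each category's F1 score by its support (number of comments in that category). A minimal sketch, where the function and argument names are our own rather than the paper's:

```python
def weighted_f_measure(per_class):
    """per_class: list of (precision, recall, support) tuples,
    one per comment category. Returns the support-weighted mean F1."""
    total = sum(s for _, _, s in per_class)
    acc = 0.0
    for p, r, s in per_class:
        # harmonic mean of precision and recall, guarding the zero case
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        acc += f1 * (s / total)
    return acc
```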

The rest of this paper is organized as follows.
Section~\ref{sec:bg} provides the research background
and introduces the research questions.
In Section~\ref{sec:approach}, we elaborate on the approach of our study.
Section~\ref{sec:result} presents the research results,
while Section~\ref{sec:Threats} discusses several threats to the validity of our study.
Section~\ref{sec:Concl} concludes this paper and gives an outlook on future work.

% Several studies have been conducted to explore how the code review process influences
% pull-request acceptance~\cite{Tsay:2014b,Yu:2016} and latency~\cite{yu2015wait},
% and software release quality~\cite{t2015cq}.
% They found that code review, as a well-established software quality practice,
% is one of the most significant stages in pull-based development.
\section{Background and Related Work}
\label{sec:bg}

\subsection{Pull-based development model}
In GitHub, a growing number of developers contribute to open-source projects
by using the pull-request mechanism~\cite{Gousios:2014,Yu:2014}.
As illustrated in Figure~\ref{fig:github_wf},
a typical contribution process based on the pull-based development model in GitHub
involves the following steps.

\begin{figure}[ht]
\centering
\includegraphics[width=9cm]{resources/Fig-1.png}
\caption{Pull-based workflow on GitHub}
\label{fig:github_wf}
\end{figure}

\emph{Fork:}
A contributor can find an interesting project
by following several well-known developers and watching their projects.
Before contributing, the contributor has to fork the original project.

\emph{Edit:}
After forking, the contributor can edit locally
without disturbing the main branch of the original repository.
He is free to do whatever he wants,
such as implementing a new feature
or fixing bugs in the cloned repository.

\emph{Pull Request:}
When his work is finished,
the contributor submits the changed code from the forked repository
to its source via a pull-request.
Besides the commits, the submitter needs to provide a title and description
to elaborate on the objective of his pull-request.

\emph{Test:}
Several reviewers play the role of testers
to ensure that the pull-request does not break the current runnable state.
They check the submitted changes by manually running the patches locally
or in an automated manner with the help of continuous integration (CI) services.

\emph{Review:}
All developers in the community have the chance
to discuss the pull-request in the issue tracker,
informed by the pull-request description, changed files, and test results.
After receiving feedback from reviewers,
the contributor updates his pull-request by attaching new commits
for another round of review.

\emph{Decide:}
A responsible manager of the core team considers all the reviewers' opinions
and merges or rejects the pull-request.

Although research on pull-requests is in its early stages,
several relevant studies have been conducted
in terms of exploratory analysis, priority determination,
reviewer recommendation, \etc.
%Pull-request determination
Gousios \etal~\cite{Gousios:2014,Rigby:2014} conducted a statistical analysis
of millions of pull-requests from GitHub and analyzed the popularity of pull-requests,
the factors affecting the decision to merge or reject a pull-request,
and the time to merge a pull-request.
Their research results imply that
there is no difference between core members and outside contributors
in getting their pull-requests accepted, and that
only 13\% of the pull-requests are {\color{red}rejected} for technical reasons.
Tsay \etal~\cite{Tsay:2014b} examined how social
and technical information are used to evaluate pull-requests.
They found that reviewers take into account both
the contributors' technical practices
and the social connection between contributors and reviewers.
Yu \etal~\cite{Vasilescu:B,Yu:2016} conducted a quantitative study
and discovered that latency is a complex issue to explain adequately,
and that CI is a dominant factor in the process,
which even changes the effects of some traditional predictors.
Lima \etal~\cite{Lima} and Yu \etal~\cite{Yu:2015} proposed approaches to automatically
recommend potential pull-request reviewers,
applying the Random Forest algorithm and the Vector Space Model respectively.
Veen \etal~\cite{Veen:2015} presented PRioritizer,
a prototype pull-request prioritization tool,
which recognizes the top pull-requests the core members should focus on.

% Possibly reusable in the evaluation section
% Add one more on pull-request prioritization
% Gousios et al.~\cite{Gousios:2014b} conducted a survey with hundreds of top integrators (core team) in GitHub and analyzed the challenges pull-request reviewers are faced with. Their key findings are that integrators struggle to maintain the quality of their projects and have difficulties with prioritizing pull-requests.
% Veen et al.~\cite{Veen:2015} presented PRioritizer, a prototype pull-request prioritization tool, which recommends the top pull-requests the project owner should focus on. Tsay et al.~\cite{Tsay:2014a} also explored how reviewers evaluate contributions in extended discussions and what issues are raised around the contributions.
% Possibly reusable in the evaluation section

\subsection{Code review}
Code review is employed by many software projects to
examine the changes made by others to the source code,
find potential defects,
and ensure software quality before the changes are merged back~\cite{Baysal:2015,Mcintosh:***}.
Traditional code review,
also well known as code inspection as proposed by Fagan~\cite{Fagan:1976},
has been performed since the 1970s~\cite{Ackerman:1989,Kollanus:2009,Bacchelli:2013}.
The inspection process consists of well-defined steps,
which are executed one by one in group meetings~\cite{Fagan:1976}.
Much evidence has proven the value of code inspection
in software development~\cite{Frank:1984,Ackerman:1989,Russell:1991,Aurum:2002}.
However, its cumbersome and synchronous characteristics have hampered
its universal application in practice~\cite{Bacchelli:2013}.
On the other hand,
with the evolution of software development models,
this approach is also often misapplied,
resulting in poor outcomes according to Rigby's exploration~\cite{Rigby:**}.

With the emergence and development of version control systems
and collaboration tools,
Modern Code Review (MCR)~\cite{Rigby:2011} has been adopted by many software companies and teams.
Different from formal code inspections,
MCR is a lightweight mechanism that is less time-consuming and supported by various tools.
Rigby \etal~\cite{Rigby:2006,Rigby:2008,Rigby:2011} explored the MCR process
in open-source communities.
They performed case studies on GCC, Linux, Mozilla, and Apache
and found several code review patterns as well as
a broadcast-based code review style used by Apache.
Baum \etal~\cite{Baum:2016} and Bacchelli \etal~\cite{Bacchelli:2013} analyzed MCR practice
in commercial software development teams in order to
improve the use of code reviews in industry.
While the main motivation for code review was believed
to be finding defects to control software quality,
recent research~\cite{Bacchelli:2013,Mcintosh:2014} has revealed
additional expectations,
including knowledge transfer, increased team awareness,
and the creation of alternative solutions to problems.
Moreover, the participation rate in code review reported in the study
conducted by Ciolkowski \etal~\cite{Ciolkowski:2003}
shows that the figure has increased over the years.
Other studies~\cite{Baysal:2015,Rigby:2011,Baysal:2013,Baum:2016} also
investigated factors that influence the outcomes of the code review process.

{\color{red}
In recent years, collaboration tools have evolved with social media~\cite{Storey2014The,Zhu2016Effectiveness}.
In particular, GitHub integrates the function of code review into the pull-based model
and makes it more transparent and social~\cite{Storey2014The}.}
The evaluation of pull-requests in GitHub is a form of MCR~\cite{Beller:2014,Bacchelli:2013}.
The way to give one's voice in pull-request evaluation is to
participate in the discussion and leave comments.
{\color{red}
Tsay \etal~\cite{Tsay:2014a} analyzed highly discussed pull-requests,
which have extended discussions, and explored how code contributions are evaluated through discussion in GitHub.
Gousios \etal~\cite{Gousios:2014b} investigated the challenges faced by integrators in GitHub
by conducting a survey involving 749 integrators.}

Two kinds of comments can be identified by their position in the discussion:
issue comments (\ie general comments) on the overall contribution
and inline comments on specific lines of code in the changes~\cite{Yu:2015,Thongtanunam:2016}.
{\color{red}
Actually, there is a third kind of review comment in GitHub: commit comments,
which are made on the commits of pull-requests~\cite{zhang2017social}.
Among these three kinds of comments,
general comments and inline comments account for the vast majority,
and the number of commit comments is extremely small.
Consequently, we only focus on general comments and inline comments
and do not take commit comments into consideration.
}
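In GitHub's REST API, the two kinds of comments we study live behind different endpoints: general comments are served by the issues API, while inline review comments are served by the pulls API. The sketch below builds those endpoint URLs; the helper name is our own, and the paths follow GitHub's documented v3 API.

```python
def comment_endpoints(owner, repo, pr_number):
    """Return the GitHub REST API endpoints for the two comment kinds
    studied here (commit comments are excluded, as in the paper)."""
    base = f"https://api.github.com/repos/{owner}/{repo}"
    return {
        # general (issue) comments on the pull-request as a whole
        "issue_comments": f"{base}/issues/{pr_number}/comments",
        # inline review comments attached to specific diff lines
        "inline_comments": f"{base}/pulls/{pr_number}/comments",
    }
```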

\begin{figure}[ht]
% https://github.com/rails/rails/pull/12150
\centering
\includegraphics[width=8cm]{resources/Fig-2.png}
\caption{Example comments on GitHub}
\label{fig:comment}
\end{figure}

% \emph{Visualization:}
As can be seen from Figure~\ref{fig:comment},
all the review comments corresponding to a pull-request
are displayed and ordered primarily by creation time.
Issue comments are directly visible,
while inline comments are folded by default and shown when the toggle button is clicked.

% \emph{Participant:}
Due to the transparent environment,
a large number of external developers,
who are concerned about the development of the corresponding project,
are allowed to participate in the evaluation of any pull-request
of a public repository.
All the participants can ask the submitter to present the pull-request more clearly
or improve the solution.
A prior study~\cite{Gousios:2014} also reveals that, in most projects,
more than half of the participants are external developers,
while the majority of the comments come from core team members.
The diversity of participants and their different concerns
in pull-request evaluation
produce various types of review comments,
which cover a wide range of motivations
from solution details to contribution appropriateness.

\subsection{Research Questions}
In this paper, we focus on analyzing the review comments in GitHub.
Although prior work~\cite{Gousios:2014b,Tsay:2014a}
has identified several kinds of issues and challenges brought by pull-requests,
we believe that the motivations of reviewers for joining code review
are not yet well understood;
in particular, the underlying taxonomy of review comments has not been identified.
% With an increasing number of projects
% applying pull-based development model and organizing pull-requests evaluation
% in a free form,
% it is important to identify the underlying taxonomy of review comments,
% which can further improve some code review practice
% like reviewer recommendation and pull-request prioritization.
Consequently, our first question is:

\textbf{RQ1.} What is the taxonomy for review comments on pull-requests?

Subsequently, we are interested in exploring {\color{red}whether there is an automatic way to
classify review comments according to the defined taxonomy}. Therefore, our second question is:

\textbf{RQ2.} Is it possible to automatically classify review comments
according to the defined taxonomy?

Among the large set of review comments,
several review patterns are worth {\color{red}exploring},
such as what most comments are talking about
and what kinds of issues have been raised by contributors.
This exploration is needed to guide our future work and leads to our last research question:

\textbf{RQ3.} What are the typical review patterns among the reviewers' discussions?

\section{Approach}
\label{sec:approach}

\subsection{Approach Overview}
% \hl{The goals of our work are to }
% build a taxonomy for review comments in the pull-based development model
% and automate the comment classification according to the defined taxonomy.
% % !!!!!!!!!!!!!!!! rewrite as motivation
% We aim to provide the foundation for further study on
% the optimization of socialized code review process.

The goal of our work is to
{\color{red}
investigate the code review practice in GitHub.
We aim to build a fine-grained taxonomy and systematically analyze the review comments in depth.}
As illustrated in Figure~\ref{fig:approach},
our research approach consists of the following steps.

\begin{figure}[ht]
\centering
\includegraphics[width=8.5cm]{resources/Fig-3.png}
\caption{The overview of our approach}
\label{fig:approach}
\end{figure}

\begin{enumerate}[1)]

\item \textit{Data collection}:
% !!!!!!!!!!!! double-blind review
% In our prior work~\cite{Yu:2015b,yu2015wait},
In our prior work,
we collected {\color{red}the 4,896 projects with the largest numbers of pull-requests on GitHub.}
This dataset was crawled through the official API offered by GitHub
and has been updated in the current study.
In addition, we also use the data released by
GHTorrent\footnote{http://ghtorrent.org}.

\item \textit{Taxonomy definition}:
We constructed a fine-grained, multilevel taxonomy for the review comments
in the pull-based development model.

\item \textit{Manual labeling}:
We manually labeled a set of review comments
according to the defined taxonomy.

\item \textit{Automatic classification}:
We proposed the TSHC algorithm,
which automatically classifies review comments
using rule-based and ML techniques.

\item \textit{Quantitative analysis}:
We conducted a preliminary quantitative analysis of a large set of
review comments labeled by the TSHC algorithm.

\end{enumerate}
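The two-stage idea behind TSHC (hand-crafted rules first, an ML classifier as fallback) can be sketched as follows. This is only an illustration of the combination, not the paper's actual implementation: the rule patterns, category names, and the classifier stub are our own assumptions.

```python
import re

# Stage 1: hand-crafted rules (illustrative patterns, not the paper's).
RULES = [
    (re.compile(r"\bLGTM\b|looks good to me", re.I), "contribution encouraging"),
    (re.compile(r"@[\w-]+\s+(can you|please)\s+review", re.I), "reviewer assigning"),
    (re.compile(r"\b(bug|error|fail|broken)\b", re.I), "error detecting"),
]

def tshc_classify(comment, ml_classifier):
    """Two-stage hybrid classification: try the rules first,
    then fall back to a trained ML model for everything else."""
    for pattern, category in RULES:
        if pattern.search(comment):
            return category
    # Stage 2: e.g. a model trained on TF-IDF features of labeled comments
    return ml_classifier(comment)
```

In a real pipeline the fallback would be a trained text classifier; here any callable taking a comment string stands in for it.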

\subsection{Dataset}

\begin{table*}[ht]
\scriptsize
\centering
\caption{Dataset of our experiments}
\begin{tabular}{r c c r c c c c c}
% \toprule
\hline
\rowcolor[HTML]{000000}
{\color[HTML]{FFFFFF}\textbf{Projects}} &
{\color[HTML]{FFFFFF}\textbf{Language}} &
{\color[HTML]{FFFFFF}\textbf{Application Area}} &
{\color[HTML]{FFFFFF}\textbf{Hosted\_at}} &
{\color[HTML]{FFFFFF}\textbf{\#Star}} &
{\color[HTML]{FFFFFF}\textbf{\#Fork}} &
{\color[HTML]{FFFFFF}\textbf{\#Contr}} &
{\color[HTML]{FFFFFF}\textbf{\#PR}} &
{\color[HTML]{FFFFFF}\textbf{\#Comnt}} \\
% \midrule
\hline
\textbf{Rails} & Ruby & Web Framework & May 20, 2009 & 33906 & 13789 & 3194 & 14648 & 75102 \\
\textbf{Elasticsearch} & Java & Search Server & Feb. 8, 2010 & 20008 & 6871 & 753 & 6315 & 38930 \\
\textbf{Angular.js} & JavaScript & Front-end Framework & Jan. 6, 2010 & 54231 & 26930 & 1557 & 6376 & 33335 \\
\bottomrule
\multicolumn{9}{l}{\emph{Abbreviations}: Contr: Contributor,
PR: Pull-request, Comnt: Comment}
\end{tabular}
\label{tab:dataset}
\end{table*}

The experiments in this paper are conducted on
three representative open-source projects,
namely Rails, Elasticsearch, and Angular.js,
which make heavy use of pull-requests in GitHub.
The dataset is composed of two sources:
the GHTorrent dump released in June 2016
and our own data crawled from GitHub.
{\color{red}
From GHTorrent, we can get a list of projects together with their basic information,
such as programming language, hosting time, number of forks, and the list of pull-requests.
However, GHTorrent does not provide the text content of
pull-requests (\ie title and description) or review comments,
so we have to crawl this information according to the URLs provided by GHTorrent.
Finally, the two sources are linked by project ID and pull-request number.}
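The linking step above amounts to a join on the (project ID, pull-request number) key. A minimal sketch, where the record field names are illustrative assumptions rather than GHTorrent's actual schema:

```python
def link_sources(ghtorrent_prs, crawled_texts):
    """Join GHTorrent pull-request records with crawled text content
    on the (project id, pull-request number) key."""
    crawled = {(c["project_id"], c["pr_number"]): c for c in crawled_texts}
    linked = []
    for pr in ghtorrent_prs:
        key = (pr["project_id"], pr["pr_number"])
        if key in crawled:
            merged = dict(pr)
            merged.update(crawled[key])  # adds title, description, comments
            linked.append(merged)
    return linked
```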
Table~\ref{tab:dataset} lists the key statistics of the dataset.

We studied original projects (\ie not forked from others)
written in Ruby, Java, and JavaScript,
which are three of the most popular languages\footnote{https://github.com/blog/2047-language-trends-on-github} on GitHub.
The three selected projects were hosted on GitHub at an early time
and are applied in different areas:
Rails is used to build websites,
Elasticsearch acts as a search server,
and Angular.js is an outstanding front-end development framework.
This ensures the diversity of the experimental projects,
which is needed to increase the generalizability of our work.
The numbers of stars, forks, and contributors
indicate the popularity of a project.
Starring is similar to a bookmarking system,
which informs users of the latest activities of the projects they have starred.

In total, our dataset contains 27,339 pull-requests and 147,367 review comments.

\subsection{Taxonomy Definition}
\label{td}
Previous work has studied the challenges faced by pull-request reviewers
and the issues introduced by pull-request submitters~\cite{Gousios:2014b,Tsay:2014a}.
Inspired by their work, we decided to comprehensively observe
the motivations of reviewers for joining code review in depth
rather than merely focusing on technical and nontechnical perspectives.

% !!!!!!!!!! card sort
We conducted a card sort~\cite{Bacchelli:2013}
to determine the taxonomy scheme,
which was executed manually through an iterative process
of reading and analyzing review comments randomly collected from the three projects.
The following steps were executed to define the taxonomy.
{\color{red}
\begin{enumerate}[1)]
\item First, we randomly selected 900 review comments (300 for each project).

\item Two participants independently conducted card sorts on 70\% of the comments.
They first labeled each selected comment with a descriptive message.
The labeled comments were then divided into different groups
according to their descriptive messages.
Through a rigorous analysis of the existing literature
and their own experience of working with and analyzing
the pull-based model over the last two years,
they identified their respective draft taxonomies.

\item They then met and discussed their draft taxonomies.
Once they agreed with each other, the initial taxonomy was constructed.

\item Another 10 participants were invited to help verify and improve the taxonomy
by examining another 10\% of the comments.
After some refinement of the category definitions and adjustment of the taxonomy hierarchy,
the final taxonomy was determined.

\item Finally, the two authors independently classified the remaining 20\% of the comments into the taxonomy.
We used one of the most popular reliability measurements, percent agreement, to calculate their reliability.
We found that they agreed on 97.8\% (176/180) of the comment labels.
\end{enumerate}
}
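The percent agreement reported in the last step is simply the share of items on which both raters assigned the same category (176 of 180 here). A minimal sketch:

```python
def percent_agreement(labels_a, labels_b):
    """Fraction of items on which two raters assigned the same category."""
    assert len(labels_a) == len(labels_b)
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return same / len(labels_a)
```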

In the above process, the two main participants (\ie the two authors of this paper)
have five and eight years of experience in software development, respectively.
Both of them are engaged in academic research on empirical software engineering and social coding.
In particular, the second author has four years of research experience on the pull-based development model.
The other 10 participants also work in our research team
and include postgraduates, Ph.D. candidates, and project developers.

\subsection{Manual Labeling}
For each project, we randomly sampled 200 pull-requests
(each of which has more than 0 and fewer than 30 comments)
per year from 2013 to 2015.
Overall, 1,800 distinct pull-requests and 5,645 review comments were sampled.

According to the defined taxonomy,
we manually classified the sampled review comments.
We built an online labeling platform (OLP),
which was deployed on the public network and offers a web-based interface.
The OLP helps reduce the extra burden on the labeling participants
and ensures the quality of the manual labeling results.
\begin{figure}[ht]
\centering
\includegraphics[width=8cm]{resources/Fig-4.png}
\caption{The main interface of OLP}
\label{fig:olp}
\end{figure}

As shown in Figure~\ref{fig:olp},
the main interface of the OLP contains three sections,
which together present a pull-request and its review comments.

\emph{Section 1}:
The title, description, and submitter's user name of a pull-request
are displayed in this section.
The hyperlink of the pull-request on GitHub is also shown,
which makes it convenient to jump to the original webpage
for a more detailed view and inspection if necessary.

\emph{Section 2}:
All the review comments of a pull-request are listed in this section,
ordered by creation time.
The user name,
comment type (inline comment or issue comment),
and creation time of a review comment appear above the comment content.

\emph{Section 3}:
Beside each comment,
there are candidate labels corresponding to our taxonomy,
which can be selected by clicking the checkbox next to the label.
The use of checkboxes means that multiple labels can be assigned to a comment.
Labels that are not in our list can also be reported
by writing free text in the text field named `other'.

We only label each comment with Level-2 categories;
Level-1 categories are derived automatically from the taxonomy hierarchy.

\subsection{Two-Stage Hybrid Classification}
Figure~\ref{fig:TSHC} presents an overview of \TSHC.
TSHC consists of two stages that utilize the comment text
and other information extracted from comments and pull-requests, respectively.

\begin{figure*}[ht]
\centering
\includegraphics[width=16cm]{resources/Fig-5.png}
\caption{Overview of \TSHC}
\label{fig:TSHC}
\end{figure*}

\underline{\textbf{Stage One:} }
The classification in this stage mainly utilizes the text of each review comment
and produces a vector of possibility (VP),
in which each item is the possibility that
the review comment belongs to the corresponding category.

Preprocessing is necessary before formal comment classification~\cite{Antoniol:2008}.
Reviewers tend to reference source code, hyperlinks, or statements of others
in a review comment to clearly express their opinion, prove their point, or reply to other people.
These behaviors promote the review process
but pose a great challenge to comment classification.
Words in these reference texts contribute minimally to the classification
and may even introduce interference.
Hence, we transform them into single-word indicators to
reduce vocabulary interference
while preserving the reference information.
Specifically, the source code, hyperlinks, and statements of others are replaced with
\textit{`cmmcode'}, \textit{`cmmlink'}, and \textit{`cmmtalk'}, respectively.
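This replacement step can be sketched with regular expressions. A minimal illustration, assuming GitHub-flavored markdown (code in backtick spans or fences, others' statements as `>`-quoted lines; the exact patterns used in the paper are not specified):

```python
import re

def preprocess(comment: str) -> str:
    """Collapse code, links, and quoted statements into single-word indicators."""
    # Fenced and inline code spans -> 'cmmcode'
    comment = re.sub(r"```.*?```", " cmmcode ", comment, flags=re.DOTALL)
    comment = re.sub(r"`[^`]+`", " cmmcode ", comment)
    # Hyperlinks -> 'cmmlink'
    comment = re.sub(r"https?://\S+", " cmmlink ", comment)
    # Markdown-quoted statements of others -> 'cmmtalk'
    comment = re.sub(r"^>.*$", " cmmtalk ", comment, flags=re.MULTILINE)
    return re.sub(r"\s+", " ", comment).strip()
```

For example, a comment quoting a reviewer, linking to documentation, and embedding a snippet is reduced to a short text in which only the indicator tokens remain.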

After preprocessing, a review comment is classified by a rule-based technique,
which uses inspection rules to match the comment text for each category.
Several phrases often appear in the comments of a specific category.
For instance, \textit{``lgtm''} (an abbreviation of \textit{``looks good to me''})
is usually used by reviewers to express their satisfaction with a pull-request.
Such phrases are discriminating and helpful
in recognizing the category of a review comment.
Therefore, we establish inspection rules for each category,
which are a set of regular expressions abstracted from discriminating phrases.
A category label is assigned to a review comment,
and the corresponding item in the VP is set to 1,
if one of the category's inspection rules matches the comment text.
{\color{red}
In our method, each category has 7.5 inspection rules on average;
the following therefore shows only a part of the inspection rules
and the corresponding matched comments.
}

\begin{itemize}

\item From \textit{Style Checking (L2-1)}
\begin{itemize}
\item \footnotesize{\texttt{(blank$|$extra) (line$|$space)s?}}
\item \textit{``Please add a new \textbf{blank line} after the include''}
\end{itemize}

\item From \textit{Value Affirming (L2-4)}
\begin{itemize}
\item \footnotesize{\texttt{(looks$|$seem)s? (good$|$great$|$useful)}}
\item \textit{``\textbf{Looks good} to me, the grammar is definitely better.''}
\end{itemize}

\item From \textit{Reviewer Assigning (L2-8)}
\begin{itemize}
\item \footnotesize{\texttt{(cc:?$|$wdyt$|$defer to$|\backslash$br$\backslash$?) (@$\backslash$w+ *)+}}
\item \textit{``\textbf{/cc @fxn} can you take a look please?''}
\end{itemize}

\item From \textit{Politely Responding (L2-10)}
\begin{itemize}
\item \footnotesize{\texttt{thanks?$|$thxs?$|$:($\backslash$w+)?heart:}}
\item \textit{``@xxx looks good. \textbf{Thank} you for your contribution \textbf{:yellow\_heart:}''}
\end{itemize}

\end{itemize}
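A minimal sketch of how such inspection rules set VP items to 1; the rule set below is an illustrative subset adapted from the examples above, not the full rule base:

```python
import re

# Illustrative subset of inspection rules, keyed by Level-2 category.
RULES = {
    "L2-1":  [re.compile(r"(blank|extra) (line|space)s?", re.I)],
    "L2-4":  [re.compile(r"(look|seem)s? (good|great|useful)", re.I),
              re.compile(r"\blgtm\b", re.I)],
    "L2-8":  [re.compile(r"(cc:?|wdyt|defer to) (@\w+ *)+", re.I)],
    "L2-10": [re.compile(r"thanks?|thx|:(\w+)?heart:", re.I)],
}

def apply_rules(comment: str) -> dict:
    """Return a partial VP: category -> 1 when any inspection rule matches."""
    return {cat: 1 for cat, rules in RULES.items()
            if any(r.search(comment) for r in rules)}
```

Applied to the comment \textit{``Looks good to me :+1: /cc @steveklabnik @fxn''}, this marks both \textit{Value Affirming (L2-4)} and \textit{Reviewer Assigning (L2-8)}.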

The review comment is then processed by an ML-based technique.
ML-based classification is performed with scikit-learn\footnote{http://scikit-learn.org/},
particularly the support vector machine (SVM) algorithm.
The comment text is tokenized and stemmed to a root form~\cite{Porter:1997}.
We filter out punctuation from the word tokens but retain English stop words
because we assume that common words play an important role in short texts, such as review comments.
We adopt the TF-IDF (term frequency-inverse document frequency) model~\cite{Baeza:2004}
to extract a set of features from the comment text
and apply the ML algorithm to text classification.
A single review comment often addresses multiple topics.
Hence, one of the goals of \TSHC is to perform multi-label classification.
To this end, we construct a text classifier (TC) for each category
with a one-versus-all strategy.
For a review comment
that has been matched by inspection rule $R_{i}$
(supposing that $n$ categories $C_{1},C_{2}, \dots, C_{n}$ exist),
each TC ($TC_{1},TC_{2}, \dots, TC_{n}$), except for $TC_{i}$,
is applied to predict the possibility of this review comment belonging to the corresponding category.

Finally, the VP determined by the inspection rules and text classifiers is passed on to \textit{stage~two}.
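A compact sketch of stage one's ML part, assuming scikit-learn and toy training data (the comment strings and labels are illustrative; a calibrated model such as \texttt{CalibratedClassifierCV} would map the decision scores to the possibilities stored in the VP):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy multi-label training data; real labels come from the manually
# labeled comments.
comments = [
    "please add a blank line after the include",
    "looks good to me, merging",
    "thanks for your contribution",
    "can you squash the two commits into one",
]
labels = [["L2-1"], ["L2-4"], ["L2-10"], ["L2-9"]]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)

# One linear SVM per category (one-versus-all over TF-IDF features).
clf = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
clf.fit(comments, y)

# Per-category decision scores play the role of the VP items.
scores = clf.decision_function(["thanks, that looks good"])
```

The one-versus-all wrapper trains one binary SVM per Level-2 category, which is what makes multi-label output possible.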

\underline{\textbf{Stage Two:}}
The classification in this stage is performed on composed features.
Review comments are usually short texts.
Our statistics indicate that the minimum, maximum, and average numbers of words
contained in a review comment
are 1, 1527, and 32, respectively.
The text of a review comment alone thus provides limited information for determining its category.
Therefore, in addition to the VP generated in \emph{stage one},
we also consider the following features related to review comments.

\texttt{\small{Comment\_length:}}
This feature refers to the total number of characters
contained in a comment text after preprocessing.
Long comments are likely to argue about pull-request appropriateness and code correctness.

\texttt{\small{Comment\_type:}}
This binary feature indicates whether a review comment is an inline comment or an issue comment.
An inline comment tends to address solution details,
whereas an issue comment is likely to address other ``high-level'' issues,
such as pull-request decisions and project management.

\texttt{\small{Core\_team:}}
This binary feature indicates whether the comment author
is a core member of the project or an external contributor.
Core members are more likely to pay attention to
pull-request appropriateness and project {\color{red}management.}

\texttt{\small{Link\_inclusion:}}
This binary feature indicates whether a comment includes hyperlinks.
Hyperlinks are usually used to provide evidence
when someone insists on a point of view or
to offer guidelines when someone wants to help other people.

\texttt{\small{Ping\_inclusion:}}
This binary feature indicates whether a comment includes a ping activity
(in the form of ``@username'').

\begin{table*}[ht]
\scriptsize
\caption{Complete taxonomy}
\begin{tabular}{r r p{9.8cm}}
\hline
\rowcolor[HTML]{000000}
&&\\
\rowcolor[HTML]{000000}
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-1 Categories}}} &
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Level-2 Subcategories}}} &
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Description \& Example}}}\\
\hline
\multirow{6}*{\tabincell{r}{\textbf{{\color{red}Code Improvement}}\\(L1-1)}}
&\cellcolor{lightgray} &\cellcolor{lightgray}Points out extra blank lines, improper indentation, inconsistent naming conventions, etc. \\
&\cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Style Checking\\(L2-1)}} &\cellcolor{lightgray}e.g., \emph{``:scissors: this blank line''}\\
&\multirow{3}*{\tabincell{r}{{\color{red}Defect Detecting}\\(L2-2)}} & Figures out {\color{red}runtime program errors, evolvability defects, etc.}\\
& &e.g., \emph{``the default should be `false`''} and \emph{``let's extract this into a constant. No need to initialize it on every call.''}\\
&\cellcolor{lightgray} &\cellcolor{lightgray}Demands that the submitter provide test cases for the changed code, report test results, etc.\\
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Code Testing\\(L2-3)}} &\cellcolor{lightgray}e.g., \emph{``this PR will need a unit test, I'm afraid, before it can be merged.''}\\
\hline
\multirow{6}*{\tabincell{r}{\textbf{PR Decision-making}\\(L1-2)}}
&\multirow{2}*{\tabincell{r}{Value Affirming\\(L2-4)}} &Is satisfied with the pull-request and agrees to merge it\\
& &e.g., \emph{``PR looks good to me. Can you \dots''}\\
&\cellcolor{lightgray} &\cellcolor{lightgray}Refuses to merge the pull-request because of a duplicate proposal, an undesired feature, etc.\\
& \cellcolor{lightgray}\multirow{-3}*{\tabincell{r}{Solution Disagreeing\\(L2-5)}} & \cellcolor{lightgray} e.g., \emph{``I do not think this is a feature we'd like to accept. \dots''}\\
&\multirow{3}*{\tabincell{r}{Further Questioning\\(L2-6)}} & Is confused about the purpose of the pull-request and asks for more details or use cases\\
& &e.g., \emph{``Can you provide a use case for this change?''}\\
\hline
\multirow{6}*{\tabincell{r}{\textbf{Project Management}\\(L1-3)}}
&\cellcolor{lightgray} &\cellcolor{lightgray} States what type of changes a specific version is expected to merge, etc.\\
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Roadmap Managing\\(L2-7)}} &\cellcolor{lightgray}e.g., \emph{``Closing as 3-2-stable is security fixes only now''}\\
&\multirow{2}*{\tabincell{r}{Reviewer Assigning\\(L2-8)}} &Pings other reviewer(s) to review this pull-request\\
& &e.g., \emph{``/cc @fxn can you take a look please?''}\\
&\cellcolor{lightgray} &\cellcolor{lightgray}Asks the submitter to squash or rebase the commits, reformulate the commit message, etc.\\
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Convention Checking\\(L2-9)}} &\cellcolor{lightgray}e.g., \emph{``\dots Can you squash the two commits into one? \dots''}\\
\hline
\multirow{4}*{\tabincell{r}{\textbf{Social Interaction}\\(L1-4)}}
&\multirow{3}*{\tabincell{r}{Politely Responding\\(L2-10)}} & Thanks other people for what they do, apologizes for mistakes, etc.\\
& &e.g., \emph{``Thank you. This feature was already proposed and it was rejected. See \#xxx''}\\
&\cellcolor{lightgray} &\cellcolor{lightgray} Agrees with others' opinions, compliments others' work, etc.\\
&\cellcolor{lightgray}\multirow{-2}*{\tabincell{r}{Contribution Encouraging\\(L2-11)}} &\cellcolor{lightgray}e.g.,\emph{``:+1: nice one @cristianbica.''}\\
\hline
\textbf{Others} &\multicolumn{2}{l}{Short sentences without clear or exact meaning, like \textit{``@xxxx will do''} and \textit{``The same here :)''}} \\
\bottomrule
\multicolumn{3}{l}{\emph{Note: a word beginning and ending with a colon, like ``:scissors:'', is markdown grammar for an emoji on GitHub}}
\end{tabular}
\label{tab:taxonomy}
\end{table*}

\texttt{\small{Code\_inclusion:}}
{\color{red}
This binary feature denotes whether a comment includes code elements.
Comments related to solution details tend to contain code elements.
This assumption holds regardless of whether the code is a replica of the committed code
or a new code snippet written by the reviewer.
}

\texttt{\small{Ref\_inclusion:}}
This binary feature indicates whether a comment includes a reference to the statement of others.
Such a reference indicates a reply to someone,
which probably reflects a further suggestion to or disagreement with that person.

\texttt{\small{Sim\_pr\_title:}}
This feature refers to the similarity between the text of the comment
and the text of the pull-request title
(measured by the number of common words divided by the number of words in their union).

\texttt{\small{Sim\_pr\_desc:}}
This feature denotes the similarity between the text of the comment
and the text of the pull-request description
(measured in the same way as \texttt{\small{sim\_pr\_title}}).
Comments with high similarity to the title or description of a pull-request
are likely to discuss the solution details or the value of the pull-request.
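The similarity defined above is the Jaccard similarity over word sets; a minimal sketch, with tokenization simplified to lowercase whitespace splitting:

```python
def word_jaccard(text_a: str, text_b: str) -> float:
    """Common words divided by union words, on lowercased token sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a | b:
        return 0.0
    return len(a & b) / len(a | b)
```

For instance, \texttt{word\_jaccard("fix blank line", "blank line removed")} shares 2 words out of a 4-word union, giving 0.5.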

Together with the VP passed from \emph{stage one},
these features are composed to form a new feature vector to be processed by the prediction models.
Similar to \emph{stage one}, \emph{stage two} provides a binary prediction model for each category.
The prediction models generate a new VP
that represents how likely a comment is to fall into each specific category.
Iterating over the VP, a comment is labeled with class $C_{i}$
if the $i$-th vector item is greater than 0.5.
If all the items of the VP are less than 0.5,
the class label corresponding to the largest possibility is assigned to the comment.
Finally, each comment processed by \TSHC is marked with at least one class label.
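The final label-assignment step can be sketched as follows (the category names are illustrative):

```python
def assign_labels(vp, categories, threshold=0.5):
    """Label every category whose possibility exceeds the threshold;
    fall back to the single most likely category so that each comment
    receives at least one label."""
    labels = [c for c, p in zip(categories, vp) if p > threshold]
    if not labels:
        labels = [categories[max(range(len(vp)), key=vp.__getitem__)]]
    return labels
```

The fallback branch is what guarantees that every comment leaves \TSHC with at least one class label.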

\subsection{Quantitative Analysis}
{\color{red}
To identify typical review patterns, we used our trained hybrid classification model
to classify a large set of review comments and then examined the review patterns
based on this automatically classified dataset.
In total, we classified 147,367 comments of 27,339 pull-requests.
Based on this dataset, we conducted a preliminary quantitative study.
We first explored the distribution of review comments over the categories
and then reported some of the interesting findings derived from the distribution.
}

\section{Results}
\label{sec:result}

\subsection{RQ1: What is the taxonomy for review comments on pull-requests?}

Table \ref{tab:taxonomy} shows the complete taxonomy.
We identified four Level-1 categories,
namely \textit{Code Improvement (L1-1)},
\textit{PR Decision-making (L1-2)},
\textit{Project Management (L1-3)},
and \textit{Social Interaction (L1-4)},
each of which contains more specific Level-2 categories.
For each Level-2 category, we present its description
together with an example comment.
For brevity, we show short comments or only the key part of long comments.
We would also like to point out that
a single comment usually covers multiple topics.
The following review comments are provided as examples.

\emph{C1: ``Thanks for your contribution! This looks good to me
but maybe you could split your patch in two different commits? One for each issue.''}

\emph{C2: ``Looks good to me :+1: /cc @steveklabnik @fxn''}

In the first comment \textit{C1},
the reviewer thanks the contributor (\textit{L2-10})
and shows his satisfaction with this pull-request (\textit{L2-4}),
followed by a request for commit splitting (\textit{L2-9}).
In \textit{C2},
the reviewer expresses that he agrees with this change (\textit{L2-4}),
thinks highly of this work
(\textit{L2-11}; \textit{:+1:} is markdown grammar for the ``thumbs-up'' emoji),
and assigns other reviewers to ask for more advice (\textit{L2-8}).
With regard to reviewer assignment, reviewers tend to delegate a code review
if they are unfamiliar with the change under review.

{\color{red}
In addition, we collected 13 exceptional comments
which were labeled with the \textit{other} option.
These comments mainly fall into the following two groups:
\begin{itemize}
\item \textit{Platform-related (2)}: these comments are related to the platform, i.e., GitHub.
Sometimes, developers may not be familiar with the features of GitHub:
``I don't understand why Github display all my commits''
\item \textit{Simple reply (11)}: these comments contain only a few words
and carry no exact meaning: ``@** no.'', ``yes'', and ``done''.
\end{itemize}

On the one hand, the number of these comments is relatively small;
on the other hand, these comments do not contribute much to the code review process.
Moreover, the process of taxonomy definition in Section~\ref{td}
has made our taxonomy relatively complete and robust,
owing to its large-scale sampling, thorough discussion, and multiple rounds of validation.
Therefore, we exclude these comments from our analysis and do not adjust our taxonomy.
}

\begin{framed}
\noindent
\textbf{RQ1:} {}
\textit{From the case study, we identified a two-level taxonomy for
review comments which consists of 4 categories in Level 1
and 11 sub-categories in Level 2.}
\end{framed}

\subsection{RQ2: Is it possible to automatically classify
review comments according to the defined taxonomy?}

\begin{table*}[ht]
\scriptsize
\centering
\caption{Classification performance on Level-2 subcategories}
\begin{tabular}{l |l@{}l@{}l|l@{}l@{}l|l@{}l@{}l|l@{}l@{}l|l@{}l@{}l|l@{}l@{}l}
\hline
\rowcolor[HTML]{000000}
&
\multicolumn{6}{c}{{\color[HTML]{FFFFFF}\textbf{Rails}}} &
\multicolumn{6}{c}{{\color[HTML]{FFFFFF}\textbf{Elasticsearch}}} &
\multicolumn{6}{c}{{\color[HTML]{FFFFFF}\textbf{Angular.js}}}\\
\rowcolor[HTML]{000000}
&
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TBC}}} &
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TSHC}}} &
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TBC}}} &
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TSHC}}}&
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TBC}}} &
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{TSHC}}}\\
\rowcolor[HTML]{000000}
\multirow{-3}*{{\color[HTML]{FFFFFF}\textbf{Cat.}}}&
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. } &
{\color[HTML]{FFFFFF}F-M } &
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. } &
{\color[HTML]{FFFFFF}F-M } &
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. } &
{\color[HTML]{FFFFFF}F-M } &
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. }&
{\color[HTML]{FFFFFF}F-M } &
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. } &
{\color[HTML]{FFFFFF}F-M } &
{\color[HTML]{FFFFFF}Prec. }&
{\color[HTML]{FFFFFF}Rec. }&
{\color[HTML]{FFFFFF}F-M }\\
\hline
\textbf{L2-1} &
0.75 & 0.57 & 0.66 &
0.88 & 0.67 & 0.78 &
0.75 & 0.46 & 0.61 &
0.85 & 0.58 & 0.72 &
0.54 & 0.26 & 0.40 &
0.76 & 0.78 & 0.77 \\ \hline
\textbf{L2-2} &
0.63 & 0.84 & 0.74 &
0.71 & 0.86 & 0.79 &
0.67 & 0.92 & 0.80 &
0.77 & 0.82 & 0.80 &
0.65 & 0.71 & 0.68 &
0.69 & 0.71 & 0.70 \\ \hline
\textbf{L2-3} &
0.59 & 0.46 & 0.53 &
0.74 & 0.78 & 0.76 &
0.66 & 0.55 & 0.61 &
0.60 & 0.52 & 0.56 &
0.64 & 0.63 & 0.64 &
0.75 & 0.77 & 0.76 \\ \hline
\textbf{L2-4} &
0.56 & 0.35 & 0.46 &
0.73 & 0.58 & 0.66 &
0.92 & 0.84 & 0.88 &
0.91 & 0.88 & 0.90 &
0.83 & 0.72 & 0.78 &
0.84 & 0.79 & 0.82 \\ \hline
\textbf{L2-5} &
0.50 & 0.26 & 0.38 &
0.63 & 0.43 & 0.53 &
0.65 & 0.36 & 0.51 &
0.80 & 0.67 & 0.74 &
0.63 & 0.48 & 0.56 &
0.61 & 0.51 & 0.56 \\ \hline
\textbf{L2-6} &
0.47 & 0.21 & 0.34 &
0.60 & 0.36 & 0.48 &
0.33 & 0.11 & 0.22 &
0.38 & 0.26 & 0.32 &
0.45 & 0.51 & 0.48 &
0.47 & 0.49 & 0.48 \\ \hline
\textbf{L2-7} &
0.79 & 0.66 & 0.73 &
0.79 & 0.72 & 0.76 &
0.75 & 0.39 & 0.57 &
0.75 & 0.68 & 0.72 &
0.49 & 0.31 & 0.40 &
0.83 & 0.64 & 0.74 \\ \hline
\textbf{L2-8} &
0.83 & 0.77 & 0.80 &
0.98 & 0.78 & 0.88 &
0.57 & 0.35 & 0.46 &
0.88 & 0.90 & 0.89 &
0.35 & 0.16 & 0.26 &
0.96 & 0.67 & 0.82 \\ \hline
\textbf{L2-9} &
0.83 & 0.76 & 0.80 &
0.90 & 0.81 & 0.86 &
0.94 & 0.57 & 0.76 &
0.96 & 0.76 & 0.86 &
0.85 & 0.76 & 0.81 &
0.83 & 0.82 & 0.83 \\ \hline
\textbf{L2-10} &
0.98 & 0.93 & 0.96 &
0.99 & 0.98 & 0.99 &
0.91 & 0.84 & 0.88 &
0.93 & 0.95 & 0.94 &
0.90 & 0.80 & 0.85 &
0.94 & 0.93 & 0.94 \\ \hline
\textbf{L2-11} &
0.80 & 0.66 & 0.73 &
0.89 & 0.87 & 0.88 &
0.87 & 0.67 & 0.77 &
0.92 & 0.91 & 0.92 &
0.93 & 0.62 & 0.78 &
0.98 & 0.92 & 0.95 \\ \hline
\textbf{AVG} &
0.71 & 0.67 & \textbf{0.69} &
0.80 & 0.76 & \textbf{0.78} &
0.79 & 0.75 & \textbf{0.77} &
0.83 & 0.81 & \textbf{0.82} &
0.71 & 0.64 & \textbf{0.67} &
0.76 & 0.73 & \textbf{0.75} \\
\bottomrule
\end{tabular}
\label{tab:L2}
\end{table*}

In the evaluation, we design a text-based classifier (TBC) as a comparison baseline,
{\color{red}
which only uses the text content of review comments and
does not apply any inspection rule or composed feature.}
TBC uses the same preprocessing techniques and SVM models as \TSHC.
Classification performance is evaluated through 10-fold cross-validation,
namely, splitting the review comments into 10 sets,
of which nine are used to train the classifiers
and the remaining one is used to test the performance.
The process is repeated 10 times.
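A minimal sketch of such a 10-fold split (index bookkeeping only; the actual evaluation trains and tests the classifiers on each split):

```python
def k_fold_indices(n_items, k=10):
    """Partition item indices into k disjoint folds; each fold serves
    once as the test set while the rest form the training set."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for test in folds:
        train = [i for i in range(n_items) if i not in set(test)]
        splits.append((train, test))
    return splits
```

In practice a library routine (e.g. scikit-learn's \texttt{KFold}) would be used, typically with shuffling; the sketch only shows the train/test bookkeeping.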

{\color{red}
Moreover, the evaluation metrics used in this paper are precision, recall, and
F-measure, which are computed by the following formulas, respectively.

\begin{small}
\begin{equation}
Prec(L2\textrm{-}i) = \frac{N_{CC}(L2\textrm{-}i)}{N_{TSHC}(L2\textrm{-}i)}
\label{equ:pre}
\end{equation}

\begin{equation}
Rec(L2\textrm{-}i) = \frac{N_{CC}(L2\textrm{-}i)}{N_{total}(L2\textrm{-}i)}
\label{equ:rec}
\end{equation}

\begin{equation}
F\textrm{-}M(L2\textrm{-}i) =
\frac{2*Prec(L2\textrm{-}i)*Rec(L2\textrm{-}i)}{Prec(L2\textrm{-}i)+Rec(L2\textrm{-}i)}
\label{equ:fm}
\end{equation}
\end{small}

In the formulas,
$N_{total}(L2\textrm{-}i)$ is the total number of comments of category \textit{L2-i} in the test dataset,
$N_{TSHC}(L2\textrm{-}i)$ is the number of comments that are classified as \textit{L2-i} by \TSHC,
and $N_{CC}(L2\textrm{-}i)$ is the number of comments that have been correctly classified as \textit{L2-i}.
}
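These definitions translate directly into code; a minimal sketch using the three counts defined above:

```python
def prf(n_cc, n_tshc, n_total):
    """Precision, recall, and F-measure for one category, from:
    n_cc    -- comments correctly classified into the category,
    n_tshc  -- comments the classifier assigned to the category,
    n_total -- comments truly in the category."""
    prec = n_cc / n_tshc if n_tshc else 0.0
    rec = n_cc / n_total if n_total else 0.0
    fm = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, fm
```

The guards against empty denominators handle categories that receive no predictions or have no instances in a test fold.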

Table~\ref{tab:L2} shows the precision, recall, and F-measure
provided by different approaches for Level-2 categories.
Our approach achieves the highest precision, recall, and F-measure scores
in all categories with only a few exceptions.

To provide an overall performance evaluation,
we use the average F-measure~\cite{Zhou:2014} over all categories,
weighted by the proportion of instances in each category.
Equation~\ref{equ:wav} describes the formula used to derive the average F-measure.
In the equation, the average F-measure is denoted as $F_{avg}$,
the F-measure of category \textit{L2-i} as $f_{i}$,
and the number of instances of category \textit{L2-i} as $n_{i}$.

\begin{equation}
F_{avg} = \frac{\sum_{i=1}^{11}n_{i} * f_{i}}{\sum_{i=1}^{11}n_{i}}
\label{equ:wav}
\end{equation}
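Equation~\ref{equ:wav} amounts to a weighted mean; a minimal sketch:

```python
def weighted_avg_f(f_measures, counts):
    """Average F-measure weighted by the number of instances per category."""
    assert len(f_measures) == len(counts)
    total = sum(counts)
    return sum(f * n for f, n in zip(f_measures, counts)) / total
```

Weighting by instance counts prevents small categories from dominating the aggregate score.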

The table indicates that our approach consistently outperforms
the baseline across the three projects.
Compared with the baseline,
\TSHC improves the weighted average F-measure
by 9.2 percentage points (from 0.688 to 0.780) in Rails,
5.3 percentage points (from 0.767 to 0.820) in Elasticsearch,
and 7.2 percentage points (from 0.675 to 0.747) in Angular.js.
These results indicate that our approach is highly applicable in practice.

{\color{red}
Furthermore, we would like to explore
whether the rule-based technique or the composed features contribute more
to the performance improvement of \TSHC over the baseline TBC.
Therefore, we design two additional comparative settings.

\begin{itemize}
\item TBC\_R (TBC + rules): in this setting, we combine TBC with the rule-based technique.

\item TBC\_CF (TBC + composed features): in this setting, we combine TBC with the composed features of stage two.
\end{itemize}
}

\begin{table*}[ht]
|
||||||
|
\scriptsize
|
||||||
|
\centering
|
||||||
|
\caption{Classification performance of different methods}
|
||||||
|
\begin{tabular}{l |c c c| c c c| c c c}
|
||||||
|
\hline
|
||||||
|
\rowcolor[HTML]{000000}
|
||||||
|
&
|
||||||
|
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{Rails}}} &
|
||||||
|
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{Elasticsearch}}} &
|
||||||
|
\multicolumn{3}{c}{{\color[HTML]{FFFFFF}\textbf{Angular.js}}}\\
|
||||||
|
% \multicolumn{6}{c}{{\color[HTML]{FFFFFF}\textbf{Ave.}}}\\
|
||||||
|
|
||||||
|
|
||||||
|
\rowcolor[HTML]{000000}
|
||||||
|
\multirow{-2}*{{\color[HTML]{FFFFFF}\textbf{Methods}}}&
|
||||||
|
|
||||||
|
{\color[HTML]{FFFFFF}TBC }&
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_R } &
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_CF } &
|
||||||
|
|
||||||
|
{\color[HTML]{FFFFFF}TBC }&
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_R } &
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_CF } &
|
||||||
|
|
||||||
|
{\color[HTML]{FFFFFF}TBC }&
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_R } &
|
||||||
|
{\color[HTML]{FFFFFF}TBC\_CF } \\
|
||||||
|
\hline
|
||||||
|
|
||||||
|
$F_{avg}$ &
|
||||||
|
0.69 & 0.74 & 0.71 &
|
||||||
|
0.77 & 0.81 & 0.78 &
|
||||||
|
0.67 & 0.73 & 0.69 \\
|
||||||
|
|
||||||
|
Improvement (\%)&
|
||||||
|
- & 7.24 & 2.90 &
|
||||||
|
- & 5.19 & 1.30 &
|
||||||
|
- & 8.96 & 3.00 \\
|
||||||
|
|
||||||
|
\bottomrule
|
||||||
|
\end{tabular}
|
||||||
|
\label{tab:increase}
|
||||||
|
\end{table*}
{\color{red}
Table~\ref{tab:increase} presents the experiment results.
Overall, TBC\_R outperforms TBC\_CF,
which means the rule-based technique contributes more to the increased performance
than the composed features in stage 2.
Compared with TBC,
TBC\_R achieves an improvement of 7.24\%, 5.19\%, and 8.96\% for Rails, Elasticsearch, and Angular.js respectively,
while TBC\_CF achieves 2.90\%, 1.30\%, and 3.00\%.}
We further study the review comments miscategorized by \TSHC.
An example is
\textit{``While karma does globally install with a bunch of plugins,
we do need the npm install because without that you don't get the karma-ng-scenario karma plugin.''}
\TSHC classifies it as belonging to \textit{Error Detecting (L2-2)},
but it is actually a \textit{Solution Disagreeing (L2-5)} comment.
The reason for this incorrect prediction is twofold,
namely the lack of explicit discriminating terms
and the overly specific expression of rejection.
The inspection rule of \textit{Solution Disagreeing (L2-5)} fails to match
because of the lack of corresponding matching patterns.
ML classifiers tend to categorize the comment into \textit{Error Detecting (L2-2)}
because its overly specific expression of opinion makes it
look more like a low-level comment about code correctness
than a high-level one about the pull-request decision.
We attempt to solve this problem by adding factors
(\eg \texttt{\small{Comment\_type}} and \texttt{\small{Code\_inclusion}}) in \textit{stage 2} of \TSHC,
which can help reveal whether a review comment is talking about
the pull-request as a whole or about the solution details.
Although the additional information improves the classification performance to some extent,
it is not sufficient to differentiate the two types of comments.
We plan to address the issue by extending the manually labeled {\color{red}dataset}
and introducing sentiment analysis.
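Appending such factors to the textual features can be sketched as follows; the encodings (inline vs.\ general comment, presence of a code snippet) are our assumptions for illustration, not the exact feature engineering of \TSHC:

```python
def compose_features(text_vector, comment_type, code_inclusion):
    """Append comment-level factors to the textual feature vector.
    comment_type: 1 = inline comment, 0 = general comment (assumption);
    code_inclusion: 1 if the comment embeds a code snippet, else 0."""
    return list(text_vector) + [comment_type, code_inclusion]

# An inline comment that quotes code: two extra dimensions are appended
features = compose_features([0.1, 0.0, 0.7], comment_type=1, code_inclusion=1)
print(features)  # [0.1, 0.0, 0.7, 1, 1]
```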
\begin{framed}
\noindent
\textbf{RQ2:}
\textit{
Compared with the baseline method,
our \TSHC approach achieves higher performance
in terms of the weighted average F-measure, namely
0.78 in Rails, 0.82 in Elasticsearch,
and 0.75 in Angular.js,
which indicates that our method is applicable
to practical automatic classification.}
\end{framed}

% \section{Preliminary Quantitative Analysis}
% \label{sec:ana}
\subsection{RQ3: What are the typical review patterns among the reviewers' discussions?}
We now discuss some of our preliminary quantitative analysis.
We used \TSHC to automatically {\color{red}classify} all the review comments,
based on which we explored the distribution of review comments in each category.
{\color{red}
We also explored the comment distribution on the manually labeled dataset
and found it similar to that on the automatically classified dataset.
The difference between the two datasets is that
the manually labeled dataset has more comments of L2-4 and fewer comments of L2-2
than the automatically labeled dataset.
However, the difference is slight and does not affect the overall shape of the distribution.
Because our preliminary quantitative study is mainly
based on the comment distribution in each category,
the research findings derived from the two datasets are similar.
As a result, we only report our findings on the automatically classified dataset,
which has the larger data size and therefore makes our findings more convincing.}

Figure \ref{fig:stas_plot} shows the percentage of the Level-2 categories.
As a general view, the comment distributions over the three projects
exhibit similar patterns, in which most comments are about
\textit{Defect Detecting (L2-2)},
\textit{Value Affirming (L2-4)},
and \textit{Politely Responding (L2-10)}.
%%% 1.2 Salient point
Notably, \textit{Defect Detecting (L2-2)} occupies the first place
(Rails: 42.5\%, Elasticsearch: 50.6\%, Angular.js: 33.1\%).
This is consistent with previous studies~\cite{Gousios:2014b,Bacchelli:2013},
which stated that the main purpose of code review is to find defects.
Moreover, the development of open source projects relies on
the voluntary collaborative contributions~\cite{Shah:2006} of thousands of developers.
A harmonious and friendly community environment helps
promote contributors' enthusiasm,
which is critical to the continuous evolution of a project.
As a result, a majority of reviewers never hesitate
to react positively to others' contributions (L2-10, L2-11)
and to express their satisfaction with high-quality pull-requests (L2-4).

\begin{figure}[ht]
\centering
\includegraphics[width=9cm]{resources/stas_plot.png}
\caption{The percentage of the Level-2 categories}
% \caption{The distribution of review comments on Level-2 categories}
\label{fig:stas_plot}
\end{figure}

In addition, we also found some interesting phenomena.

\textbf{Style checking \& Code testing.}
It is surprising that although inspecting code style (L2-1)
and testing code (L2-3) are less effort-consuming behaviors,
which do not require deep understanding of the code changes,
comments of these two categories are {\color{red}relatively} fewer than others.
This is somewhat inconsistent with the study
conducted by Bacchelli \etal~\cite{Bacchelli:2013} on Microsoft teams,
in which they found that \textit{code style} (called \textit{code improvement} in their study)
is the most frequent outcome of code review.
The reason for the inconsistency, in our opinion,
is the difference in development environments.
In GitHub, contributors work in a transparent environment
and their activities are visible to everyone~\cite{Tsay:2014b,Gousios:2016}.
Hence, in order to build and {\color{red}maintain} good reputations,
contributors usually go through their code by themselves before submission
to avoid making apparent mistakes~\cite{Gousios:2016}.

% 2-3: some pull-requests passed the tests but still contain code defects
%%% This shows some defects require manual intervention; automated tools alone are not reliable
\textbf{Error detecting \& Code testing.}
We were also curious about why a large percentage of comments
focused on detecting defects (L2-2)
even though most of the pull-requests did not raise testing issues (L2-3).
After re-examining those pull-requests
which received comments of L2-2 but did not receive comments of L2-3,
we found two main factors resulting in this phenomenon.

\begin{table}[ht]
\setlength\extrarowheight{-1.5pt}
\scriptsize
\centering
\caption{Examples of Error Detecting}
\begin{tabular}{p{8cm}}
%%%%%%%
\bottomrule
\rowcolor{gray}\textbf{Example 1} (inline comment)\\ \hline
\rowcolor{lightgray}
Changed code: \\
$-$ \color{red} \texttt{assert r.save, ``First save should be successful''}\\
+ \color[HTML]{008B00} \texttt{assert r.valid?, ``First validation should be}\\
\quad \color[HTML]{008B00}successful''\\
+ \color[HTML]{008B00} \texttt{r.save} \\
\rowcolor{lightgray}
Review comment: \\
\textit{Just this change is not necessary, it'll run validations twice,
you can use r.save in the assert call} \\
%%%%%%%%

%%%%%%%
\bottomrule
\rowcolor{gray}\textbf{Example 2} (inline comment) \\ \hline
\rowcolor{lightgray}
Changed code: \\
$-$ \color{red} \texttt{``\#{schema\_search\_path}-\#{sql}''}\\
+ \color[HTML]{008B00} \texttt{``\#{schema\_search\_path}-\#{sql}''.to\_sym}\\
\rowcolor{lightgray}
Review comment: \\
\textit{This will introduce a memory leak on systems
where there are a lot of possible SQLs.
Remember symbols are not removed from memory
from the garbage collector } \\
%%%%%%%%

%%%%%%%
\bottomrule
\rowcolor{gray}\textbf{Example 3} (general comment)\\ \hline
\rowcolor{lightgray}
Pull-request description: \\
after\_remove callbacks only run if record was removed\\
\rowcolor{lightgray}
Review comment: \\
\textit{This will change the existing behavior and
people may be relying on it so I don't think it is safe
to change it like this. If we want to change this
we need a better upgrade path, like a global option
to let people to switch the behavior} \\
%%%%%%%%

%%%%%%%
\bottomrule
\rowcolor{gray}\textbf{Example 4} (inline comment) \\ \hline
\rowcolor{lightgray}
Changed code: \\
+ \color[HTML]{008B00} \texttt{class TableList}\\
+ \color[HTML]{008B00} \texttt{\quad delegate :klass, :to $\Rightarrow$ :@li, :allow\_nil $\Rightarrow$ true}\\
+ \color[HTML]{008B00} \texttt{\quad def initialize(li)}\\
+ \color[HTML]{008B00} \texttt{\quad \quad @li = li}\\
+ \color[HTML]{008B00} \texttt{\quad end}\\
+ \color[HTML]{008B00} \texttt{end}\\
\rowcolor{lightgray}
Review comment: \\
\textit{This new class isn't needed -
we can reuse an existing one} \\
\bottomrule
%%%%%%%%

\end{tabular}
\label{Example:ed}
\end{table}

One factor is contributors' unfamiliarity with
the programming language and the APIs (\eg Example 1 and Example 2 in Table~\ref{Example:ed}),
and the other is the lack of development experience of some contributors (\eg Example 3 and Example 4 in Table~\ref{Example:ed}).
% The produced code can run, but its quality is low
These factors tend to produce runnable but ``low-quality'' code,
which is less elegant or
even breaks compatibility and introduces potential risks.
This reality justifies the necessity of manual code review
even though an increasing number of automatic tools are coming into being.

%%% 9 Projects have specific contribution rules, but some users often violate them
\begin{figure*}[ht]
\centering
\includegraphics[width=18cm]{resources/prds.png}
\caption{The cumulative distribution of pull-requests.
(a) Rails (b) Elasticsearch (c) Angular.js}
\label{fig:prd}
\end{figure*}

% \begin{figure*}[ht]
% \centering
% \includegraphics[width=18cm]{resources/eads.png}
% \caption{The distribution of developers on different contribution frequency.
% (a) Rails (b) Elasticsearch (c) Angular.js}
% \label{fig:ead}
% \end{figure*}

\textbf{Convention checking.}
Project-specific development {\color{red}conventions}
are sometimes ignored by contributors, as can be seen from the following examples.

\textbf{Example\_1:}
``\textit{
Thanks! Could you please add `[ci skip]' to your commit message
to avoid Travis to trigger a build ?
}''

\textbf{Example\_2:}
``\textit{
@xxx PR looks good to me.
Can you squash the two commits into one.
Fix and test should go there in a commit.
Also please add a changelog. Thanks.
}''

% Proportions:
% rails [438, 2278] 2716 83.9
% elasticsearch [394, 566] 960 60.1
% angularjs [578, 2416] 2994 80.7
To better understand this problem,
we did a statistical analysis to explore whether this kind of comments (L2-9)
is distributed differently over the roles of contributors
(core members or external contributors).
% We found that pull-requests submitted by external contributors
% are more likely to receive such comments,
% which accounts for \hl{80\%} of the total number.
We found that pull-requests submitted by external contributors
receive more comments of L2-9 than those submitted by core members,
accounting for 83.9\%, 60.1\%, and 80.7\% of the total number in
Rails, Elasticsearch, and Angular.js respectively.
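The statistic itself is a simple share computation; a minimal sketch, using the raw Rails and Angular.js tallies recorded in the source comments (core vs.\ external L2-9 comment counts):

```python
def external_share(core_count, external_count):
    """Percentage of L2-9 comments attached to pull-requests
    submitted by external contributors."""
    return 100.0 * external_count / (core_count + external_count)

# Raw tallies from the source comments: Rails has 438 L2-9 comments on
# core-member pull-requests and 2278 on external ones
print(round(external_share(438, 2278), 1))  # 83.9
```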

% 90% of users contributed fewer than 5 times
% 70% of the problematic pull-requests appear within the first 5 contributions
% contribution counts (<=5): 89.6, 94.4, 96.8
% proportion by contribution index (<=5): 68.4, 75.1, 91.9

Furthermore, we explored how an external developer's contribution experience
affects the possibility of his or her pull-requests receiving comments of L2-9.
Figure~\ref{fig:prd} shows what percentage of the pull-requests
that receive comments of L2-9
are created and accumulated during developers' contribution histories.
It is clear that a considerable proportion of such pull-requests
(68.4\% in Rails, 75.1\% in Elasticsearch, and 91.9\% in Angular.js)
are generated in the first five submissions.
These figures illustrate that
external contributors are more likely to submit pull-requests
that break project conventions in their early contributions.
% In addition, from Figure~\ref{fig:ead}, we can find that most external contributors
% don't contribute continuously and the majority of them
% (89.6\% in Rails, 94.4\% in Elasticsearch, and 96.8\% in Angular.js)
% contribute less than 5 times
% despite the existence of some hyperactive contributors
% who submitted more than hundreds of pull-requests.
Therefore, it is necessary to help external contributors
gain a good understanding of project conventions at a very early stage.
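The cumulative proportion underlying Figure~\ref{fig:prd} can be sketched as follows; the submission ordinals below are illustrative, not the study's data:

```python
def cumulative_share(submission_indices, cutoff):
    """Fraction of convention-breaking (L2-9-flagged) pull-requests
    created at or before the contributor's `cutoff`-th submission."""
    total = len(submission_indices)
    within = sum(1 for idx in submission_indices if idx <= cutoff)
    return within / total

# Illustrative: each number is the ordinal of the submission, within the
# contributor's history, at which a flagged pull-request appeared
indices = [1, 1, 2, 3, 5, 8, 12, 2, 4, 30]
print(cumulative_share(indices, 5))  # 0.7: 7 of 10 flagged PRs within the first five
```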

In fact, most project management teams have provided contributors
with specific guidelines\footnote{
\eg http://edgeguides.rubyonrails.org/contributing\_to\_ruby\_on\_rails.html
}.
It seems, however, that not everyone is willing to
go through the tedious specification before contributing.
This may inspire GitHub to improve its collaborative mechanism
and offer more efficient development tools.
For example, it would be better to briefly display
a clear list of development {\color{red}conventions},
edited by the management team of a project,
before a developer creates a pull-request to this project.
Another alternative solution for GitHub is to provide automatic reviewing tools
that can be configured with predefined convention rules
and triggered as soon as a new pull-request is submitted.

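A configurable convention checker of this kind could be as simple as the following sketch; the rules (a `[ci skip]` hint, a squash reminder, a changelog check) are taken from the examples above, and the pull-request fields are hypothetical, not an actual GitHub API:

```python
def check_conventions(pr):
    """Run predefined convention rules against a newly submitted
    pull-request (a plain dict here) and collect reminders."""
    reminders = []
    if pr["changes_docs_only"] and "[ci skip]" not in pr["commit_message"]:
        reminders.append("Add '[ci skip]' to the commit message to avoid a CI build.")
    if pr["commit_count"] > 1:
        reminders.append("Please squash your commits into one.")
    if not pr["has_changelog_entry"]:
        reminders.append("Please add a changelog entry.")
    return reminders

pr = {"changes_docs_only": True, "commit_message": "Fix typo in guide",
      "commit_count": 2, "has_changelog_entry": False}
for reminder in check_conventions(pr):
    print(reminder)  # all three rules fire for this hypothetical pull-request
```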
\begin{framed}
\noindent
\textbf{RQ3:}
\textit{
Most comments discuss code correction and social interaction.
External contributors are more likely to
break project conventions in their early contributions,
and their pull-requests might contain potential issues,
even though they have passed the tests.
}
\end{framed}

\section{THREATS TO VALIDITY}
\label{sec:Threats}
In this section, we discuss the threats to construct validity, internal validity,
and external validity which may affect the results of our study.

\noindent\textbf{Construct validity:}

The first threat involves the taxonomy definition,
since some of the categories could be overlapping or missing.
To alleviate this problem, we, together with other experienced developers,
randomly selected review comments and
classified them according to our taxonomy definition.
The verification result showed that our taxonomy is complete
and does not miss any significant categories.

Secondly, manually labeling thousands of review comments is a
repetitive, time-consuming, and tedious task.
To obtain a reliable labeled set, the first author of the paper was {\color{red}assigned}
to do this job, and we tried our best to provide him with a pleasant working environment.
Moreover, if a long time had passed since the last round of labeling work,
the first author would revisit the taxonomy and go through some labeled comments
to ensure an accurate and consistent understanding
before the next round of labeling work proceeded.
{\color{red}
In addition, it is possible that incorrect classifications by \TSHC may affect our findings,
even though our model achieved a high precision of 76\%-83\%.
However, our preliminary quantitative study is mainly
based on the comment distribution in each category;
wrong classification is not likely to alter the distribution significantly,
and therefore our findings are not easily affected by some extent of incorrect classification.
}


\noindent\textbf{Internal validity:}

There are many machine learning methods
that can be used to solve the classification problem,
and the choice of ML algorithm has a direct impact on the classification performance.
We compared several ML algorithms, including a linear regression classifier,
an AdaBoost classifier, a random forest classifier, and SVM, and
found that SVM performed better than the others at classifying review comments.
Furthermore, we also did some necessary parameter optimization.
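A sketch of such a comparison using scikit-learn on synthetic data (we read ``linear regression classifier'' as logistic regression; the actual features, data, and parameter grid of the study differ):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import LinearSVC

# Synthetic stand-in for the TF-IDF feature matrix of labeled comments
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(),
    "random forest": RandomForestClassifier(random_state=0),
    "linear SVM": LinearSVC(),
}
for name, clf in candidates.items():
    # Weighted F-measure matches the paper's evaluation metric
    score = cross_val_score(clf, X, y, cv=5, scoring="f1_weighted").mean()
    print(f"{name}: {score:.3f}")

# Simple parameter optimization for the chosen classifier
search = GridSearchCV(LinearSVC(), {"C": [0.1, 1, 10]}, cv=5)
search.fit(X, y)
print("best C:", search.best_params_["C"])
```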


\noindent\textbf{External validity:}

Our findings are based on a dataset of three open source projects hosted on GitHub.
To increase the generalizability of our research,
we selected projects with different programming languages
and different application areas.
Nevertheless, our dataset is a small sample compared to
the total number of projects on GitHub.
Hence, it is not certain whether the results can be generalized to
all the other projects hosted on GitHub or
to those hosted on other platforms.

\section{Related Work}
\label{sec:RelatedW}

\subsection{Code Review}
Code review is employed by many software projects to
examine changes made by others to the source code,
find potential defects,
and ensure software quality before the changes are merged~\cite{Baysal:2015,Mcintosh:***}.
Traditional code review, proposed by Fagan~\cite{Fagan:1976}, has been performed since the 1970s.
However, its cumbersome and synchronous characteristics have hampered
its universal application in practice~\cite{Bacchelli:2013}.
With the emergence and development of version control systems and collaboration tools,
Modern Code Review (MCR)~\cite{Rigby:2011} has been adopted by many software companies and teams.
Different from formal code inspections,
MCR is a lightweight mechanism that is less time-consuming and supported by various tools.
{\color{red}
Bacchelli \etal~\cite{Bacchelli:2013} interviewed developers and
analyzed hundreds of review comments across diverse teams at Microsoft.
They found that, beyond finding defects, code review brings additional expectations such as
knowledge transfer, increased team awareness,
and the creation of alternative solutions to problems.
In recent years, collaboration tools have evolved with social media~\cite{Storey2014The,Zhu2016Effectiveness}.
In particular, GitHub integrates the function of code review into the pull-based model
and makes it more transparent and social~\cite{Storey2014The}.
%
Tsay \etal~\cite{Tsay:2014a} analyzed pull-requests
with extended discussions and explored how code contributions are evaluated through discussion on GitHub.
Gousios \etal~\cite{Gousios:2014b} investigated the challenges faced by integrators on GitHub
by conducting a survey involving 749 integrators.

}

% tsay [Influence of Social and Technical Factors for Evaluating Contribution in GitHub]
% focuses on: analyzed the association of various technical and social measures with the likelihood of contribution acceptance

% While the main motivation for code review was believed
% to be finding defects to control software quality,
% recent research has revealed that defect elimination is not the sole motivation.
% Bacchelli \etal~\cite{Bacchelli:2013} reported additional expectations,
% including knowledge transfer, increased team awareness,
% and creation of alternative solutions to problems.

\subsection{Pull-request}
Although research on pull-requests is in its early stages,
several relevant studies have been conducted.
Gousios \etal~\cite{Gousios:2014,Rigby:2014} conducted a statistical analysis
of millions of pull-requests from GitHub and analyzed the popularity of pull-requests,
the factors affecting the decision to merge or reject a pull-request,
and the time to merge a pull-request.
Tsay \etal~\cite{Tsay:2014b} examined how social
and technical information is used to evaluate pull-requests.
Yu \etal~\cite{Yu:2016} conducted a quantitative study on pull-request evaluation in the context of continuous integration.
Moreover, Yu \etal~\cite{Yu:2015} proposed an approach that combines information retrieval
and social network analysis to recommend potential reviewers.
Veen \etal~\cite{Veen:2015} presented PRioritizer,
a prototype pull-request prioritization tool,
which recommends the top pull-requests the project owner should focus on.

% !!!!!!!!!!!!!!Collaboratively Generated Data
\subsection{Classification on Free Text}
Several studies have been performed to analyze free text
generated in the software development process.
Antoniol \etal~\cite{Antoniol:2008} conducted a survey of 1,800 issues
from the bug tracking systems of three large open-source systems
and concluded that the linguistic information contained in these issues
is sufficient to distinguish ``bug'' issues from ``non-bug'' ones.
Later on, Pingclasai \etal~\cite{Pingclasai:2013} used topic modeling
to classify bug reports.
Herzig \etal~\cite{Herzig:2013} conducted a fine-grained classification of 7,000 issue reports
and analyzed the classification results of earlier research.
Zhou \etal~\cite{Zhou:2014} proposed a hybrid approach
combining text-mining and data-mining techniques
to automatically classify bug reports.

Ciurumelea \etal~\cite{Ciurumelea:2017} studied reviews of mobile apps,
proposed an approach to automatically organize reviews according
to predefined tasks (battery, performance, memory, privacy, etc.),
and recommended the related source code that should be modified.

\section{CONCLUSION \& FUTURE WORK}
\label{sec:Concl}
Code review is one of the most significant stages in pull-based development.
It ensures that only high-quality pull-requests are accepted,
based on the in-depth discussions among reviewers.
To comprehensively understand the reviewers' motivations for joining the discussions,
we conducted a case study on three popular open-source software projects
hosted on GitHub and systematically analyzed the review comments
generated in the discussions.
Our work includes the following:
1) we constructed a fine-grained two-level taxonomy for review comments
which covers the typical categories (e.g., error detecting,
reviewer assigning, contribution encouraging, etc.);
2) according to the defined taxonomy, we manually labeled over 5,600 review comments;
3) we proposed a Two-Stage Hybrid Classification (TSHC) algorithm
using rule-based and machine-learning techniques
to automatically classify review comments,
which achieved a reasonable improvement in terms of the weighted average F-measure;
and 4) we did a preliminary quantitative analysis of a large set of labeled review comments and reported some interesting findings.
The results indicate that most comments
discuss code correction and social interactions.
External contributors are more likely to
break project conventions in their early contributions,
and their pull-requests may contain potential issues
even though they have passed the tests.

%further work
Nevertheless, \TSHC performs poorly on a few Level-2 subcategories.
More work could be done in the future to improve it.
We plan to address the shortcomings of our approach
by extending the manually labeled {\color{red}dataset}
and introducing sentiment analysis.
Moreover, we will try to mine more valuable information
(\eg comment co-occurrence, emotion shift, \etc)
from the experiment results in this paper
and assist core members
in better organizing the code review process,
such as improving reviewer recommendation,
contributor assessment,
and pull-request prioritization.


% \section*{Acknowledgment}
% The research is supported by the National Natural Science Foundation of China (Grant No.61432020, 61303064, 61472430, 61502512) and National Grand R\&D Plan (Grant No. 2016YFB1000805).
% The authors would like to thank... more thanks here

@article{Yu:2015,
  title={Reviewer recommendation for pull-requests in GitHub: What can we learn from code review and bug assignment?},
  author={Yu, Yue and Wang, Huaimin and Yin, Gang and Wang, Tao},
  journal={Information and Software Technology},
  volume={74},
  pages={204--218},
  year={2016},
  publisher={Elsevier}
}

@inproceedings{yu2014reviewer,
  title={Reviewer recommender of pull-requests in GitHub},
  author={Yu, Yue and Wang, Huaimin and Yin, Gang and Ling, Charles X},
  booktitle={2014 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
  pages={609--612},
  year={2014},
  organization={IEEE}
}

@inproceedings{yu2015wait,
  title={Wait for It: Determinants of Pull Request Evaluation Latency on GitHub},
  author={Yu, Yue and Wang, Huaimin and Filkov, Vladimir and Devanbu, Premkumar and Vasilescu, Bogdan},
  booktitle={MSR},
  year={2015}
}

@inproceedings{Barr:2012,
  title={Cohesive and isolated development with branches},
  author={Barr, Earl T and Bird, Christian and Rigby, Peter C and Hindle, Abram and German, Daniel M and Devanbu, Premkumar},
  booktitle={International Conference on Fundamental Approaches to Software Engineering},
  pages={316--331},
  year={2012}
}

@inproceedings{Gousios:2014,
  title={An exploratory study of the pull-based software development model},
  author={Gousios, Georgios and Pinzger, Martin and Deursen, Arie van},
  booktitle={International Conference on Software Engineering},
  pages={345--355},
  year={2014}
}

@inproceedings{Gousios:2016,
  title={Work practices and challenges in pull-based development: the contributor's perspective},
  author={Gousios, Georgios and Storey, Margaret-Anne and Bacchelli, Alberto},
  booktitle={International Conference on Software Engineering},
  pages={285--296},
  year={2016}
}

@inproceedings{Vasilescu:B,
  title={Quality and productivity outcomes relating to continuous integration in GitHub},
  author={Vasilescu, Bogdan and Yu, Yue and Wang, Huaimin and Devanbu, Premkumar and Filkov, Vladimir},
  booktitle={Joint Meeting on Foundations of Software Engineering},
  pages={805--816},
  year={2015}
}

@inproceedings{Tsay:2014a,
  title={Let's talk about it: evaluating contributions through discussion in GitHub},
  author={Tsay, Jason and Dabbish, Laura and Herbsleb, James},
  booktitle={The ACM SIGSOFT International Symposium},
  pages={144--154},
  year={2014}
}

@inproceedings{Gousios:2014b,
  title={Work practices and challenges in pull-based development: the integrator's perspective},
  author={Gousios, Georgios and Zaidman, Andy and Storey, Margaret-Anne and Van Deursen, Arie},
  booktitle={Proceedings of the 37th International Conference on Software Engineering},
  pages={358--368},
  year={2015},
  organization={IEEE}
}

@inproceedings{gousios2016work,
  title={Work practices and challenges in pull-based development: The contributor's perspective},
  author={Gousios, Georgios and Storey, Margaret-Anne and Bacchelli, Alberto},
  booktitle={Proceedings of the 38th International Conference on Software Engineering},
  pages={285--296},
  year={2016},
  organization={ACM}
}

@inproceedings{Marlow:2013,
  title={Impression formation in online peer production: activity traces and personal profiles in GitHub},
  author={Marlow, Jennifer and Dabbish, Laura and Herbsleb, Jim},
  booktitle={Proceedings of the 2013 Conference on Computer Supported Cooperative Work},
  pages={117--128},
  year={2013},
  organization={ACM}
}

@inproceedings{Veen:2015,
  title={Automatically prioritizing pull requests},
  author={Van Der Veen, Erik and Gousios, Georgios and Zaidman, Andy},
  booktitle={Proceedings of the 12th Working Conference on Mining Software Repositories},
  pages={357--361},
  year={2015},
  organization={IEEE Press}
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{Thongtanunam:2015,
|
||||||
|
title={Who should review my code? A file location-based code-reviewer recommendation approach for Modern Code Review},
|
||||||
|
author={Thongtanunam, Patanamon and Tantithamthavorn, Chakkrit and Kula, Raula Gaikovina and Yoshida, Norihiro and Iida, Hajimu and Matsumoto, Kenichi},
|
||||||
|
booktitle={IEEE International Conference on Software Analysis, Evolution, and Reengineering},
|
||||||
|
pages={141-150},
|
||||||
|
year={2015},
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{t2015cq,
|
||||||
|
title={Investigating code review practices in defective files: an empirical study of the Qt system},
|
||||||
|
author={Thongtanunam, Patanamon and McIntosh, Shane and Hassan, Ahmed E and Iida, Hajimu},
|
||||||
|
booktitle={Proceedings of the 12th Working Conference on Mining Software Repositories},
|
||||||
|
pages={168--179},
|
||||||
|
year={2015},
|
||||||
|
organization={IEEE Press}
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{rahman2016correct,
|
||||||
|
title={CoRReCT: code reviewer recommendation in GitHub based on cross-project and technology experience},
|
||||||
|
author={Rahman, Mohammad Masudur and Roy, Chanchal K and Collins, Jason A},
|
||||||
|
booktitle={Proceedings of the 38th International Conference on Software Engineering Companion},
|
||||||
|
pages={222--231},
|
||||||
|
year={2016},
|
||||||
|
organization={ACM}
|
||||||
|
}
|
||||||
|
|
||||||
|
@article{jiang:2015,
|
||||||
|
title={CoreDevRec: Automatic Core Member Recommendation for Contribution Evaluation},
|
||||||
|
author={Jiang, Jing and He, Jia Huan and Chen, Xue Yuan},
|
||||||
|
journal={Journal of Computer Science and Technology},
|
||||||
|
volume={30},
|
||||||
|
number={5},
|
||||||
|
pages={998-1016},
|
||||||
|
year={2015},
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{Tsay:2014b,
|
||||||
|
title={Influence of social and technical factors for evaluating contribution in GitHub},
|
||||||
|
author={Tsay, Jason and Dabbish, Laura and Herbsleb, James},
|
||||||
|
booktitle={ICSE},
|
||||||
|
pages={356-366},
|
||||||
|
year={2014},
|
||||||
|
}
|
||||||
|
|
||||||
|
@inproceedings{Bacchelli:2013,
title={Expectations, outcomes, and challenges of modern code review},
author={Bacchelli, Alberto and Bird, Christian},
booktitle={Proceedings of the 35th International Conference on Software Engineering},
pages={712--721},
year={2013}
}

@article{Shah:2006,
title={Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development},
author={Shah, Sonali K.},
journal={Management Science},
volume={52},
number={7},
pages={1000--1014},
year={2006}
}

@book{Porter:1997,
title={An algorithm for suffix stripping},
author={Porter, M. F.},
publisher={Morgan Kaufmann Publishers Inc.},
year={1997}
}

@inproceedings{Yu:2014,
title={Who Should Review this Pull-Request: Reviewer Recommendation to Expedite Crowd Collaboration},
author={Yu, Yue and Wang, Huaimin and Yin, Gang and Ling, Charles X},
booktitle={Asia-Pacific Software Engineering Conference},
pages={335--342},
year={2014}
}

@article{Thongtanunam:2016,
title={Review participation in modern code review},
author={Thongtanunam, Patanamon and McIntosh, Shane and Hassan, Ahmed E. and Iida, Hajimu},
journal={Empirical Software Engineering},
pages={1--50},
year={2016}
}

@inproceedings{Beller:2014,
title={Modern code reviews in open-source projects: which problems do they fix?},
author={Beller, Moritz and Bacchelli, Alberto and Zaidman, Andy and Juergens, Elmar},
booktitle={Working Conference on Mining Software Repositories},
pages={202--211},
year={2014}
}

@article{Kollanus:2009,
title={Survey of Software Inspection Research},
author={Kollanus, Sami and Koskinen, Jussi},
journal={Open Software Engineering Journal},
volume={3},
number={1},
year={2009}
}

@inproceedings{Baysal:2013,
title={The influence of non-technical factors on code review},
author={Baysal, O. and Kononenko, O. and Holmes, R. and Godfrey, M. W.},
booktitle={Working Conference on Reverse Engineering},
pages={122--131},
year={2013}
}

@article{Baysal:2015,
title={Investigating technical and non-technical factors influencing modern code review},
author={Baysal, Olga and Kononenko, Oleksii and Holmes, Reid and Godfrey, Michael W.},
journal={Empirical Software Engineering},
volume={21},
number={3},
pages={1--28},
year={2016}
}

%Mcintosh2016An
@article{Mcintosh:***,
title={An empirical study of the impact of modern code review practices on software quality},
author={McIntosh, Shane and Kamei, Yasutaka and Adams, Bram and Hassan, Ahmed E},
journal={Empirical Software Engineering},
volume={21},
number={5},
pages={1--44},
year={2016}
}

@inproceedings{Mcintosh:2014,
title={The impact of code review coverage and code review participation on software quality: a case study of the Qt, VTK, and ITK projects},
author={McIntosh, Shane and Kamei, Yasutaka and Adams, Bram and Hassan, Ahmed E},
booktitle={Working Conference on Mining Software Repositories},
pages={192--201},
year={2014}
}

%% Same as Fagan:1999
@incollection{Fagan:1976,
title={Design and code inspections to reduce errors in program development},
author={Fagan, Michael E},
booktitle={Pioneers and Their Contributions to Software Engineering},
pages={301--334},
year={2001},
publisher={Springer}
}

@book{Frank:1984,
title={Software inspections and the industrial production of software},
author={Ackerman, A. Frank and Fowler, Priscilla J and Ebenau, Robert G},
publisher={Elsevier North-Holland, Inc.},
pages={13--40},
year={1984}
}

@book{Baeza:2004,
title={Modern Information Retrieval},
author={Baeza-Yates, Ricardo A and Ribeiro-Neto, Berthier},
publisher={China Machine Press},
pages={26--28},
year={2004}
}

@article{Ackerman:1989,
title={Software inspections: an effective verification process},
author={Ackerman, A. F and Buchwald, L. S and Lewski, F. H},
journal={IEEE Software},
volume={6},
number={3},
pages={31--36},
year={1989}
}

@article{Russell:1991,
title={Experience With Inspection in Ultralarge-Scale Development},
author={Russell, Glen W},
journal={IEEE Software},
volume={8},
number={1},
pages={25--31},
year={1991}
}

@article{Aurum:2002,
title={State-of-the-art: software inspections after 25 years},
author={Aurum, Aybuke and Petersson, Hakan and Wohlin, Claes},
journal={Software Testing Verification \& Reliability},
volume={12},
number={3},
pages={133--154},
year={2002}
}

%Rigby2012Contemporary
@article{Rigby:**,
title={Contemporary Peer Review in Action: Lessons from Open Source Development},
author={Rigby, P. C and Cleary, B and Painchaud, F and Storey, M},
journal={IEEE Software},
volume={29},
number={6},
pages={56--61},
year={2012}
}

@techreport{Rigby:2006,
title={A preliminary examination of code review processes in open source projects},
author={Rigby, Peter C. and German, Daniel M.},
institution={University of Victoria},
year={2006}
}

@inproceedings{Rigby:2008,
title={Open source software peer review practices: a case study of the Apache server},
author={Rigby, Peter C and German, Daniel M and Storey, Margaret-Anne},
booktitle={International Conference on Software Engineering},
pages={541--550},
year={2008}
}
@inproceedings{Rigby:2011,
title={Understanding broadcast based peer review on open source software projects},
author={Rigby, Peter C. and Storey, Margaret-Anne},
booktitle={International Conference on Software Engineering},
pages={541--550},
year={2011}
}

@inproceedings{Riby:2013,
title={Convergent contemporary software peer review practices},
author={Rigby, Peter C and Bird, Christian},
booktitle={Joint Meeting on Foundations of Software Engineering},
pages={202--212},
year={2013}
}

@book{Rigby:2014,
title={A Mixed Methods Approach to Mining Code Review Data: Examples and a study of multi-commit reviews and pull requests},
author={Rigby, Peter C and Bacchelli, Alberto and Gousios, Georgios and Mukadam, Murtuza},
pages={231--255},
year={2014}
}

@misc{Gerrit,
title={Gerrit},
howpublished={\url{https://code.google.com/p/gerrit/}},
note={Accessed 2014/01/20}
}

@misc{GitHub,
title={GitHub},
howpublished={\url{https://github.com/}},
note={Accessed 2014/01/14}
}

@misc{Trustie,
title={Trustie},
howpublished={\url{https://www.trustie.net/}},
note={Accessed 2014/01/14}
}

@inproceedings{Baum:2016,
title={Factors influencing code review processes in industry},
author={Baum, Tobias and Liskin, Olga and Niklas, Kai and Schneider, Kurt},
booktitle={Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
pages={85--96},
year={2016}
}

@article{Ciolkowski:2003,
title={Software Reviews: The State of the Practice},
author={Ciolkowski, Marcus and Laitenberger, Oliver and Biffl, Stefan},
journal={IEEE Software},
volume={20},
number={6},
pages={46--51},
year={2003}
}

@article{Yu:2016,
title={Determinants of pull-based development in the context of continuous integration},
author={Yu, Yue and Yin, Gang and Wang, Tao and Yang, Cheng and Wang, Huaimin},
journal={Science China Information Sciences},
volume={59},
number={8},
pages={1--14},
year={2016}
}

@inproceedings{Lima,
title={Developers assignment for analyzing pull requests},
author={de Lima J{\'u}nior, Manoel Limeira and Soares, Daric{\'e}lio Moreira and Plastino, Alexandre and Murta, Leonardo},
booktitle={Proceedings of the 30th Annual ACM Symposium on Applied Computing},
pages={1567--1572},
year={2015}
}

@inproceedings{Antoniol:2008,
title={Is it a bug or an enhancement? A text-based approach to classify change requests},
author={Antoniol, Giuliano and Ayari, Kamel and Di Penta, Massimiliano and Khomh, Foutse and Gu{\'e}h{\'e}neuc, Yann-Ga{\"e}l},
booktitle={Conference of the Centre for Advanced Studies on Collaborative Research, October 27--30, 2008, Richmond Hill, Ontario, Canada},
pages={23},
year={2008}
}

@article{Zhou:2014,
title={Combining text mining and data mining for bug report classification},
author={Zhou, Yu and Tong, Yanxiang and Gu, Ruihang and Gall, Harald},
journal={Journal of Software: Evolution and Process},
year={2016},
publisher={Wiley Online Library}
}

@inproceedings{Pingclasai:2013,
title={Classifying Bug Reports to Bugs and Other Requests Using Topic Modeling},
author={Pingclasai, N. and Hata, H. and Matsumoto, K. I.},
booktitle={Asia-Pacific Software Engineering Conference},
pages={13--18},
year={2013}
}

@inproceedings{Herzig:2013,
title={It's not a Bug, it's a Feature: How Misclassification Impacts Bug Prediction},
author={Herzig, Kim and Just, Sascha and Zeller, Andreas},
booktitle={Proceedings of the 35th International Conference on Software Engineering},
pages={392--401},
year={2013}
}

@inproceedings{Ciurumelea:2017,
title={Analyzing Reviews and Code of Mobile Apps for Better Release Planning},
author={Ciurumelea, Adelina and Schaufelbuhl, Andreas and Panichella, Sebastiano and Gall, Harald},
booktitle={IEEE International Conference on Software Analysis, Evolution and Reengineering},
year={2017}
}

@inproceedings{Yu:2015b,
title={Quality and productivity outcomes relating to continuous integration in GitHub},
author={Vasilescu, Bogdan and Yu, Yue and Wang, Huaimin and Devanbu, Premkumar and Filkov, Vladimir},
booktitle={Proceedings of the 10th Joint Meeting on Foundations of Software Engineering},
pages={805--816},
year={2015}
}

@article{zhang2017social,
title={Social media in GitHub: the role of @-mention in assisting software development},
author={Zhang, Yang and Wang, Huaimin and Yin, Gang and Wang, Tao and Yu, Yue},
journal={Science China Information Sciences},
volume={60},
number={3},
pages={032102:1--032102:18},
year={2017}
}

@inproceedings{Storey2014The,
title={The (R)Evolution of social media in software engineering},
author={Storey, Margaret-Anne and Singer, Leif and Cleary, Brendan and Figueira Filho, Fernando and Zagalsky, Alexey},
booktitle={Proceedings of the Future of Software Engineering (FOSE)},
pages={100--116},
year={2014}
}

@inproceedings{Zhu2016Effectiveness,
title={Effectiveness of code contribution: from patch-based to pull-request-based tools},
author={Zhu, Jiaxin and Zhou, Minghui and Mockus, Audris},
booktitle={Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
pages={871--882},
year={2016}
}
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{Jcst}
[2009/10/08 for Journal of Computer Science & Technology]
%\DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}}
\RequirePackage{multicol}
\RequirePackage{graphicx}
\RequirePackage{CJK}

\RequirePackage{float}
% Force 'figure' and 'table' to use [H] placement
\let\figure@bak\figure
\renewcommand\figure[1][]{\figure@bak[H]}
\let\table@bak\table
\renewcommand\table[1][]{\table@bak[H]}
\let\balance\relax
\long\def\singlecolumn#1{\end{multicols}#1\begin{multicols}{2}}

%---------------------
\setlength{\headsep}{5truemm}
\setlength{\headheight}{4truemm}
\setlength{\textheight}{231truemm}
\setlength{\textwidth}{175truemm}
\setlength{\topmargin}{0pt}
\setlength{\oddsidemargin}{-0.5cm}
\setlength{\evensidemargin}{-0.5cm}
\setlength{\footskip}{8truemm}
\setlength{\columnsep}{7truemm}
\setlength{\columnseprule}{0pt}
\setlength{\parindent}{2em}
\renewcommand{\baselinestretch}{1.2}
\renewcommand{\arraystretch}{1.5}
\abovedisplayskip=10pt plus 2pt minus 2pt
\belowdisplayskip=10pt plus 2pt minus 2pt
\renewcommand{\footnoterule}{\kern 1mm \hrule width 10cm \kern 2mm}
%\renewcommand{\thefootnote}{\fnsymbol{footnote}}
\renewcommand{\thefootnote}{}
%-------------------- Set Page Head -----------------------
%#1 for first page author's name and title
%#2 for volume
%#3 for issue
%#4 for end page
%#5 for month
%#6 for year
%#7 for right page author's name and title
\newcommand{\setpageinformation}[7]{%
\thispagestyle{empty}
\pagestyle{myheadings}\markboth
{\hss{\small\sl J. Comput. Sci. \& Technol., #5. #6, #2, #3}}
{{\small\sl {#7}}\rm\hss}
\vspace*{-13truemm}
{\par\noindent\parbox[t]{17.5cm}{\small {#1}.
\rm JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
#2 #3: \thepage--{#4} \ #5 #6}}
\def\@evenfoot{}
\def\@oddfoot{}
}%

%--------------- Title, keywords, Section, and key words etc., ------------

\renewcommand{\title}[1]{\par\noindent\medskip\unskip\\\Large\textbf{#1}\par\medskip}%
\newcommand{\keywords}[1]{\small\par\noindent\textbf{Keywords}\quad #1\par\bigskip}
\newcommand\email[1]{\par\medskip\noindent\normalsize\textrm{E-mail: ~ #1}}
\newcommand\received[1]{\par\medskip\noindent\normalsize\textrm{Received #1}\par\medskip}
\newcommand\revised[1]{\par\medskip\noindent\normalsize\textrm{Revised #1}\par\medskip}
\renewcommand\author[1]{\par\medskip\noindent\normalsize\unskip\\\textrm{#1}}
\newcommand\address[1]{\par\medskip\noindent\small\unskip\\\textit{#1}}
\renewenvironment{abstract}{\par\medskip\small
\noindent{\bfseries \abstractname\quad}}{\par\medskip}
%----------------------------------------------------------------------
\renewcommand\section{\@startsection{section}{1}{\z@}%
{3.5ex \@plus 1ex \@minus .2ex}%
{2.3ex \@plus.2ex}%
{\normalfont\normalsize\bfseries}}
\renewcommand\subsection{\@startsection{subsection}{2}{\z@}%
{3.25ex\@plus 1ex \@minus .2ex}%
{1.5ex \@plus .2ex}%
{\normalfont\normalsize\bfseries}}
\renewcommand\subsubsection{\@startsection{subsubsection}{3}{\z@}%
{3.25ex\@plus 1ex \@minus .2ex}%
{1.5ex \@plus .2ex}%
{\normalfont\normalsize\itshape}}
%\def\@seccntformat#1{\csname the#1\endcsname. }
%%

% Insert \small and remove colon after table/figure number
\long\def\@makecaption#1#2{%
\vskip\abovecaptionskip
\small%
\sbox\@tempboxa{#1 #2}%
\ifdim \wd\@tempboxa >\hsize
#1. #2\par
\else
\global \@minipagefalse
\hb@xt@\hsize{\hfil\box\@tempboxa\hfil}%
\fi
\vskip\belowcaptionskip}

%----------------------------------------------
\renewcommand\theequation{\arabic{equation}}
\renewcommand\thetable{\arabic{table}}
\renewcommand\thefigure{\arabic{figure}}
\renewcommand\refname{References}
\renewcommand\tablename{\small \textbf{Table}}
\renewcommand\figurename{\small \rm Fig.}
%\AtBeginDocument{\label{firstpage}}
%\AtEndDocument{\label{lastpage}}

\renewcommand \thefigure {\@arabic\c@figure}
\long\def\@makecaption#1#2{%
\vskip\abovecaptionskip
\sbox\@tempboxa{\textbf{#1.} #2}%
\ifdim \wd\@tempboxa >\hsize
#1. #2\par
\else
\global \@minipagefalse
\hb@xt@\hsize{\hfil\box\@tempboxa\hfil}%
\fi
\vskip\belowcaptionskip}

%---------------- Theorem, Lemma, etc., and proof ----------
\def\@begintheorem#1#2{\par{\bf #1~#2.~}\it\ignorespaces}
\def\@opargbegintheorem#1#2#3{\par{\bf #1~#2~(#3).~}\it\ignorespaces}
\newenvironment{proof}{\par\textit{Proof.~}\ignorespaces}
{\hfill~\fbox{\vbox to0.5mm{\vss\hbox to0.5mm{\hss}\vss}}\par}
\newtheorem{theorem}{\indent Theorem}
\newtheorem{lemma}{\indent Lemma}
\newtheorem{remark}{\indent Remark}
\newtheorem{definition}{\indent Definition}

%-----------------------------------------------

\newcommand{\biography}[1]{\par\noindent
\vbox to 36truemm{\vss
\hbox to 32truemm{\hss\includegraphics[height=32truemm,width=24truemm]
{#1}\hss}}\par\vspace*{-34truemm}\par
\hangindent=32truemm\hangafter=-10\noindent\indent}

%-----------------------------------------------
\endinput
%%
%% End of file `jcst.sty'.
\documentclass[11pt,twoside]{article}
\usepackage{jcst}
%Equations, theorems, lemmas, etc. are created using the traditional LaTeX environments
%
\begin{document}
\setcounter{page}{1}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%Set Page Head%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setpageinformation
%{Head of the first page} {Running head of odd pages}
{Author Names}
{}{}{}{Mon.}{Year}
{First Author {\it et al}.: Shortened title within 45 characters}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%===========================================================
\begin{CJK}{GBK}{song}

\title{Title of the Paper}

\author{Author1$^1$ (Chinese name), {\it Member, CCF, ACM, IEEE}, Author2$^2$, and Author3$^1$}
% Type all authors' full names above.
% For a Chinese author, the Chinese name is needed as well
% Please declare the membership when any author is a member or fellow of CCF, ACM, IEEE

\address{$^1$Department, University, ..., City, Zip Code, Nation\\
$^2$Department, University, ..., City, Zip Code, Nation
}

\email{Emails of all the authors}
\received{month day, year}
%\revised{month day, year}%Revised date

\footnotetext{This work was supported by...}

\begin{abstract}
The abstract goes here. An abstract should be around 200 to 300 words and should clearly state the nature and significance of the paper. The abstract should not include mathematical expressions or bibliographic references.
\end{abstract}

\keywords{3-5 keywords separated by commas}
% Keywords should closely reflect the topic and should concisely characterize the paper.

%%%%%%%%%%%%%%%%%%%%%%%%%%% Main Text Begin %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{multicols}{2}
\normalsize
\section{Introduction}

Main text.

All references \cite{1,2,3} should be cited, and numbered in their cited order. Abbreviations should be explained when they first appear. Set variables in italic, vectors/matrices in bold italic.

Figures and tables should be mentioned in the main text.

\begin{figure}
\footnotesize\centering
\centerline{\includegraphics[width=7.6cm]{figure1.eps}}
\caption{Caption of the figure.}
\end{figure}

\begin{table}
\caption{Caption of the Table}
\centering
\begin{tabular}{ccc}
\hline
$A$ & $B$ & $C$\\
\hline
a & b & c\\
1 & 2 & 3\\
\hline
\end{tabular}
\end{table}

\subsection{Caption of the Subsection}

\subsubsection{Caption of the Subsubsection}

Equation (\ref{1}) is a sample equation:

\begin{equation}\label{1}
1+1=2.
\end{equation}

\begin{theorem}
Theorem text.
\end{theorem}

\begin{lemma}
Lemma text.
\end{lemma}

\begin{proof}
Proof text.
\end{proof}

\begin{definition}[\textmd{Caption of Definition}]
Definition text.
\end{definition}

\textit{Example}. Example text.

\addcontentsline{toc}{section}{Acknowledgment}
\medskip{\bf Acknowledgment} Acknowledgment text. Acknowledgments should be placed at the end of the paper, before the bibliography.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Main Text End %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{thebibliography}{00}

\bibitem{1} Floater M S. Mean value coordinates. {\it Computer Aided Geometric Design}, 2003, 20(1): 19--27.
% Author Name1, Author Name2. Paper title. {\it Journal Name}, Publishing year, Volume(Issue No.): p.begin--p.end.

\bibitem{2} Haker S, Angenent S, Tannenbaum A, Kikinis R. Nondistorting flattening for virtual colonoscopy. In {\it Proc. MICCAI}, Oct. 2001, pp.358--366.
% Author Name1, Author Name2. Paper title. In {\it Proc. Conference Name}, Month, Year, pp. p.begin--p.end.
% Conference place and date are needed

\bibitem{3} Clarke E M, Grunberg O, Peled D A. Model Checking. Cambridge, Massachusetts: The MIT Press, 1999.
% Author Name. Book Title. Publisher, Publishing year, pp. p.begin--p.end.

\end{thebibliography}

\biography{photo.eps}%No empty line here
{\bf Author Name} Biography text.

\end{multicols}
\end{CJK}
\end{document}
\documentclass[12pt,twoside]{article}
\usepackage{JCST}

%%% from old %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{framed}
\newcommand{\tabincell}[2]{\begin{tabular}{@{}#1@{}}#2\end{tabular}}
\usepackage{graphicx}
\usepackage{subfig}
\usepackage{setspace}
\usepackage{color,soul}
\usepackage{colortbl}
% \usepackage{enumitem}
\usepackage{enumerate}
\usepackage{threeparttable}
\usepackage{multirow,booktabs}
% \usepackage{xcolor}
\usepackage[table,xcdraw]{xcolor}
\usepackage{url}

\usepackage{array}
\usepackage{booktabs}

% \usepackage{subfigure}
\usepackage{caption}
\usepackage{dcolumn}
\usepackage{xspace}
\usepackage{balance}
\usepackage{bm}
\usepackage{cite}
\usepackage{amsmath}

\newcommand{\ie}{{\emph{i.e.}},\xspace}
\newcommand{\viz}{{\emph{viz.}},\xspace}
\newcommand{\eg}{{\emph{e.g.}},\xspace}
\newcommand{\etc}{etc.}
\newcommand{\etal}{{\emph{et al.}}}
\newcommand{\GH}{{\sc GitHub}\xspace}
\newcommand{\BB}{{\sc BitBucket}\xspace}
\newcommand{\TSHC}{TSHC\xspace}

\makeatletter
\g@addto@macro{\UrlBreaks}{\UrlOrds}
\makeatother
\makeatletter
\def\url@leostyle{%
\@ifundefined{selectfont}{\def\UrlFont{\same}}{\def\UrlFont{\scriptsize\bf\ttfamily}}}
\makeatother
\urlstyle{leo}

%%% from old %%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}
\setcounter{page}{1}
\setpageinformation
%{Head of the first page} {Running head of odd pages}
{}
{}{}{}{Mon.}{Year}
{Analyzing Code Reviews in Pull-based Model}
\begin{CJK}{GBK}{song}

\title{What are they talking about?\\
Analyzing Code Reviews in Pull-based Development Model}


% \author{Author1$^1$ (Chinese name), {\it Member, CCF, ACM, IEEE}, Author2$^2$, and Author3$^1$}
% % Type all authors' full names above.
% % For a Chinese author, the Chinese name is needed as well
% % Please declare the membership when any author is a member or fellow of CCF, ACM, IEEE

% \address{$^1$Department, University, ..., City, Zip Code, Nation\\
% $^2$Department, University, ..., City, Zip Code, Nation
% }

% \email{Emails of all the authors}
% \received{month day, year}
% %\revised{month day, year}%Revised date

% \footnotetext{This work was supported by...}


\input{0_abstract}
\keywords{Pull-request; code review; review comments}


\begin{multicols}{2}
\normalsize
\renewcommand{\thefootnote}{\arabic{footnote}}
\setcounter{footnote}{0}
\input{1_introduction}
\input{2_background}
\input{3_approach}
\input{4_result}
\input{5_preliminary_analysis}
\input{6_threats}
% \input{7_related_work}
\input{8_conlusion}


\bibliographystyle{unsrt}
\bibliography{9_ref}

\end{multicols}
\end{CJK}
\end{document}
Zhixing Li received his B.S. in Computer Science from Chongqing University in 2015. He is now an M.S. candidate in Computer Science at National University of Defense Technology. His research interests include open source software engineering, data mining, and knowledge discovery in open source software.
Yue Yu received his Ph.D. degree in Computer Science from National University of Defense Technology (NUDT) in 2016. He is now an associate professor at NUDT. He has visited UC Davis supported by a CSC scholarship. His research findings have been published at MSR, FSE, IST, ICSME, APSEC, and SEKE. His current research interests include software engineering, spanning from mining software repositories to analyzing social coding networks.
Gang Yin received his Ph.D. degree in Computer Science from National University of Defense Technology (NUDT) in 2006. He is now an associate professor at NUDT. He has worked on several major research projects, including national 973 and 863 projects. He has published more than 60 research papers in international conferences and journals. His current research interests include distributed computing, information security, software engineering, and machine learning.
Tao Wang received both his B.S. and M.S. in Computer Science from National University of Defense Technology (NUDT) in 2007 and 2010. He is now a Ph.D. candidate in Computer Science at NUDT. His research interests include open source software engineering, machine learning, data mining, and knowledge discovery in open source software.
Huaimin Wang received his Ph.D. in Computer Science from National University of Defense Technology (NUDT) in 1992. He is now a professor and chief engineer in the department of educational affairs, NUDT. He has been awarded the “Chang Jiang Scholars Program” professorship and the Distinguished Young Scholar award, among others. He has published more than 100 research papers in peer-reviewed international conferences and journals. His current research interests include middleware, software agents, and trustworthy computing.
Analyzing Code Reviews in Pull-based Model