An Empirical Study on Developer Related Factors Characterizing Fix-Inducing Commits
- JSEP 2015 Online Appendix

This web page is a companion to our JSEP 2015 paper entitled "An Empirical Study on Developer Related Factors Characterizing Fix-Inducing Commits".

Abstract

This paper analyzes developer-related factors that could influence the likelihood that a commit induces a fix. Specifically, we focus on factors that could potentially hinder developers’ ability to build a correct “mental model” of the change to be committed: (i) the coherence of the commit (i.e., how much it is focused on a specific topic), (ii) the experience level of the developer on the files involved in the commit, and (iii) the interfering changes performed by other developers on the files involved in past commits. The results of our study indicate that “fix-inducing” commits (i.e., commits that induced a fix) are significantly less coherent than “clean” commits (i.e., commits that did not induce a fix). Surprisingly, “fix-inducing” commits are performed by more experienced developers; yet those are the developers performing more complex changes in the system. Finally, “fix-inducing” commits have a higher number of past interfering changes than “clean” commits. Our empirical study sheds light on previously unexplored factors and presents significant results that can be used to improve approaches for defect prediction.

Analyzed Systems

  • Ant
  • JMeter
  • log4j
  • Tomcat
  • Xerces-J
The source code of the analyzed systems can be downloaded using this link.

Statistical Analysis

This link provides the R scripts used to:
  • Run Statistical Tests
  • Generate Boxplots
  • Generate Tables
All the results, organized by system, can be found at this link.

Factors

A table (CSV format) with the factors analyzed for every commit of all the systems can be downloaded using this link. For each commit, the following information has been calculated:
  • Commit ID
  • Author
  • Bug (true/false)
  • Added Lines
  • Modified Lines
  • Removed Lines
  • Size Lines
  • Size Hunks
  • Num Files
  • Lexical Experience
  • Structural Experience
  • Lexical Coherence
  • Structural Coherence
  • Num Interferences
  • Size Interferences
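
As a rough illustration of how such a table could be explored, the sketch below splits commits on the Bug flag and compares the median of one factor between the two groups. It assumes that the CSV header names match the labels listed above and that Bug is stored as the text true/false; neither is confirmed on this page. The helper name median_by_bug and the sample rows are hypothetical, and the values are made up purely to show the expected shape of the file. This is not the analysis used in the paper, which relies on the R scripts linked above.

```python
import csv
import io
from statistics import median


def median_by_bug(rows, factor):
    """Return the median of `factor` for fix-inducing (True) and clean (False) commits."""
    groups = {True: [], False: []}
    for row in rows:
        # Assumption: the Bug column holds the text "true" or "false".
        is_bug = row["Bug"].strip().lower() == "true"
        groups[is_bug].append(float(row[factor]))
    # Skip a group entirely if it has no commits.
    return {flag: median(values) for flag, values in groups.items() if values}


# Hypothetical rows with made-up values (not taken from the study's data).
SAMPLE = """Commit ID,Author,Bug,Lexical Coherence,Num Interferences
c1,alice,true,0.20,5
c2,bob,false,0.60,1
c3,alice,true,0.40,3
c4,carol,false,0.80,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
interference_medians = median_by_bug(rows, "Num Interferences")
coherence_medians = median_by_bug(rows, "Lexical Coherence")
print(interference_medians)
print(coherence_medians)
```

To run the same comparison on the downloaded table, replace io.StringIO(SAMPLE) with an open file handle, e.g. open(path, newline=""), where path points to the local copy of the CSV.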

Technical Report

A technical report with more statistics and boxplots than those reported in the paper can be found at the following link.

Authors

  • Michele Tufano - The College of William and Mary, VA, USA.
    E-mail: mtufano at cs dot wm dot edu
  • Gabriele Bavota - Free University of Bozen-Bolzano, Bolzano, Italy.
    E-mail: gabriele.bavota at unibz dot it
  • Denys Poshyvanyk - The College of William and Mary.
    E-mail: denys at cs dot wm dot edu
  • Rocco Oliveto - University of Molise, Pesche (IS), Italy.
    E-mail: rocco.oliveto at unimol dot it
  • Massimiliano Di Penta - University of Sannio, Benevento, Italy.
    E-mail: dipenta at unisannio dot it
  • Andrea De Lucia - University of Salerno, Salerno, Italy.
    E-mail: adelucia at unisa dot it