Generating Benchmarks from Change History Data to Support Evaluation of Software Maintenance Tasks - Online Appendix
This web page is a companion to our 10th Working Conference on Mining Software Repositories (MSR 2013) submission entitled "Generating Benchmarks from Change History Data to Support Evaluation of Software Maintenance Tasks" [pdf][slides]
This web page supersedes our original datasets from http://www.cs.wm.edu/semeru/data/benchmarks/. It contains two new datasets, for ArgoUML0.24 and ArgoUML0.26.2, as well as a suite of Java tools used to generate these benchmarks and two Matlab scripts that use VSM and LSI to compute the similarities between a query and the methods of a system (i.e., the corpus).
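As a rough illustration of what a VSM similarity computation between a query and a corpus of methods can look like, here is a minimal Python sketch. It is not a port of the Matlab scripts: the function names (`tokenize`, `vsm_rank`), the smoothed TF-IDF weighting, and the tiny corpus are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def vsm_rank(query: str, corpus: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank corpus documents (methods) by cosine similarity to the query."""
    docs = {name: Counter(tokenize(text)) for name, text in corpus.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)
    # Smoothed idf; this particular variant is an assumption of the sketch.
    idf = {term: math.log(n_docs / freq) + 1.0 for term, freq in df.items()}

    def weigh(counts: Counter) -> Dict[str, float]:
        # Terms that never occur in the corpus are ignored.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = weigh(Counter(tokenize(query)))
    scores = [(name, cosine(q_vec, weigh(counts))) for name, counts in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus: method identifiers mapped to their preprocessed text.
corpus = {
    "Editor.openFile": "open file read buffer",
    "Editor.saveFile": "save file write buffer",
    "Search.find": "find text match",
}
ranking = vsm_rank("open file", corpus)
```

In this sketch each method is one document, and the ranked list can be compared against the gold set methods of an issue.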
Datasets
Dataset (size) | Source code URL [Webpage] | Period | Issues [URL to Issue Tracking System] | Trace Type (Format) | Number of Gold Set Methods |
---|---|---|---|---|---|
ArgoUML0.22 (462 MB) | Source Code [ArgoUML] | 0.20-0.22 | 74 Defects, 10 Enhancements, 2 Features, 5 Patches (91 Total) [URL Issues] | Full (TPTP) | 701 |
ArgoUML0.24 (206 MB) | Source Code [ArgoUML] | 0.22-0.24 | 32 Defects, 4 Enhancements, 15 Patches, 1 Task (52 Total) [URL Issues] | Full (TPTP) | 357 |
ArgoUML0.26.2 (921 MB) | Source Code [ArgoUML] | 0.24-0.26.2 | 181 Defects, 19 Enhancements, 2 Features, 4 Patches, 3 Tasks (209 Total) [URL Issues] | Full (TPTP) | 1,560 |
JabRef2.6 (22 MB) | Source Code [JabRef] | 2.0-2.6 | 36 Defects, 3 Features (39 Total) [URL Issues] | Full (TPTP) | 280 |
jEdit4.3 (34 MB) | Source Code [jEdit] | 4.2-4.3 | 86 Bugs, 34 Features, 30 Patches (150 Total) [URL Issues] | Marked (JPDA) | 748 |
muCommander0.8.5 (278 MB) | Source Code [muCommander] | 0.8.0-0.8.5 | 81 Defects, 11 Enhancements (92 Total) [URL Issues] | Full (TPTP) | 717 |
Tools
To load the tools in Eclipse, click "File->Import...". Under "General", select "Existing Projects into Workspace" and click Next. Choose "Select archive file" and point to the EclipseProjects.zip (34 MB) archive, which contains all the Eclipse projects. Select the ones you want to include in your workspace, then click Finish. In each of these Eclipse projects, the main class contains "Main" in its name.
Data Format Details
Traces Format
TPTP traces are stored in XML, and the format is largely self-explanatory.
Each entry of a JPDA trace has the following format:
threadName, then a number of pipes ("|") that denotes the call stack depth, then methodName -- ClassNameWithFullPath$InnerClass
Example:

    main:0:| 5:2 processOptions -- org.mozilla.javascript.tools.shell.Main
    main:0:| 5:2 init -- org.mozilla.javascript.tools.shell.Global
    main:0:| | 5:2 <init> -- org.mozilla.javascript.tools.shell.Global$1
    main:0:| | 5:2 call -- org.mozilla.javascript.ContextFactory
    main:0:| | 5:2 call -- org.mozilla.javascript.ContextFactory
    main:0:| | 5:2 <init> -- org.mozilla.javascript.ScriptableObject$Slot
    main:0:| | | 5:2 <clinit> -- org.mozilla.javascript.Context
    main:0:| | | | 5:2 <clinit> -- org.mozilla.javascript.ScriptRuntime
    main:0:| | | | | 5:2 classOrNull -- org.mozilla.javascript.Kit

Remarks
- $1 denotes an anonymous class
- <init> denotes the class constructor and should be replaced with the actual name of the class (e.g., from org.mozilla.javascript.tools.shell.Global.<init> to org.mozilla.javascript.tools.shell.Global.Global)
- <clinit> denotes a static block or class initialization (it can be discarded)
- the trace does not capture the signature of the methods
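
The remarks above can be turned into a small parser. The following Python sketch is only an illustration and is not part of the Java tools on this page: the names `TraceCall`, `parse_jpda_line`, and `parse_jpda_trace` are hypothetical, and the code assumes that thread names contain neither ":" nor "|" and that the constructor name of an inner class is the part after the last "$".

```python
from typing import List, NamedTuple, Optional


class TraceCall(NamedTuple):
    thread: str      # thread name, e.g. "main"
    depth: int       # call stack depth = number of "|" characters
    method: str      # method name, with <init> already replaced
    class_name: str  # fully qualified class, possibly with $InnerClass


def parse_jpda_line(line: str) -> Optional[TraceCall]:
    """Parse one trace entry; return None for entries that should be skipped."""
    prefix, sep, class_name = line.strip().partition(" -- ")
    class_name = class_name.strip()
    tokens = prefix.split()
    if not sep or not tokens:
        return None  # not a "methodName -- ClassName" entry
    thread = prefix.split(":", 1)[0]  # assumes no ":" inside the thread name
    depth = prefix.count("|")         # assumes no "|" outside the depth marker
    method = tokens[-1]
    if method == "<clinit>":
        return None  # class initialization can be discarded
    if method == "<init>":
        # Constructor: use the simple class name (assumption: for inner
        # classes, the part after the last "$").
        method = class_name.rsplit(".", 1)[-1].rsplit("$", 1)[-1]
    return TraceCall(thread, depth, method, class_name)


def parse_jpda_trace(lines: List[str]) -> List[TraceCall]:
    """Parse many entries, dropping the skipped ones."""
    calls = (parse_jpda_line(line) for line in lines)
    return [call for call in calls if call is not None]


sample = [
    "main:0:| 5:2 init -- org.mozilla.javascript.tools.shell.Global",
    "main:0:| | 5:2 call -- org.mozilla.javascript.ContextFactory",
    "main:0:| | | 5:2 <clinit> -- org.mozilla.javascript.Context",
]
for call in parse_jpda_trace(sample):
    print(call.depth, f"{call.class_name}.{call.method}")
```

Because the trace does not record method signatures, overloaded methods map to the same `class_name.method` pair in this sketch.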
Participants
- Bogdan Dit, The College of William and Mary
- Andrew Holtzhauer, The College of William and Mary
- Denys Poshyvanyk, The College of William and Mary
- Huzefa Kagdi, Wichita State University
We gratefully acknowledge financial support from the NSF on this research project.