A Large-scale Empirical Comparison of Static and Dynamic Test Case Prioritization Techniques

- FSE 2016 Online Appendix

This web page is a companion to our FSE 2016 submission entitled "A Large-scale Empirical Comparison of Static and Dynamic Test Case Prioritization Techniques".

1. Studied TCP techniques

We conducted our empirical study on four static TCP techniques and four dynamic techniques.


2. Subject programs

We collected 30 large, real-world Java systems. The following table lists each program's name, link, version, and size in lines of code (LOC). Columns 4 and 5 give the number of test cases at the test-method level (#TM) and the test-class level (#TC), respectively. Columns 6 and 7 give the number of mutation faults that can be detected and the total number of mutation faults for each subject.

Subjects & Links | Version | #LOC | #TM | #TC | Detected | All
--- | --- | --- | --- | --- | --- | ---
P1-Java-apns (Link) | 2665ef | 3,234 | 87 | 15 | 412 | 1,122
P2-gson-fire (Link) | 6ac258 | 3,421 | 55 | 14 | 847 | 1,064
P3-jackson-datatype-guava (Link) | a049b9 | 3,994 | 91 | 15 | 313 | 1,832
P4-jackson-uuid-generator (Link) | 487289 | 4,158 | 45 | 6 | 802 | 2,039
P5-jumblr (Link) | 68fa79 | 4,623 | 103 | 15 | 610 | 1,192
P6-metrics-core (Link) | d83b98 | 5,027 | 144 | 28 | 1,656 | 5,265
P7-low-gc-membuffers (Link) | febe59 | 5,198 | 51 | 18 | 1,861 | 3,654
P8-xembly (Link) | 05101d | 5,319 | 58 | 16 | 1,190 | 2,546
P9-scribe-java (Link) | 0311a4 | 5,355 | 99 | 18 | 563 | 1,622
P10-gdx-artemis (Link) | 1687eb | 6,043 | 31 | 20 | 968 | 1,687
P11-Protoparser (Link) | 8be66b | 6,074 | 171 | 14 | 3,346 | 4,640
P12-webbit (Link) | f628a7 | 7,363 | 131 | 25 | 1,268 | 3,833
P13-RestFixture (Link) | bb4c70 | 7,421 | 268 | 30 | 2,234 | 3,278
P14-LastCalc (Link) | 7e0fc9 | 7,707 | 34 | 13 | 2,814 | 6,635
P15-lambdaj (Link) | bd3afc | 8,510 | 252 | 35 | 3,382 | 4,341
P16-javapoet (Link) | 1a1694 | 9,007 | 246 | 16 | 3,400 | 4,601
P17-Liqp (Link) | 54f9bd | 9,139 | 235 | 58 | 7,962 | 18,608
P18-cassandra-reaper (Link) | ef76a2 | 9,896 | 40 | 12 | 1,186 | 5,105
P19-raml-java-parser (Link) | 6b6916 | 11,126 | 190 | 36 | 4,678 | 6,431
P20-redline-smalltalk (Link) | 6322ac | 11,228 | 37 | 9 | 1,834 | 10,763
P21-jsoup-learning (Link) | 2c0580 | 13,505 | 380 | 25 | 7,761 | 13,230
P22-wsc (Link) | a0ab08 | 13,652 | 16 | 8 | 1,687 | 17,942
P23-rome (Link) | 772d4f | 13,874 | 443 | 45 | 4,920 | 10,744
P24-JActor (Link) | 204739 | 14,171 | 54 | 43 | 132 | 1,375
P25-jprotobuf (Link) | e4ee6e | 21,161 | 48 | 18 | 1,539 | 10,338
P26-worldguard (Link) | 323413 | 24,457 | 148 | 12 | 1,127 | 25,940
P27-commons-io (Link) | c49315 | 27,263 | 1125 | 92 | 7,630 | 110,365
P28-asterisk-java (Link) | 4cbd23 | 39,542 | 220 | 39 | 3,299 | 17,664
P29-ews-java-api (Link) | 95c6df | 46,863 | 130 | 28 | 2,419 | 31,569
P30-joda-time (Link) | c3ef36 | 82,998 | 4,026 | 122 | 20,957 | 28,382

3. Tools

  • PIT: A mutation testing system that provides mutation testing and test coverage for Java and the JVM (LINK), along with all of its available mutators (LINK).
  • WALA: A tool to collect the RTA static call graph for each test (LINK).
  • JDT: A tool to collect the textual test information (LINK).
  • R-lda: An R package to build topic models for test cases (LINK).
  • Mallet: A tool to build topic models for test cases (LINK).
  • ASM: A tool to collect the coverage information for each test case (LINK).

4. Results

4.1 APFD results for the different TCP techniques across all subjects at the test-class level.



4.2 APFD results for the different TCP techniques across all subjects at the test-method level.
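
For readers unfamiliar with the metric reported in 4.1 and 4.2, APFD (Average Percentage of Faults Detected) is commonly defined as APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of tests, m is the number of detected faults, and TF_i is the position of the first test that detects fault i. The following is a minimal sketch of that standard definition. The function name, the data layout, and the toy data are illustrative assumptions, not the scripts used in the study.

```python
from typing import Dict, List, Set


def apfd(order: List[str], detects: Dict[str, Set[str]]) -> float:
    """Compute APFD for a prioritized test order (illustrative sketch).

    order   : test identifiers in execution order
    detects : maps each test to the set of faults it detects
    Only faults detected by at least one test in `order` are counted.
    """
    n = len(order)
    first_position: Dict[str, int] = {}  # fault -> 1-based index of first detecting test
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, set()):
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one detected fault")
    return 1.0 - sum(first_position.values()) / (n * m) + 1.0 / (2 * n)


# Hypothetical toy data: three tests and three mutation faults.
toy_detects = {"t1": {"f1"}, "t2": {"f1", "f2"}, "t3": {"f3"}}
original_order = apfd(["t1", "t2", "t3"], toy_detects)
better_order = apfd(["t2", "t3", "t1"], toy_detects)
```

In this toy example, running `t2` first exposes two faults immediately, so the second order obtains a higher APFD than the first.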



4.3 The results of the ANOVA and Tukey HSD tests on the average APFD values.



4.4 The classification of subjects at different granularities using Jaccard distance at the 10% cut point. The four values in each cell are the numbers of subjects for which the faults detected by the two techniques are highly dissimilar, dissimilar, similar, and highly similar, respectively. The results at other cut points can be found here.

  • Test-class level.
  • Test-method level.
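
The comparison described in 4.4 can be illustrated with a short sketch: take the faults each technique detects within the first 10% of its prioritized order, compute the Jaccard distance between the two fault sets, and place the result in one of four bands. The sketch uses the textbook definition, distance = 1 - |A ∩ B| / |A ∪ B|. The equal-width band boundaries (0.25, 0.5, 0.75), the rounding of the cut point, and all function names are assumptions made for illustration; they are not taken from this page.

```python
from typing import Dict, List, Set


def faults_at_cut_point(order: List[str], detects: Dict[str, Set[str]],
                        cut_percent: int = 10) -> Set[str]:
    """Faults detected by the first `cut_percent` percent of a prioritized order."""
    # Assumption: round the prefix length up and always run at least one test.
    k = max(1, (len(order) * cut_percent + 99) // 100)
    found: Set[str] = set()
    for test in order[:k]:
        found |= detects.get(test, set())
    return found


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Textbook Jaccard distance: 1 - |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def classify(distance: float) -> str:
    """Map a distance to one of four bands (assumed equal-width boundaries)."""
    if distance < 0.25:
        return "highly similar"
    if distance < 0.5:
        return "similar"
    if distance < 0.75:
        return "dissimilar"
    return "highly dissimilar"


# Hypothetical toy data: two techniques order the same two tests differently.
toy_detects = {"t1": {"f1", "f2"}, "t2": {"f2", "f3"}}
faults_a = faults_at_cut_point(["t1", "t2"], toy_detects, cut_percent=50)
faults_b = faults_at_cut_point(["t2", "t1"], toy_detects, cut_percent=50)
label = classify(jaccard_distance(faults_a, faults_b))
```

Counting how many subjects fall into each band for a given pair of techniques would yield the four values reported per cell.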

4.5 The execution costs of the different TCP techniques.




5. Authors

  • Qi Luo - The College of William and Mary, VA, USA.
    E-mail: qluo at cs dot wm dot edu
  • Kevin Moran - The College of William and Mary, VA, USA.
    E-mail: kpmoran at cs dot wm dot edu
  • Denys Poshyvanyk - The College of William and Mary, VA, USA.
    E-mail: denys at cs dot wm dot edu