A Large-scale Empirical Comparison of Static and Dynamic Test Case Prioritization Techniques

- FSE 2016 Online Appendix

This web page is a companion to our FSE 2016 submission entitled "A Large-scale Empirical Comparison of Static and Dynamic Test Case Prioritization Techniques".

1. Studied TCP techniques

We conducted our empirical study on four static TCP techniques and four dynamic techniques.

2. Subject programs

We collected 30 large, real-world Java systems. The program names, links, and sizes in lines of code (LOC) are shown in the following table. Columns 4 and 5 give the numbers of test cases at the method level (#TM) and the class level (#TC), respectively. Columns 6 and 7 give the number of mutation faults that can be detected and the total number of mutation faults for each subject.

Subjects & Links	Version	#LOC	#TM	#TC	Detected	All
P1-Java-apns(Link)	2665ef	3,234	87	15	412	1,122
P2-gson-fire(Link)	6ac258	3,421	55	14	847	1,064
P3-jackson-datatype-guava(Link)	a049b9	3,994	91	15	313	1,832
P4-jackson-uuid-generator(Link)	487289	4,158	45	6	802	2,039
P5-jumblr(Link)	68fa79	4,623	103	15	610	1,192
P6-metrics-core(Link)	d83b98	5,027	144	28	1,656	5,265
P7-low-gc-membuffers(Link)	febe59	5,198	51	18	1,861	3,654
P8-xembly(Link)	05101d	5,319	58	16	1,190	2,546
P9-scribe-java(Link)	0311a4	5,355	99	18	563	1,622
P10-gdx-artemis(Link)	1687eb	6,043	31	20	968	1,687
P11-Protoparser(Link)	8be66b	6,074	171	14	3,346	4,640
P12-webbit(Link)	f628a7	7,363	131	25	1,268	3,833
P13-RestFixture(Link)	bb4c70	7,421	268	30	2,234	3,278
P14-LastCalc(Link)	7e0fc9	7,707	34	13	2,814	6,635
P15-lambdaj(Link)	bd3afc	8,510	252	35	3,382	4,341
P16-javapoet(Link)	1a1694	9,007	246	16	3,400	4,601
P17-Liqp(Link)	54f9bd	9,139	235	58	7,962	18,608
P18-cassandra-reaper(Link)	ef76a2	9,896	40	12	1,186	5,105
P19-raml-java-parser(Link)	6b6916	11,126	190	36	4,678	6,431
P20-redline-smalltalk(Link)	6322ac	11,228	37	9	1,834	10,763
P21-jsoup-learning(Link)	2c0580	13,505	380	25	7,761	13,230
P22-wsc(Link)	a0ab08	13,652	16	8	1,687	17,942
P23-rome(Link)	772d4f	13,874	443	45	4,920	10,744
P24-JActor(Link)	204739	14,171	54	43	132	1,375
P25-jprotobuf(Link)	e4ee6e	21,161	48	18	1,539	10,338
P26-worldguard(Link)	323413	24,457	148	12	1,127	25,940
P27-commons-io(Link)	c49315	27,263	1,125	92	7,630	110,365
P28-asterisk-java(Link)	4cbd23	39,542	220	39	3,299	17,664
P29-ews-java-api(Link)	95c6df	46,863	130	28	2,419	31,569
P30-joda-time(Link)	c3ef36	82,998	4,026	122	20,957	28,382

3. Tools

PIT: A mutation testing system that provides mutation testing and test coverage for Java and the JVM (LINK), together with the list of all its available mutators (LINK).
WALA: A tool to collect the RTA static call graph for each test (LINK).
JDT: A tool to collect the textual test information (LINK).
R-lda: An R package to build topic models for test cases (LINK).
Mallet: A tool to build topic models for test cases (LINK).
ASM: A tool to collect the coverage information for each test case (LINK).

4. Results

4.1 The APFD results for the different TCP techniques across all subjects at the test-class level.

4.2 The APFD results for the different TCP techniques across all subjects at the test-method level.
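
For readers unfamiliar with the metric, APFD (Average Percentage of Faults Detected) is commonly defined as APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n), where n is the number of tests, m is the number of detected faults, and TF_i is the position of the first test in the prioritized order that detects fault i. The following is a minimal Python sketch of that standard definition, not the scripts used in the study. The function name apfd, the input layout, and the toy test and fault names are assumptions made for illustration.

```python
def apfd(order, detects):
    """Compute APFD for one prioritized test order.

    order   -- list of test ids, first-executed test first
    detects -- dict mapping a test id to the set of fault ids it detects
    Only faults detected by at least one test in `order` are counted.
    """
    n = len(order)
    first_pos = {}  # fault id -> 1-based position of the first detecting test
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_pos.setdefault(fault, pos)
    m = len(first_pos)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one detected fault")
    return 1 - sum(first_pos.values()) / (n * m) + 1 / (2 * n)


# Hypothetical toy data: three tests, two faults.
toy_detects = {"t1": {"f1"}, "t2": set(), "t3": {"f2"}}
print(apfd(["t1", "t2", "t3"], toy_detects))
print(apfd(["t1", "t3", "t2"], toy_detects))
```

In the toy data, the second order moves the test that detects f2 forward, so it yields the higher APFD value of the two.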

4.3 The results for the ANOVA and Tukey HSD tests on the average APFD values.
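
As a reminder of what the first of these tests computes, a one-way ANOVA compares the variance between groups (here, the techniques) with the variance within each group through the F statistic. The sketch below computes that statistic in plain Python. It is an illustration under stated assumptions, not the analysis pipeline of the study: the function name one_way_anova_f, the technique labels A, B, and C, and the APFD values are all hypothetical. It also omits the p-value and the Tukey HSD post-hoc step, which compares the techniques pairwise.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples.

    groups -- list of lists of numbers, one inner list per group
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    ss_within = sum((x - mu) ** 2
                    for g, mu in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical average APFD values for three made-up techniques.
toy_apfd = {
    "A": [0.60, 0.62, 0.64],
    "B": [0.70, 0.72, 0.74],
    "C": [0.80, 0.82, 0.84],
}
print(one_way_anova_f(list(toy_apfd.values())))
```

A large F value indicates that the group means differ by more than the spread within the groups would suggest; the pairwise follow-up is what identifies which techniques differ.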

4.4 The classification of subjects at different test granularities using the Jaccard distance at cut point 10%. The four values in each cell are the numbers of subjects for which the faults detected by the two techniques are highly dissimilar, dissimilar, similar, and highly similar, respectively. The results at other cut points can be found here.

Test-class level.
Test-method level.
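
The comparison behind this classification can be sketched as follows: take the faults detected by the first 10% of each technique's prioritized tests and compute the Jaccard distance between the two fault sets, that is, 1 - |A ∩ B| / |A ∪ B|. The Python sketch below is an assumption-laden illustration, not the study's implementation. The function names, the rounding-up of the cut size, the handling of two empty sets, and the toy data are all hypothetical, and the thresholds that separate the four similarity classes are not stated on this page, so they are left out.

```python
import math


def faults_at_cut(order, detects, percent):
    """Faults detected by the first `percent` percent of a prioritized order.

    Assumption: the number of tests in the cut is rounded up.
    """
    k = math.ceil(len(order) * percent / 100)
    found = set()
    for test in order[:k]:
        found |= detects.get(test, set())
    return found


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 0 is identical, 1 is disjoint."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as identical
    return 1 - len(a & b) / len(union)


# Hypothetical toy data: ten tests, two techniques with opposite orders.
tests = ["t%d" % i for i in range(1, 11)]
toy_detects = {"t1": {"m1", "m2"}, "t10": {"m2", "m3"}}
order_a = tests
order_b = list(reversed(tests))

set_a = faults_at_cut(order_a, toy_detects, 10)
set_b = faults_at_cut(order_b, toy_detects, 10)
print(jaccard_distance(set_a, set_b))
```

With ten tests, the 10% cut contains one test per technique, so the two fault sets in the toy data share one of three faults.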

4.5 The execution costs of the different TCP techniques.

5. Authors

Qi Luo - The College of William and Mary, VA, USA.
E-mail: qluo at cs dot wm dot edu
Kevin Moran - The College of William and Mary, VA, USA.
E-mail: kpmoran at cs dot wm dot edu
Denys Poshyvanyk - The College of William and Mary, VA, USA.
E-mail: denys at cs dot wm dot edu

Software Engineering Maintenance and Evolution Research Unit at the College of William and Mary
