Analysing and Comparing the Effectiveness of Mutation Testing Tools: A Manual Study

Overview

Mutation testing is considered one of the most powerful testing methods. It asks testers to design tests that reveal a set of mutants, i.e. purposely injected defects. The strength of the method therefore depends strongly on the mutants used. This dependence raises concerns about how existing tools implement mutation testing, since implementation inadequacies can lead to poor results. In this paper, we cross-evaluate three popular mutation testing tools for Java, namely MUJAVA, MAJOR and PIT, with respect to their effectiveness. We perform an empirical study of 3,324 manually analysed mutants from real-world projects and find large differences in the tools’ effectiveness, which ranges from 76% to 88%, with MUJAVA achieving the best results. We also demonstrate that no tool subsumes the others, and we provide practical recommendations on how to strengthen each of the studied tools. Finally, our analysis shows that 11%, 12% and 7% of the mutants generated by MUJAVA, MAJOR and PIT, respectively, are equivalent.
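
To make the notions of effectiveness and subsumption more concrete, the following is a minimal Python sketch. It assumes that effectiveness is measured as a cross-tool mutation score, i.e. the fraction of one tool's non-equivalent mutants killed by a test suite that is mutation adequate for another tool; the measure used in the study may differ in detail. The function names and the example numbers are hypothetical and are not taken from the study.

```python
from typing import Mapping


def mutation_score(killed: int, total: int, equivalent: int) -> float:
    """Fraction of killable (non-equivalent) mutants that the tests kill."""
    killable = total - equivalent
    if killable <= 0:
        raise ValueError("there must be at least one killable mutant")
    if not 0 <= killed <= killable:
        raise ValueError("killed must lie between 0 and the number of killable mutants")
    return killed / killable


def subsumes(cross_scores: Mapping[str, float]) -> bool:
    """True if a tool's adequate suite kills every killable mutant of each other tool.

    `cross_scores` maps the name of another tool to the score that this
    tool's adequate test suite achieves on that other tool's mutants.
    """
    return all(score == 1.0 for score in cross_scores.values())


# Hypothetical numbers, NOT taken from the study.
hypothetical = mutation_score(killed=45, total=60, equivalent=10)
print(f"{hypothetical:.0%}")  # prints 90%
```

Under this assumed definition, a tool would subsume the others only if its adequate test suite reached a score of 100% on the mutants of every other tool.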

Experimental Analysis

Test Subjects

Details about the test subjects utilised in this study are given below:

**Test Subjects’ Details:** "LoC" shows the source code lines of the projects; "Class" presents the name of the considered class, along with the enclosing package; "Method" refers to the names of the considered methods.

| Test Subject | LoC | Class | Method |
|---|---|---|---|
| Commons-Math | 16,489 | org.apache.commons.math.util.MathUtils | gcd |
| | | org.apache.commons.math.geometry.Vector3D | orthogonal |
| Commons | 17,294 | xorg.apache.commons.lang.ArrayUtils | toMap |
| | | | subarray(Object[],int,int) |
| | | | lastIndexOf(Object[], Object, int) |
| | | xorg.apache.commons.lang.WordUtils | capitalize |
| | | | wrap |
| Pamvotis | 5,505 | pamvotis.core.Simulator | addNode |
| | | | removeNode |
| Triangle | 47 | Triangle | classify |
| XStream | 15,048 | xcom.thoughtworks.xstream.io.xml.XmlFriendlyNameCoder | decodeName |
| Bisect | 37 | Bisect | sqrt |
| Total | 54,420 | - | - |

Manual Analysis Results

The results of the manual analysis are presented below:

**Manual Analysis Results**: Columns "#Mutants", "#Equivs" and "#Tests" present the number of generated mutants, the number of manually detected equivalent mutants, and the number of test cases that were manually created to cover the generated mutants, per tool and method.

| Method | MAJOR #Mutants | MAJOR #Equivs | MAJOR #Tests | PIT #Mutants | PIT #Equivs | PIT #Tests | MUJAVA #Mutants | MUJAVA #Equivs | MUJAVA #Tests |
|---|---|---|---|---|---|---|---|---|---|
| gcd | 133 | 17 | 6 | 79 | 9 | 7 | 237 | 23 | 7 |
| orthogonal | 120 | 3 | 8 | 65 | 0 | 8 | 155 | 5 | 9 |
| toMap | 23 | 5 | 7 | 50 | 2 | 5 | 32 | 7 | 5 |
| subarray(Object[],int,int) | 25 | 5 | 6 | 27 | 3 | 4 | 64 | 8 | 6 |
| lastIndexOf(Object[], Object, int) | 29 | 2 | 8 | 43 | 1 | 7 | 81 | 4 | 12 |
| capitalize | 37 | 6 | 5 | 42 | 1 | 6 | 69 | 14 | 9 |
| wrap | 71 | 8 | 10 | 70 | 4 | 6 | 198 | 19 | 7 |
| addNode | 89 | 11 | 8 | 53 | 3 | 8 | 318 | 33 | 34 |
| removeNode | 18 | 2 | 5 | 29 | 0 | 3 | 55 | 7 | 6 |
| classify | 139 | 7 | 25 | 94 | 1 | 16 | 354 | 38 | 27 |
| decodeName | 73 | 24 | 5 | 81 | 16 | 6 | 156 | 28 | 10 |
| sqrt | 51 | 4 | 4 | 29 | 3 | 4 | 135 | 17 | 6 |
| Total | 808 | 94 | 97 | 662 | 43 | 80 | 1,854 | 203 | 138 |
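
The share of equivalent mutants per tool can be derived from the "Total" row above. The short Python sketch below does this arithmetic; the variable and function names are illustrative only, and the input figures are copied from the table.

```python
# (generated mutants, manually detected equivalent mutants) per tool,
# copied from the "Total" row of the Manual Analysis Results table.
totals = {
    "MAJOR": (808, 94),
    "PIT": (662, 43),
    "MUJAVA": (1854, 203),
}


def equivalent_ratio(mutants: int, equivalents: int) -> float:
    """Fraction of a tool's generated mutants that were found equivalent."""
    return equivalents / mutants


for tool, (mutants, equivalents) in totals.items():
    print(f"{tool}: {equivalent_ratio(mutants, equivalents):.1%}")
# MAJOR: 11.6%
# PIT: 6.5%
# MUJAVA: 10.9%

# The three tools together account for all manually analysed mutants.
print(sum(mutants for mutants, _ in totals.values()))  # 3324
```

These values are roughly in line with the 12%, 7% and 11% reported in the overview for MAJOR, PIT and MUJAVA.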

Downloads

All the data of this study can be downloaded from the following links: Bisect, Triangle, Commons, Commons-Math, Pamvotis, XStream.

Replication

We have tried to make this study as replicable as possible. For each project, we supply:

All its dependencies (lib/ directory).
Necessary libraries and settings for PIT and MAJOR.
Generated mutation adequate test suites for all mutation tools.
Manually detected equivalent mutants for each mutation tool (equivalent.org file – can be opened with any text editor; best viewed with an editor that supports Org-mode).

Scripts

For each project, we supply scripts that automate its compilation, test execution, mutant generation and execution, and the cross-evaluation experiment. These scripts have been tested on GNU/Linux- and UNIX-based machines and can be found in the scripts/ directory of each project. Note that the scripts must be run from within the directory that contains them. The most important script files are listed below:

**Description of various script files**: These files can be found in the `scripts/` directory of each project.

| Filename | Description |
|---|---|
| InstantPitAll.sh | Run PIT against its mutation adequate test suite (ALL available mutation operators are used) |
| InstantMajor.sh | Same as above, but this time for MAJOR |
| majorRunningPitAll.sh | Execute MAJOR's mutants with PIT's mutation adequate test suite |
| majorRunningMujava3.sh | Execute MAJOR's mutants with MUJAVA's mutation adequate test suite |
| pitAllRunningMajor.sh | Execute PIT's mutants with MAJOR's mutation adequate test suite |
| pitAllRunningMujava3.sh | Execute PIT's mutants with MUJAVA's mutation adequate test suite |
| CumulativeMajor.sh | Execute MAJOR's mutants with the mutation adequate test suites of all tools |
| CumulativePitAll.sh | Same as above, but this time for PIT's mutants |
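
Because the scripts have to be started from their own directory, a small wrapper can make this harder to get wrong when driving them from another program. The following is a minimal Python sketch, not part of the supplied material: the helper name `run_project_script`, its `shell` parameter and the choice of `bash` as the default interpreter are assumptions, and the project path in the usage comment is hypothetical.

```python
import subprocess
from pathlib import Path


def run_project_script(project_dir, script_name, shell="bash"):
    """Run one script from a project's scripts/ directory.

    The script is started with scripts/ as its working directory, which
    mirrors changing into that directory before invoking it by hand.
    `shell` is the interpreter used to run the script (an assumption;
    adjust it if the scripts require a different one).
    """
    scripts_dir = Path(project_dir) / "scripts"
    script = scripts_dir / script_name
    if not script.is_file():
        raise FileNotFoundError(f"no such script: {script}")
    return subprocess.run(
        [shell, script.name],
        cwd=scripts_dir,       # run from the containing directory
        capture_output=True,   # keep stdout/stderr for inspection
        text=True,
        check=False,           # let the caller decide how to treat failures
    )


# Usage (hypothetical project location):
# result = run_project_script("Triangle", "InstantMajor.sh")
# print(result.returncode, result.stdout)
```

Returning the `CompletedProcess` object rather than raising on a non-zero exit status leaves it to the caller to decide whether a failing script should stop a longer sequence of runs.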