Mutation testing is considered one of the most powerful testing methods. It operates by asking testers to design tests that reveal a set of mutants, which are purposely injected defects. Evidently, the strength of the method depends strongly on the mutants used. This dependence raises concerns about mutation testing as it is implemented by existing tools, since implementation inadequacies can lead to inadequate results. In this paper, we cross-evaluate three popular mutation testing tools for Java, namely MUJAVA, MAJOR and PIT, with respect to their effectiveness. We perform an empirical study of 3,324 manually analysed mutants from real-world projects and find large differences between the tools' effectiveness, ranging from 76% to 88%, with MUJAVA achieving the best results. We also demonstrate that no tool is able to subsume the others and provide practical recommendations on how to strengthen each of the studied tools. Finally, our analysis shows that 11%, 12% and 7% of the mutants generated by MUJAVA, MAJOR and PIT, respectively, are equivalent.
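To make the terminology concrete, the following minimal sketch shows an original function, a killable mutant and an equivalent mutant. It is written in Python for brevity, although the studied tools target Java. The function names and the hand-written mutants are illustrative assumptions; they are not taken from the output of MUJAVA, MAJOR or PIT.

```python
def max_original(a, b):
    """Original program: return the larger of two integers."""
    return a if a > b else b


def max_mutant_killable(a, b):
    """Mutant: relational operator '>' replaced by '<'."""
    return a if a < b else b


def max_mutant_equivalent(a, b):
    """Mutant: '>' replaced by '>='.

    The only affected case is a == b, where both branches return the
    same integer value, so the behaviour is unchanged.
    """
    return a if a >= b else b


def kills(mutant, original, tests):
    """A test suite kills a mutant if some input makes the outputs differ."""
    return any(mutant(*t) != original(*t) for t in tests)


# Hypothetical test inputs (pairs of integers).
tests = [(1, 2), (2, 1), (3, 3)]
print(kills(max_mutant_killable, max_original, tests))    # True
print(kills(max_mutant_equivalent, max_original, tests))  # False
```

The second mutant survives every integer test because it cannot change the observable result, which is why equivalent mutants have to be identified by manual analysis.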
Details about the test subjects utilised in this study are given below:
| Test Subject | LoC | Class | Method |
|---|---|---|---|
| Commons-Math | 16,489 | org.apache.commons.math.util.MathUtils | gcd |
| | | org.apache.commons.math.geometry.Vector3D | orthogonal |
| Commons | 17,294 | xorg.apache.commons.lang.ArrayUtils | toMap |
| | | | subarray(Object[],int,int) |
| | | | lastIndexOf(Object[], Object, int) |
| | | xorg.apache.commons.lang.WordUtils | capitalize |
| | | | wrap |
| Pamvotis | 5,505 | pamvotis.core.Simulator | addNode |
| | | | removeNode |
| Triangle | 47 | Triangle | classify |
| XStream | 15,048 | xcom.thoughtworks.xstream.io.xml.XmlFriendlyNameCoder | decodeName |
| Bisect | 37 | Bisect | sqrt |
| Total | 54,420 | - | - |
Details regarding the results of the performed manual analysis are presented below:
Method | MAJOR #Mutants | MAJOR #Equivs | MAJOR #Tests | PIT #Mutants | PIT #Equivs | PIT #Tests | MUJAVA #Mutants | MUJAVA #Equivs | MUJAVA #Tests |
---|---|---|---|---|---|---|---|---|---|
gcd | 133 | 17 | 6 | 79 | 9 | 7 | 237 | 23 | 7 |
orthogonal | 120 | 3 | 8 | 65 | 0 | 8 | 155 | 5 | 9 |
toMap | 23 | 5 | 7 | 50 | 2 | 5 | 32 | 7 | 5 |
subarray(Object[],int,int) | 25 | 5 | 6 | 27 | 3 | 4 | 64 | 8 | 6 |
lastIndexOf(Object[], Object, int) | 29 | 2 | 8 | 43 | 1 | 7 | 81 | 4 | 12 |
capitalize | 37 | 6 | 5 | 42 | 1 | 6 | 69 | 14 | 9 |
wrap | 71 | 8 | 10 | 70 | 4 | 6 | 198 | 19 | 7 |
addNode | 89 | 11 | 8 | 53 | 3 | 8 | 318 | 33 | 34 |
removeNode | 18 | 2 | 5 | 29 | 0 | 3 | 55 | 7 | 6 |
classify | 139 | 7 | 25 | 94 | 1 | 16 | 354 | 38 | 27 |
decodeName | 73 | 24 | 5 | 81 | 16 | 6 | 156 | 28 | 10 |
sqrt | 51 | 4 | 4 | 29 | 3 | 4 | 135 | 17 | 6 |
Total | 808 | 94 | 97 | 662 | 43 | 80 | 1,854 | 203 | 138 |
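The proportion of equivalent mutants per tool follows directly from the "Total" row. The short sketch below recomputes it; the numbers are copied from the table, while the dictionary layout and the helper name `equivalent_percentage` are illustrative assumptions.

```python
# Totals copied from the "Total" row of the manual-analysis table.
totals = {
    "MUJAVA": {"mutants": 1854, "equivalent": 203},
    "MAJOR": {"mutants": 808, "equivalent": 94},
    "PIT": {"mutants": 662, "equivalent": 43},
}


def equivalent_percentage(mutants, equivalent):
    """Share of generated mutants that were judged equivalent, in percent."""
    return 100.0 * equivalent / mutants


for tool, t in totals.items():
    print(f"{tool}: {equivalent_percentage(**t):.1f}% equivalent")

# Number of manually analysed mutants across the three tools.
total_mutants = sum(t["mutants"] for t in totals.values())
print(total_mutants)  # 3324
```

The computed shares are roughly 10.9%, 11.6% and 6.5% for MUJAVA, MAJOR and PIT, in line with the rounded figures reported above, and the mutant counts add up to the 3,324 analysed mutants.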
All the data of this study can be downloaded from the following links: Bisect, Triangle, Commons, Commons-Math, Pamvotis, XStream.
Effort has been put into making this study as replicable as possible. Thus, for each project, we supply:
- the lib/ directory;
- the equivalent.org file (can be opened with any text editor; best viewed with an editor that supports Org-mode).
For each project, we supply scripts that automate the project's compilation, test execution, mutant generation and execution, and the cross-evaluation experiment. These scripts have been tested on GNU/Linux- and UNIX-based machines and can be found in the scripts/ directory of each project. Note that the scripts must be run from within their containing directory. The most important script files are listed below:
Filename | Description |
---|---|
InstantPitAll.sh | Run PIT against its mutation adequate test suite (ALL available mutation operators are used) |
InstantMajor.sh | Same as above, but this time for MAJOR |
majorRunningPitAll.sh | Execute MAJOR's mutants with PIT's mutation adequate test suite |
majorRunningMujava3.sh | Execute MAJOR's mutants with MUJAVA's mutation adequate test suite |
pitAllRunningMajor.sh | Execute PIT's mutants with MAJOR's mutation adequate test suite |
pitAllRunningMujava3.sh | Execute PIT's mutants with MUJAVA's mutation adequate test suite |
CumulativeMajor.sh | Execute MAJOR's mutants with the mutation adequate test suites of all tools |
CumulativePitAll.sh | Same as above, but this time for PIT's mutants |
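The cross-evaluation performed by scripts such as majorRunningPitAll.sh can be pictured as follows: the mutants of one tool are executed against the mutation adequate test suite of another tool, and the fraction of non-equivalent mutants that this suite kills is recorded. The sketch below only illustrates that computation. The kill matrix, the mutant identifiers, the test names and the function `cross_score` are made-up assumptions and are not part of the supplied scripts.

```python
def cross_score(kill_matrix, equivalent, suite):
    """Fraction of killable mutants that `suite` kills.

    kill_matrix: dict mapping mutant id -> set of test names that kill it
    equivalent:  set of mutant ids manually judged equivalent
    suite:       set of test names (e.g. another tool's adequate suite)
    """
    killable = [m for m in kill_matrix if m not in equivalent]
    if not killable:
        return 1.0  # nothing to kill, so the suite is trivially adequate
    killed = [m for m in killable if kill_matrix[m] & suite]
    return len(killed) / len(killable)


# Hypothetical data: four mutants of tool A, one of them equivalent.
tool_a_mutants = {
    "m1": {"t1", "t3"},
    "m2": {"t2"},
    "m3": {"t4"},
    "m4": set(),  # equivalent: no test can kill it
}
tool_a_equivalent = {"m4"}
tool_b_suite = {"t1", "t2"}  # assumed adequate for tool B's own mutants

score = cross_score(tool_a_mutants, tool_a_equivalent, tool_b_suite)
print(f"{score:.2f}")  # 0.67
```

A score below 1.0 means that the suite built for tool B leaves some killable mutants of tool A alive, which on this data would indicate that tool B does not subsume tool A.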