Detecting Trivial Mutant Equivalences via Compiler Optimisations

Preprint

Overview

Mutation testing realises the idea of fault-based testing, i.e., using artificial defects to guide the testing process. It is used to evaluate the adequacy of test suites and to guide test case generation. It is a potentially powerful form of testing, but it is well known that its effectiveness is inhibited by the presence of equivalent mutants. We recently studied Trivial Compiler Equivalence (TCE) as a simple, fast and readily applicable technique for identifying equivalent mutants in C programs. In the present work, we augment our findings with further results for the Java programming language. TCE can remove a large portion of all mutants because it determines them to be either equivalent to the original program or duplicates of other mutants. In particular, TCE equivalent mutants account for 7.4% and 5.7% of all C and Java mutants, respectively, while duplicated mutants account for a further 21% of all C mutants and 5.4% of all Java mutants, on average. With respect to a benchmark ground truth suite of known equivalent mutants, approximately 30% (for C) and 54% (for Java) of the equivalent mutants are TCE equivalent. It is unsurprising that the results differ between languages, since mutation characteristics are language-dependent. In the case of Java, our new results suggest that TCE may be particularly effective, finding roughly half of all equivalent mutants.
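As a rough illustration of the idea behind TCE, the sketch below groups compiled artefacts by hash. A mutant whose compiled output is identical to that of the original program is flagged as TCE equivalent, and mutants whose compiled output is identical to one another are flagged as duplicates. It assumes that the comparison is made on the compiler's output for each program version. The function name `classify_mutants` and the byte strings standing in for compiled code are hypothetical and are not taken from the supporting tool.

```python
import hashlib
from collections import defaultdict


def classify_mutants(original, mutants):
    """Split mutants into TCE-equivalent, duplicate and remaining ones.

    `original` (bytes) and the values of `mutants` (name -> bytes) stand for
    compiled artefacts produced with the same compiler and settings.
    This is an illustrative assumption, not the authors' implementation.
    """
    def digest(blob):
        return hashlib.sha256(blob).hexdigest()

    original_digest = digest(original)
    equivalent = []
    groups = defaultdict(list)  # digest -> mutant names with that output
    for name, blob in mutants.items():
        d = digest(blob)
        if d == original_digest:
            equivalent.append(name)  # same compiled output as the original
        else:
            groups[d].append(name)

    duplicates, remaining = [], []
    for names in groups.values():
        remaining.append(names[0])    # keep one representative per group
        duplicates.extend(names[1:])  # the rest duplicate that representative
    return equivalent, duplicates, remaining


# Hypothetical compiled outputs: m1 matches the original, m2 and m3 match each other.
original = b"\x01\x02\x03"
mutants = {"m1": b"\x01\x02\x03", "m2": b"\x09", "m3": b"\x09", "m4": b"\x07"}
equivalent, duplicates, remaining = classify_mutants(original, mutants)
print(equivalent, duplicates, remaining)
```

Only the mutants in `remaining` would need to be executed against the test suite under this scheme. Mutants that are equivalent but compile to different output are not detected, which is consistent with TCE finding only a share of the known equivalent mutants.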

Results for the Java programming language

Subjects

Two sets of test subjects were used for Java, as in the C experiments: the first consists of 6 real-world programs used in the large-scale experiment of this study, and the second comprises 6 test subjects from which 1,542 mutants were manually analysed.

Large-scale experiment

Details regarding the test subjects of the large-scale experiment are presented in the following table. Their source code can be downloaded from the following links: Commons-Math-1.2, Commons-Lang-2.4, BCEL-5.2, Joda-Time-2.4, H2-1.0.79, Ant-1.8.4.

Java test subjects’ details: "LoC" shows the source code lines of the projects; "Package" and "Class-Size" present the packages of the considered classes and their size; the "Methods" and "Mutants" columns show the number of methods and the corresponding number of generated mutants.

| Program          | LoC     | Package                              | Class-Size | Methods | Mutants |
|------------------|---------|--------------------------------------|------------|---------|---------|
| Commons-Math-1.2 | 16,753  | org.apache.commons.math.ode          | 951        | 34      | 5,868   |
|                  |         | org.apache.commons.math.analysis     | 429        | 16      | 2,861   |
|                  |         | org.apache.commons.math.linear       | 1,294      | 119     | 4,962   |
|                  |         | org.apache.commons.math.distribution | 244        | 32      | 546     |
| Commons-Lang-2.4 | 18,168  | org.apache.commons.lang              | 4,008      | 350     | 10,371  |
|                  |         | org.apache.commons.lang.builder      | 967        | 130     | 1,661   |
|                  |         | org.apache.commons.lang.text         | 1,915      | 237     | 5,983   |
|                  |         | org.apache.commons.lang.math         | 1,247      | 104     | 3,999   |
| BCEL-5.2         | 23,726  | org.apache.bcel.generic              | 1,658      | 145     | 2,514   |
|                  |         | org.apache.bcel.classfile            | 897        | 112     | 1,065   |
|                  |         | org.apache.bcel.verifier.structurals | 2,599      | 351     | 1,711   |
|                  |         | org.apache.bcel.util                 | 974        | 39      | 1,666   |
| Joda-Time-2.4    | 28,255  | org.joda.time                        | 1,858      | 302     | 2,840   |
|                  |         | org.joda.time.format                 | 1,091      | 90      | 2,247   |
|                  |         | org.joda.time.chrono                 | 487        | 59      | 1,723   |
|                  |         | org.joda.time.tz                     | 353        | 36      | 310     |
| H2-1.0.79        | 72,359  | org.h2.jdbc                          | 4,324      | 476     | 3,248   |
|                  |         | org.h2.command                       | 4,707      | 163     | 4,666   |
|                  |         | org.h2.expression                    | 1,130      | 55      | 1,774   |
|                  |         | org.h2.tools                         | 2,177      | 277     | 3,058   |
| Ant-1.8.4        | 104,479 | org.apache.tools.ant.taskdefs        | 2,035      | 177     | 1,194   |
|                  |         | org.apache.tools.ant                 | 2,349      | 163     | 1,635   |
|                  |         | org.apache.tools.ant.types           | 1,354      | 91      | 583     |
|                  |         | org.apache.tools.util                | 1,388      | 110     | 1,698   |
| Total            | 263,740 |                                      | 40,436     | 3,668   | 68,183  |
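The totals in the last row can be re-derived from the per-package rows. The following sketch sums class size, methods and mutants per program and overall. The values are copied from the table above, and the variable names are chosen only for illustration.

```python
from collections import defaultdict

# (program, class size, methods, mutants) for each package row of the table.
rows = [
    ("Commons-Math-1.2", 951, 34, 5868),
    ("Commons-Math-1.2", 429, 16, 2861),
    ("Commons-Math-1.2", 1294, 119, 4962),
    ("Commons-Math-1.2", 244, 32, 546),
    ("Commons-Lang-2.4", 4008, 350, 10371),
    ("Commons-Lang-2.4", 967, 130, 1661),
    ("Commons-Lang-2.4", 1915, 237, 5983),
    ("Commons-Lang-2.4", 1247, 104, 3999),
    ("BCEL-5.2", 1658, 145, 2514),
    ("BCEL-5.2", 897, 112, 1065),
    ("BCEL-5.2", 2599, 351, 1711),
    ("BCEL-5.2", 974, 39, 1666),
    ("Joda-Time-2.4", 1858, 302, 2840),
    ("Joda-Time-2.4", 1091, 90, 2247),
    ("Joda-Time-2.4", 487, 59, 1723),
    ("Joda-Time-2.4", 353, 36, 310),
    ("H2-1.0.79", 4324, 476, 3248),
    ("H2-1.0.79", 4707, 163, 4666),
    ("H2-1.0.79", 1130, 55, 1774),
    ("H2-1.0.79", 2177, 277, 3058),
    ("Ant-1.8.4", 2035, 177, 1194),
    ("Ant-1.8.4", 2349, 163, 1635),
    ("Ant-1.8.4", 1354, 91, 583),
    ("Ant-1.8.4", 1388, 110, 1698),
]

# Accumulate [class size, methods, mutants] per program.
per_program = defaultdict(lambda: [0, 0, 0])
for program, size, methods, mutants in rows:
    totals = per_program[program]
    totals[0] += size
    totals[1] += methods
    totals[2] += mutants

# Column-wise sums over all programs.
grand_total = [sum(column) for column in zip(*per_program.values())]

for program, (size, methods, mutants) in per_program.items():
    print(f"{program}: class size {size:,}, methods {methods:,}, mutants {mutants:,}")
print("Total:", grand_total)
```

The column sums agree with the "Total" row of the table: 40,436 for class size, 3,668 methods and 68,183 mutants.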

Manually-analysed test subjects

Details regarding the manually-analysed test subjects are presented in the following table. Their source code can be downloaded from the following links: Bisect, Commons-Lang, Joda-Time, Pamvotis, Triangle, XStream.

Manually-analysed Java test subjects’ details: Columns "Program" and "Method" present the examined programs and the considered methods; "Mutants" shows the number of the generated mutants and "Equivalent" the number of the manually-identified equivalent mutants.

| Program      | Method     | Mutants | Equivalent |
|--------------|------------|---------|------------|
| Bisect       | sqrt       | 135     | 17         |
| Commons-Lang | capitalize | 69      | 14         |
|              | wrap       | 198     | 19         |
| Joda-Time    | add        | 257     | 37         |
| Pamvotis     | addNode    | 318     | 33         |
|              | removeNode | 55      | 7          |
| Triangle     | classify   | 354     | 40         |
| XStream      | decodeName | 156     | 29         |
| Total        |            | 1,542   | 196        |
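To put these counts in proportion, the sketch below computes the share of manually-identified equivalent mutants per method and overall. The figures are copied from the table above. The variable names are illustrative, and the percentages are derived here rather than reported in the original table.

```python
# (program, method, generated mutants, manually-identified equivalent mutants)
subjects = [
    ("Bisect", "sqrt", 135, 17),
    ("Commons-Lang", "capitalize", 69, 14),
    ("Commons-Lang", "wrap", 198, 19),
    ("Joda-Time", "add", 257, 37),
    ("Pamvotis", "addNode", 318, 33),
    ("Pamvotis", "removeNode", 55, 7),
    ("Triangle", "classify", 354, 40),
    ("XStream", "decodeName", 156, 29),
]

# Share of equivalent mutants for each method.
for program, method, mutants, equivalent in subjects:
    print(f"{program}.{method}: {equivalent / mutants:.1%} equivalent")

# Overall share across all manually-analysed mutants.
total_mutants = sum(mutants for _, _, mutants, _ in subjects)
total_equivalent = sum(equivalent for _, _, _, equivalent in subjects)
overall_ratio = total_equivalent / total_mutants
print(f"Overall: {total_equivalent} of {total_mutants} ({overall_ratio:.1%})")
```

Across the manually-analysed set, 196 of the 1,542 mutants are equivalent, which is roughly one mutant in eight.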

Manual Analysis

The results of the manual analysis can be found here.

Supporting Tool

The supporting tool, along with usage details and examples, can be found here.

Results for the C programming language

The test subjects, tools and results for the C programming language are available here.