| Techniques | Applicability |
| --- | --- |
| TCP | Industry Motivation |

| Experiment subject(s) | Industrial Partner | Programming Language |
| --- | --- | --- |
| Major open-source multi language projects, including Firefox (480575 TCs) and SQLite (787530 TCs), plus SIR; Open-source, very large scale; Research dataset, very large scale | | C, C++, Ada, Java, sh, Perl, Lisp |

| Effectiveness Metrics | Efficiency Metrics | Other Metrics |
| --- | --- | --- |
| Time/tests To First Failure | | Applicability/Generality |

| Information Approach | Algorithm Approach | Open Challenges |
| --- | --- | --- |
| | Similarity / distance-based | More subject programs; leverage structural information of test suites; perform empirical investigation of obs. I (neighboring tests tend to be related/similar) |

Abstract
Existing test case prioritization (TCP) techniques have limitations when
applied to real-world projects, because these techniques require
certain information to be made available before they can be applied. For
example, the family of input-based TCP techniques is based on test
case values or test script strings; other techniques use test coverage,
test history, program structure, or requirements information. Existing
techniques also cannot be guaranteed to always be more effective than random
prioritization (RP), which has no preconditions. As a result,
RP remains the most applicable and most fundamental TCP technique. This
article proposes an extremely simple, effective, and efficient way to
prioritize test cases through the introduction of a dispersity metric.
Our technique is as applicable as RP. We conduct empirical studies using
43 different versions of 15 real-world projects. Empirical results show
that our technique is more effective than RP. Our algorithm has a
linear computational complexity and, therefore, provides a practical
solution to the problem of prioritizing very large test suites (such as
those containing hundreds of thousands, or millions, of test cases),
where the execution time of conventional nonlinear prioritization
algorithms can be prohibitive. Our technique also provides a practical
solution to TCP when neither input-based nor execution-based techniques
are applicable due to lack of information.