Beating Random Test Case Prioritization

Existing test case prioritization (TCP) techniques have limitations when applied to real-world projects, because they require certain information to be available before they can be applied. For example, the family of input-based TCP techniques is based on test case values or test script strings; other techniques use test coverage, test history, program structure, or requirements information. Existing techniques also cannot guarantee that they will always be more effective than random prioritization (RP), which has no preconditions. As a result, RP remains the most applicable and most fundamental TCP technique. This article proposes an extremely simple, effective, and efficient way to prioritize test cases through the introduction of a dispersity metric. Our technique is as applicable as RP. We conduct empirical studies using 43 different versions of 15 real-world projects. Empirical results show that our technique is more effective than RP. Our algorithm has linear computational complexity and therefore provides a practical solution to the problem of prioritizing very large test suites (such as those containing hundreds of thousands, or millions, of test cases), where the execution time of conventional nonlinear prioritization algorithms can be prohibitive. Our technique also provides a practical solution to TCP when neither input-based nor execution-based techniques are applicable due to lack of information.

Year	Authors	Publisher	DOI
2020	Zhou, Zhi Quan; Liu, Chen; Chen, Tsong Yueh; Tse, T. H.; Susilo, Willy	IEEE	10.1109/TR.2020.2979815

Techniques	Applicability
TCP	Industry Motivation

Experiment subject(s)	Industrial Partner	Programming Language
Major open-source multi-language projects, including Firefox (480575 TCs) and SQLite (787530 TCs), plus SIR; open-source, very large scale; research dataset, very large scale		C, C++, Ada, Java, sh, Perl, Lisp

Effectiveness Metrics	Efficiency Metrics	Other Metrics
Time/tests To First Failure		Applicability/Generality

Information Approach	Algorithm Approach	Open Challenges
	Similarity / distance-based	More subject programs; leverage structural information of test suites; empirically investigate Observation I (neighboring tests tend to be related or similar)

Abstract