Metrics

The table below lists metrics used to evaluate the techniques in the legend. For further information, please read the About and Contribute pages.

Legend:

  • TCP: Test Case Prioritization
  • TCS: Test Case Selection
  • TSR: Test Suite Reduction
  • TSA: Test Suite Augmentation
| Category | Name | Description |
| --- | --- | --- |
| Effectiveness | Selection/reduction count/percentage | Absolute or relative size of the resulting test suite compared to the original. |
| Effectiveness | Average Percentage of Faults Detected (APFD) | A measure of how quickly a test suite detects faults, on average. Includes many variations, such as APFDc and NAPFD. |
| Effectiveness | Testing time | Time required to execute the prioritized/selected/reduced test suite as opposed to the original suite. |
| Effectiveness | Accuracy/precision/recall | Measures of correctness and completeness of the resulting test suite (e.g., counts of false positives and false negatives). |
| Effectiveness | Fault Detection Capability | Number or proportion of faults detected by the resulting suite compared to the original. |
| Effectiveness | Fault Detection Rate (FDR) | Time to detect faults compared to the optimal RT suite. |
| Effectiveness | Coverage Effectiveness (CE) | Measure of the tradeoff between the cost of the test suite and structural coverage of the SUT. |
| Effectiveness | Time/tests to first failure | Number of tests or amount of time needed to reach the first failure. |
| Effectiveness | Fault detection within a budget | Faults still detected when restricting the testing time budget. |
| Effectiveness | Cost-benefit model | Mathematical models considering the costs and benefits of applying a technique throughout development. |
| Effectiveness | Fault Detection Loss | Number or proportion of faults undetected by the selected/reduced test suite compared to the original. |
| Effectiveness | Comparison to expert | Compares the output of the tool with a list of tests selected by the project architect. |
| Effectiveness | Faults per tests/time | Number of faults detected per number of tests or per unit of testing time. |
| Effectiveness | Number of tests added | Number of tests added to the test suite. |
| Effectiveness | Algorithm performance measures | Fitness value or hypervolume metrics applied to search-based algorithms. |
| Effectiveness | Accumulated regression risk | How much of the "regression risk" is covered by the tests. |
| Effectiveness | Root-mean-square error (RMSE) | Compares the predicted and observed results. |
| Effectiveness | Rank Percentile Average (RPA) | Comparison between the predicted ranking and the actual ranking (from the dataset). |
| Effectiveness | Most Likely Relative Position (MRP) | Average position of the first failed test, i.e., an estimate of how long the suite takes to find the first fault. |
| Effectiveness | Savings Factor | Mapping of APFD to dollar savings. |
| Efficiency | Execution time | Time required to run the tool (e.g., selection time, prioritization time). |
| Efficiency | Total/end-to-end time | End-to-end time, combining measuring time, execution time, and testing time. As such, it measures both efficiency and effectiveness. |
| Efficiency | Memory usage | Amount of memory used by the tool. |
| Efficiency | Scalability | How well the tool performs on subjects of different sizes. |
| Efficiency | Measuring time/cost | How costly the information needed by the technique is to obtain (e.g., compiling tests, collecting coverage, training a model). |
| Other | Applicability/Generality | The variety of SUTs upon which the tool can be applied. |
| Other | Diagnosability | Cost of diagnosing a fault upon detection. |
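As a concrete illustration of the APFD metric listed above, here is a minimal sketch of the standard formula, APFD = 1 − (TF₁ + … + TFₘ)/(n·m) + 1/(2n), where n is the number of tests, m the number of faults, and TFᵢ the position of the first test that reveals fault i. The input format (a fault-to-tests mapping) is a hypothetical choice for this sketch, not something prescribed by any particular tool.

```python
def apfd(order, fault_matrix):
    """Average Percentage of Faults Detected for a prioritized suite.

    order: list of test IDs in prioritized execution order.
    fault_matrix: dict mapping each fault ID to the set of test IDs
                  that detect it (hypothetical input format).
    """
    n = len(order)
    m = len(fault_matrix)
    position = {t: i + 1 for i, t in enumerate(order)}  # 1-based positions
    # TF_i: position of the first test in the ordering that reveals fault i
    tf = [min(position[t] for t in tests if t in position)
          for tests in fault_matrix.values()]
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)


# Example: 5 tests; fault f1 is first caught by the 3rd test, f2 by the 1st.
score = apfd(["t1", "t2", "t3", "t4", "t5"],
             {"f1": {"t3", "t5"}, "f2": {"t1"}})
# score = 1 - (3 + 1)/(5*2) + 1/10 = 0.7
```

Variants such as APFDc weight each term by test cost and fault severity; NAPFD adjusts for orderings that do not detect all faults.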
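The precision/recall row can likewise be made concrete. For test case selection, precision is the fraction of selected tests that are actually relevant (e.g., fault-revealing or modification-traversing), and recall is the fraction of relevant tests that were selected; a sketch, assuming both are given as sets of test IDs:

```python
def selection_precision_recall(selected, relevant):
    """Precision and recall of a test selection against a ground-truth set.

    selected: set of test IDs chosen by the technique.
    relevant: set of test IDs that should have been chosen (ground truth).
    """
    tp = len(selected & relevant)  # true positives: relevant tests selected
    precision = tp / len(selected) if selected else 1.0
    recall = tp / len(relevant) if relevant else 1.0
    return precision, recall


# Example: one irrelevant test selected (false positive),
# one relevant test missed (false negative).
p, r = selection_precision_recall({"a", "b", "c"}, {"b", "c", "d"})
# p = r = 2/3
```

A false positive wastes testing time; a false negative is a fault-detection loss, which is why the table tracks both directions.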