Metrics

The table below lists metrics used to evaluate the techniques in the legend. For further information, please read the About and Contribute pages.

Legend:

  • TCP: Test Case Prioritization
  • TCS: Test Case Selection
  • TSR: Test Suite Reduction
  • TSA: Test Suite Augmentation
| Category | Name | Description |
| --- | --- | --- |
| Effectiveness | Selection/reduction count/percentage | Absolute or relative size of the resulting test suite compared to the original. |
| Effectiveness | Average Percentage of Faults Detected (APFD) | A measure of how quickly a test suite detects faults, on average. Includes many variations, such as APFDc and NAPFD. |
| Effectiveness | Testing time | Time required to execute the prioritized/selected/reduced test suite as opposed to the original suite. |
| Effectiveness | Accuracy/precision/recall | Measures of correctness and completeness of the resulting test suite (e.g., counts of false positives and false negatives). |
| Effectiveness | Fault Detection Capability | Number or proportion of faults detected by the resulting suite compared to the original. |
| Effectiveness | Fault Detection Rate (FDR) | Time to detect faults compared to the optimal RT suite. |
| Effectiveness | Coverage Effectiveness (CE) | Measure of the tradeoff between the cost of the test suite and structural coverage of the SUT. |
| Effectiveness | Time/tests to first failure | Number of tests or amount of time needed to reach the first failure. |
| Effectiveness | Fault detection within a budget | Faults still detected when restricting the testing time budget. |
| Effectiveness | Cost-benefit model | Mathematical models considering the costs and benefits of applying a technique throughout development. |
| Effectiveness | Fault Detection Loss | Number or proportion of faults undetected by the selected/reduced test suite compared to the original. |
| Effectiveness | Comparison to expert | Compares the output of the tool with a list of tests selected by the project architect. |
| Effectiveness | Faults per tests/time | Number of faults detected per number of tests or per unit of testing time. |
| Effectiveness | Number of tests added | Number of tests added to the test suite. |
| Effectiveness | Algorithm performance measures | Fitness value or hypervolume metrics applied to search-based algorithms. |
| Effectiveness | Accumulated regression risk | How much of the "regression risk" is covered by the tests. |
| Effectiveness | Root-mean-square error (RMSE) | Compares the predicted and observed results. |
| Effectiveness | Rank Percentile Average (RPA) | Comparison between the predicted ranking and the actual ranking (from the dataset). |
| Effectiveness | Most Likely Relative Position (MRP) | Average position of the first failed test, i.e., an estimate of how long the suite takes to find the first fault. |
| Effectiveness | Savings Factor | Mapping of APFD to dollar savings. |
| Efficiency | Execution time | Time required to run the tool (e.g., selection time, prioritization time). |
| Efficiency | Total/end-to-end time | End-to-end time, combining measuring time, execution time, and testing time. As such, it measures both efficiency and effectiveness. |
| Efficiency | Memory usage | Amount of memory used by the tool. |
| Efficiency | Scalability | How well the tool performs on subjects of different sizes. |
| Efficiency | Measuring time/cost | How costly the information needed by the technique is to obtain (e.g., compiling tests, collecting coverage, training a model). |
| Other | Applicability/Generality | The variety of SUTs upon which the tool can be applied. |
| Other | Diagnosability | Cost of diagnosing a fault upon detection. |
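As a concrete illustration of the APFD metric listed above, here is a minimal sketch of the standard formula, APFD = 1 − (TF₁ + … + TFₘ)/(n·m) + 1/(2n), where n is the number of tests, m the number of faults, and TFᵢ the position of the first test that reveals fault i. The input format (a fault-to-tests mapping) is a hypothetical choice for this sketch, not something prescribed by any particular tool.

```python
def apfd(order, fault_matrix):
    """Average Percentage of Faults Detected for a prioritized suite.

    order: list of test IDs in prioritized execution order.
    fault_matrix: dict mapping each fault ID to the set of test IDs
                  that detect it (hypothetical input format).
    """
    n = len(order)
    m = len(fault_matrix)
    position = {t: i + 1 for i, t in enumerate(order)}  # 1-based positions
    # TF_i: position of the first test in the ordering that reveals fault i
    tf = [min(position[t] for t in tests if t in position)
          for tests in fault_matrix.values()]
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)


# Example: 5 tests; fault f1 is first caught by the 3rd test, f2 by the 1st.
score = apfd(["t1", "t2", "t3", "t4", "t5"],
             {"f1": {"t3", "t5"}, "f2": {"t1"}})
# score = 1 - (3 + 1)/(5*2) + 1/10 = 0.7
```

Variants such as APFDc weight each term by test cost and fault severity; NAPFD adjusts for orderings that do not detect all faults.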
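The precision/recall row can likewise be made concrete. For test case selection, precision is the fraction of selected tests that are actually relevant (e.g., fault-revealing or modification-traversing), and recall is the fraction of relevant tests that were selected; a sketch, assuming both are given as sets of test IDs:

```python
def selection_precision_recall(selected, relevant):
    """Precision and recall of a test selection against a ground-truth set.

    selected: set of test IDs chosen by the technique.
    relevant: set of test IDs that should have been chosen (ground truth).
    """
    tp = len(selected & relevant)  # true positives: relevant tests selected
    precision = tp / len(selected) if selected else 1.0
    recall = tp / len(relevant) if relevant else 1.0
    return precision, recall


# Example: one irrelevant test selected (false positive),
# one relevant test missed (false negative).
p, r = selection_precision_recall({"a", "b", "c"}, {"b", "c", "d"})
# p = r = 2/3
```

A false positive wastes testing time; a false negative is a fault-detection loss, which is why the table tracks both directions.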