| Field | Value |
| --- | --- |
| Techniques | TCP |
| Applicability | |
| Industry Motivation | |
| Experiment subject(s) | Six industrial datasets (from 89 to 5,555 test cases): Paint Control and IOF/ROL from ABB Robotics Norway; the Google Shared Dataset of Test Suite Results (GSDTSR); and Rails, Mybatis, and Apache Drill, extracted from TravisTorrent |
| Industrial Partner | Industrial, open-source, large scale |
| Programming Language | Unclear |
| Effectiveness Metrics | Average Percentage of Faults Detected (APFD); a computation sketch follows the table |
| Efficiency Metrics | Accuracy/precision/recall |
| Other Metrics | Time/tests to first failure |
| Information Approach | Bloom filter or window-based |
| Algorithm Approach | Reinforcement Learning (RL) |
| Open Challenges | (1) Combine other information features of test-case execution, optimize the calculation method of the dynamic time window, and improve performance on more datasets; (2) combine deep learning algorithms to optimize the RL agent and improve the performance of RL in CI testing |
| Supplementary Material | FALSE |
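For reference, APFD measures how early a prioritized order exposes faults: with n tests, m detected faults, and TF_i the position of the first test that detects fault i, APFD = 1 - (TF_1 + ... + TF_m)/(n*m) + 1/(2n). Below is a minimal Python sketch, assuming (as is common for verdict-only datasets like those above) that each failing test reveals one distinct fault; the function name and signature are illustrative, not from the paper.

```python
def apfd(order, failing):
    """Average Percentage of Faults Detected for one prioritized order.

    order   -- test-case IDs in prioritized execution order
    failing -- IDs of tests that fail in this CI cycle; each failing
               test is treated as detecting one distinct fault (the
               usual simplification when only verdict logs exist)
    """
    n = len(order)
    # 1-based positions at which the failing tests appear in the order
    positions = [i + 1 for i, t in enumerate(order) if t in failing]
    m = len(positions)
    if m == 0 or n == 0:
        return 0.0  # nothing to detect in this cycle
    return 1.0 - sum(positions) / (n * m) + 1.0 / (2.0 * n)


# Example: the two failing tests scheduled first out of five -> APFD = 0.8
print(apfd(["t3", "t1", "t5", "t2", "t4"], {"t3", "t1"}))
```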
Abstract
Continuous Integration (CI) testing is an expensive, time-consuming, and resource-intensive process. Test case prioritization (TCP) can effectively reduce the workload of regression testing in the CI environment. Because TCP in CI testing can be formulated as a sequential decision-making problem, Reinforcement Learning (RL) can be adopted to prioritize test cases effectively. A well-designed reward function is a crucial component of such a system and a critical factor in the learning performance of RL in CI testing. This paper examines how the execution-history information of test cases affects TCP performance in existing RL-based CI testing optimization methods, and proposes a reward function based on a dynamic time window, which uses partial history information dynamically for fast feedback and reduced cost. Experimental studies are carried out on six industrial datasets. The results show that the dynamic-time-window-based reward function significantly improves both the learning efficiency of RL and its fault detection ability compared with a reward function based on a fixed time window.
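The abstract names the technique but not its exact formula. The sketch below illustrates the general idea under stated assumptions: a history-based reward computed from only the most recent verdicts of each test case, with a window length that adapts to recent failure density. The update rule, thresholds, and all identifiers are illustrative assumptions, not the paper's actual definitions.

```python
from collections import deque

def adapt_window(window, recent_failure_rate,
                 lo=0.1, hi=0.3, min_len=2, max_len=12):
    """Illustrative dynamic-window rule: widen the window when failures
    are frequent (older cycles still carry useful signal), shrink it
    when the suite is mostly passing (old verdicts go stale quickly)."""
    if recent_failure_rate > hi:
        return min(window + 1, max_len)
    if recent_failure_rate < lo:
        return max(window - 1, min_len)
    return window

def reward(history, window):
    """History-based reward for one test case: the failure fraction over
    only the last `window` executions (verdicts: 1 = failed, 0 = passed),
    giving the RL agent fast feedback from partial, recent information
    instead of the full execution log."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent) if recent else 0.0

# Example: a test that failed often in its recent runs gets a high reward.
history = deque([0, 0, 1, 0, 1, 1, 1], maxlen=100)
w = adapt_window(window=4, recent_failure_rate=0.4)  # failures spiking -> widen to 5
print(reward(history, w))  # 0.8 over the last 5 verdicts
```

Against a fixed window, the intended effect matches the abstract's claim: rewards are computed from fewer, fresher verdicts, so feedback is cheaper to produce and quicker to track shifts in failure behaviour.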