Deep Reinforcement Learning Enabled Adaptive Virtual Machine Migration Control in Multi-Stage Information Processing Systems

Fukushima, Yukinobu; Koujitani, Yuki; Nakane, Kazutoshi; Tarutani, Yuya; Wu, Celimuge; Ji, Yusheng; Yokohira, Tokumi; Murase, Tutomu

International Journal On Advances in Networks and Services, volume 17, numbers 3 and 4, 2024

Authors:
Yukinobu Fukushima
Yuki Koujitani
Kazutoshi Nakane
Yuya Tarutani
Celimuge Wu
Yusheng Ji
Tokumi Yokohira
Tutomu Murase

Keywords: Multi-stage information processing system; VM migration control; Deep reinforcement learning; Deep Deterministic Policy Gradient (DDPG)

Abstract:
This paper addresses a Virtual Machine (VM) migration control problem whose goal is to maximize the progress (accuracy) of information processing tasks in multi-stage information processing systems. Conventional methods for this problem are effective only in specific situations, such as when the system load is high. To achieve high accuracy adaptively across various situations, we propose a VM migration method that uses a Deep Reinforcement Learning (DRL) algorithm. Applying a DRL algorithm directly to the VM migration control problem is difficult: the size of the problem's solution space changes dynamically with the number of VMs in the system, whereas the size of the agent’s action space is fixed in DRL algorithms. To cope with this difficulty, the proposed method divides the VM migration control problem into two subproblems: determining only the VM distribution (i.e., the proportion of VMs deployed on each edge server), and determining the locations of all the VMs so that the placement follows the determined VM distribution. The former subproblem is solved by a DRL algorithm, and the latter by a heuristic method. This decomposition makes a DRL algorithm applicable because the VM distribution is expressed as a vector with a fixed number of dimensions, which the agent can output directly. Simulation results confirm that the proposed method adaptively achieves quasi-optimal accuracy across different link delays, types of information processing tasks, and numbers of VMs.
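The two-stage split described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the heuristic used in the paper, so everything here is an assumption made for illustration: the function names `target_counts` and `place_vms` are hypothetical, the distribution vector is assumed to hold non-negative proportions (one per edge server), largest-remainder rounding is a swapped-in way to turn proportions into integer VM counts, and the placement step simply keeps VMs where they are when possible. The DRL agent itself is not modeled; its output is represented by a plain list.

```python
from typing import Dict, List


def target_counts(distribution: List[float], num_vms: int) -> List[int]:
    """Convert per-server proportions into integer VM counts.

    Uses largest-remainder rounding (an assumption, not the paper's method)
    so that the counts always sum to num_vms.
    """
    total = sum(distribution)
    if total <= 0:
        raise ValueError("distribution must contain a positive entry")
    quotas = [p / total * num_vms for p in distribution]
    counts = [int(q) for q in quotas]  # floor, since quotas are non-negative
    leftover = num_vms - sum(counts)
    # Hand the remaining VMs to the servers with the largest fractional parts.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def place_vms(current: Dict[str, int], distribution: List[float]) -> Dict[str, int]:
    """Assign every VM to a server so that counts follow the distribution.

    `current` maps a VM name to its current server index. VMs stay in place
    while their server is under its target; the rest are moved to servers
    that are still short. This is a hypothetical stand-in for the heuristic.
    """
    counts = target_counts(distribution, len(current))
    load = [0] * len(counts)
    placement: Dict[str, int] = {}
    surplus: List[str] = []
    for vm, server in current.items():
        if load[server] < counts[server]:
            placement[vm] = server  # no migration needed
            load[server] += 1
        else:
            surplus.append(vm)  # server is over target, VM must move
    for server in range(len(counts)):
        while load[server] < counts[server]:
            placement[surplus.pop()] = server
            load[server] += 1
    return placement


if __name__ == "__main__":
    # Fixed-size "action" for 3 edge servers, applied to 4 VMs on server 0.
    action = [0.5, 0.25, 0.25]
    vms = {"vm0": 0, "vm1": 0, "vm2": 0, "vm3": 0}
    print(place_vms(vms, action))
```

The point of the sketch is that the length of `action` depends only on the number of edge servers, so it stays fixed as VMs arrive and leave, while `place_vms` absorbs the variable number of VMs.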

Pages: 116 to 125

Publication date: December 30, 2024

Published in: journal

ISSN: 1942-2644