H. Khan, A. Elgabli, S. Samarakoon, M. Bennis and C. S. Hong, "Reinforcement Learning-Based Vehicle-Cell Association Algorithm for Highly Mobile Millimeter Wave Communication," in IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 4, pp. 1073-1085, Dec. 2019. doi: 10.1109/TCCN.2019.2941191
Reinforcement learning based vehicle-cell association algorithm for highly mobile millimeter wave communication
|Author:||Khan, Hamza1; Elgabli, Anis1; Samarakoon, Sumudu1;|
1Centre for Wireless Communications, University of Oulu, 90014 Oulu, Finland
2Department of Computer Science and Engineering, Kyung Hee University, Seoul 17104, South Korea
3Department of Computer Science and Engineering, Kyung Hee University, Seoul 17104, South K
|Persistent link:||http://urn.fi/urn:nbn:fi-fe202002195811|
|Publisher:||IEEE Communications Society|
|Publish Date:||2020-02-19|
Vehicle-to-everything (V2X) communication is a growing area of communication with a variety of use cases. This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks. The aim is to maximize the time-average rate per vehicular user (VUE) while ensuring a target minimum rate for all VUEs with low signaling overhead. We first formulate the user (vehicle) association problem as a discrete non-convex optimization problem. Then, by leveraging tools from machine learning, specifically distributed deep reinforcement learning (DDRL) and the asynchronous advantage actor-critic (A3C) algorithm, we propose a low-complexity algorithm that approximates the solution of the proposed optimization problem. The proposed DDRL-based algorithm endows every roadside unit (RSU) with a local RL agent that selects a local action based on the observed input state. The actions of the different RSUs are forwarded to a central entity that computes a global reward, which is then fed back to the RSUs. It is shown that each independently trained RL agent performs the vehicle-RSU association action with low control overhead and lower computational complexity than running a complex online algorithm to solve the non-convex optimization problem. Finally, simulation results show that the proposed solution achieves gains of up to 15% in sum rate and a 20% reduction in VUE outages compared to several baseline designs.
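The feedback loop described in the abstract (local action at each RSU, global reward computed centrally) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `local_action` and `global_reward`, the rate table, the rule that a VUE chosen by several RSUs keeps its best link, and the reward shape (sum rate minus a penalty per VUE below the target minimum rate). A greedy choice stands in for the trained actor network.

```python
from typing import Dict, List

# All names and the reward shape below are hypothetical, chosen for illustration.
# rate_table[rsu][vue] is an assumed achievable rate between an RSU and a VUE.
RateTable = Dict[str, Dict[str, float]]


def local_action(observed_rates: Dict[str, float]) -> str:
    """Stand-in for a trained local policy: pick the VUE with the best observed rate."""
    return max(observed_rates, key=observed_rates.get)


def global_reward(
    actions: Dict[str, str],
    rate_table: RateTable,
    vues: List[str],
    min_rate: float,
    penalty: float = 1.0,
) -> float:
    """Central entity: sum rate minus a penalty for each VUE below min_rate."""
    # Assumption: a VUE picked by several RSUs keeps its best link;
    # a VUE picked by no RSU has rate 0.
    rates = {vue: 0.0 for vue in vues}
    for rsu, vue in actions.items():
        rates[vue] = max(rates[vue], rate_table[rsu][vue])
    outages = sum(1 for rate in rates.values() if rate < min_rate)
    return sum(rates.values()) - penalty * outages


# Toy scenario: two RSUs, three VUEs; v2 is out of range of both RSUs.
rate_table: RateTable = {
    "rsu0": {"v0": 2.0, "v1": 1.0},
    "rsu1": {"v0": 0.5, "v1": 3.0},
}
vues = ["v0", "v1", "v2"]

# Each RSU acts on its own observation only; the reward is shared by all agents.
actions = {rsu: local_action(observed) for rsu, observed in rate_table.items()}
reward = global_reward(actions, rate_table, vues, min_rate=1.0)
```

In an actual DDRL setup, the scalar `reward` would be fed back to every RSU agent to update its policy; the sketch stops at computing it.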
|Journal:||IEEE Transactions on Cognitive Communications and Networking|
|Pages:||1073 - 1085|
|Type of Publication:||A1 Journal article – refereed|
|Field of Science:||213 Electronic, automation and communications engineering, electronics|
This research was supported by the Kvantum institute strategic project SAFARI, CARMA, MISSION, NOOR, SMARTER, High5 project number 2192/31/2016 funded by Business Finland, Bittium, Keysight, Kyynel, MediaTek, Nokia, University of Oulu and the Academy of Finland 6Genesis Flagship project under grant 318927.
|Academy of Finland Grant Number:||318927 (Academy of Finland Funding decision)|
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.