University of Oulu

X. Chen, Z. Han, H. Zhang, G. Xue, Y. Xiao and M. Bennis, "Wireless Resource Scheduling in Virtualized Radio Access Networks Using Stochastic Learning," in IEEE Transactions on Mobile Computing, vol. 17, no. 4, pp. 961-974, 1 April 2018. doi: 10.1109/TMC.2017.2742949

Wireless resource scheduling in virtualized radio access networks using stochastic learning

Author: Chen, Xianfu (1); Han, Zhu (2, 3); Zhang, Honggang (4);
Organizations: (1) VTT Technical Research Centre of Finland Ltd, Oulu 90571, Finland
(2) Department of Electrical and Computer Engineering and the Department of Computer Science, University of Houston, Houston, TX 77004
(3) Department of Computer Science and Engineering, Kyung Hee University, Seoul 02447, South Korea
(4) College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
(5) Ira A. Fulton Schools of Engineering, Arizona State University, Tempe, AZ 85281
(6) Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721
(7) Centre for Wireless Communications, University of Oulu, Oulu 90014, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 1.1 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2018
Publish Date: 2018-08-08


Allocating limited wireless resources in dense radio access networks (RANs) remains challenging. By leveraging a software-defined control plane, the independent base stations (BSs) are virtualized as a centralized network controller (CNC). Such virtualization decouples the CNC from the wireless service providers (WSPs). We investigate a virtualized RAN, where the CNC auctions channels at the beginning of scheduling slots to the mobile terminals (MTs) based on bids from their subscribing WSPs. Each WSP aims to maximize its expected long-term payoff from bidding for channels so that its MTs can transmit packets. We formulate the problem as a stochastic game, where the channel auction and packet scheduling decisions of a WSP depend on the state of the network and the control policies of its competitors. To approach the equilibrium solution, we propose an abstract stochastic game with bounded regret. The decision-making process of each WSP is modeled as a Markov decision process (MDP). To address the signalling overhead and computational complexity issues, we decompose the MDP into a series of single-agent MDPs with reduced state spaces, and derive an online localized algorithm to learn the state value functions. Our results show significant performance improvements in terms of per-MT average utility.
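
The abstract refers to learning state value functions online. As a rough illustration of that general idea only, the sketch below uses plain tabular TD(0), a standard technique that is swapped in here and is not the algorithm derived in the article. The toy queue model, the function names (td0_state_values, toy_queue_step) and every parameter value (p_win, p_arrival, the holding cost, alpha, gamma) are hypothetical assumptions chosen for the example.

```python
import random

MAX_QUEUE = 3  # hypothetical cap on the packet queue of one mobile terminal


def toy_queue_step(queue, rng, p_win=0.6, p_arrival=0.5):
    """One scheduling slot of a made-up single-terminal model.

    With probability p_win the terminal obtains a channel and sends one
    queued packet; with probability p_arrival a new packet arrives.
    All numbers are illustrative assumptions, not values from the article.
    """
    sent = 1 if queue > 0 and rng.random() < p_win else 0
    arrived = 1 if rng.random() < p_arrival else 0
    next_queue = min(MAX_QUEUE, queue - sent + arrived)
    # Assumed reward: one unit per transmitted packet minus a holding cost.
    reward = float(sent) - 0.1 * queue
    return next_queue, reward


def td0_state_values(num_states, step, episodes=2000, horizon=50,
                     alpha=0.1, gamma=0.9, seed=0):
    """Estimate V(s) online with tabular TD(0) under a fixed policy."""
    rng = random.Random(seed)
    values = [0.0] * num_states
    for _ in range(episodes):
        state = rng.randrange(num_states)
        for _ in range(horizon):
            next_state, reward = step(state, rng)
            # Move V(state) toward the one-step bootstrapped target.
            target = reward + gamma * values[next_state]
            values[state] += alpha * (target - values[state])
            state = next_state
    return values


if __name__ == "__main__":
    for q, v in enumerate(td0_state_values(MAX_QUEUE + 1, toy_queue_step)):
        print(f"queue={q}: estimated value {v:.3f}")
```

Each update touches only the state just visited, so the table can be maintained from locally observed transitions; this is the sense in which such a learner is "online", under the assumptions stated above.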


Series: IEEE transactions on mobile computing
ISSN: 1536-1233
ISSN-E: 1558-0660
ISSN-L: 1536-1233
Volume: 17
Issue: 4
Pages: 961 - 974
DOI: 10.1109/TMC.2017.2742949
Type of Publication: A1 Journal article – refereed
Field of Science: 213 Electronic, automation and communications engineering, electronics
Funding: This research was supported in part by AKA grants 310786 and 289611, TEKES grants 2364/31/2014 and 2368/31/2014, US National Science Foundation grants 1717454, 1731424, 1704092, 1702850, 1646607, 1547201, 1434789, 1456921, 1443917, 1405121, 1457262 and 1461886, and the Program for Zhejiang Leading Team of Science and Technology Innovation under Grant 2013TD20.
Academy of Finland Grant Number: 289611
Detailed Information: 289611 (Academy of Finland Funding decision)
Copyright information: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.