University of Oulu

S. Ali, A. Ferdowsi, W. Saad and N. Rajatheva, "Sleeping Multi-Armed Bandits for Fast Uplink Grant Allocation in Machine Type Communications," 2018 IEEE Globecom Workshops (GC Wkshps), Abu Dhabi, United Arab Emirates, 2018, pp. 1-6. doi: 10.1109/GLOCOMW.2018.8644350

Sleeping multi-armed bandits for fast uplink grant allocation in machine type communications

Author: Ali, Samad¹; Ferdowsi, Aidin²; Saad, Walid²; Rajatheva, N.
Organizations: ¹Center for Wireless Communications (CWC), University of Oulu, Oulu, Finland
²Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA, USA
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 3.7 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2018
Publish Date: 2019-08-13


Scheduling fast uplink grant transmissions for machine type communications (MTCs) is one of the main challenges of future wireless systems. In this paper, a novel fast uplink grant scheduling method based on the theory of multi-armed bandits (MABs) is proposed. First, a single quality-of-service metric is defined as a combination of the value of data packets, the maximum tolerable access delay, and the data rate. Since the base station (BS) cannot have full knowledge of these metrics for all machine type devices (MTDs) in advance, and since the set of active MTDs changes over time, the problem is modeled as a sleeping MAB with stochastic availability and a stochastic reward function. In particular, given that the knowledge of the set of active MTDs at each time step is only probabilistic, a novel probabilistic sleeping MAB algorithm is proposed to maximize the defined metric. Numerical results show that the proposed algorithm achieves logarithmic regret and is hence order-optimal. The results also show that the proposed framework yields a three-fold reduction in latency compared to a random scheduling policy, since it prioritizes the scheduling of MTDs that have stricter latency requirements. Moreover, by properly balancing the exploration-exploitation tradeoff, the proposed algorithm provides system fairness: the most important MTDs are scheduled more often, while the less important MTDs are still selected often enough to keep the estimates of their importance accurate.
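
To make the sleeping-bandit idea in the abstract concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes a simplified setting: each device (arm) is awake independently with a fixed probability, the scheduler observes the awake set exactly in each slot (the paper instead treats this knowledge as probabilistic), rewards are Bernoulli, and a standard UCB-style index is applied only to the awake arms. The function name `sleeping_ucb` and the parameters `means`, `avail_prob`, and `horizon` are hypothetical names chosen for this example.

```python
import math
import random


def sleeping_ucb(means, avail_prob, horizon, seed=0):
    """Simulate a UCB-style scheduler over arms that may be asleep.

    means[i]      -- assumed Bernoulli reward probability of device i
    avail_prob[i] -- assumed probability that device i is awake in a slot
    Returns (pull counts per arm, cumulative regret against the best awake arm).
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n      # how often each device has been scheduled
    sums = [0.0] * n      # total observed reward per device
    regret = 0.0

    for t in range(1, horizon + 1):
        # Draw the set of awake devices for this slot.
        awake = [i for i in range(n) if rng.random() < avail_prob[i]]
        if not awake:
            continue  # no device to schedule in this slot

        def index(i):
            # Unexplored arms get priority; otherwise empirical mean plus bonus.
            if counts[i] == 0:
                return math.inf
            bonus = math.sqrt(2.0 * math.log(t) / counts[i])
            return sums[i] / counts[i] + bonus

        # Schedule the awake device with the highest index.
        chosen = max(awake, key=index)
        reward = 1.0 if rng.random() < means[chosen] else 0.0
        counts[chosen] += 1
        sums[chosen] += reward

        # Regret is measured against the best device that was awake in this slot.
        regret += max(means[i] for i in awake) - means[chosen]

    return counts, regret
```

Calling `sleeping_ucb([0.9, 0.5, 0.1], [0.8, 0.8, 0.8], 5000)` returns the per-device scheduling counts and the cumulative regret. In this simplified setting, the device with the highest mean reward should be scheduled most often, while the others are still sampled occasionally, which mirrors the exploration-exploitation balance described in the abstract.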


ISBN: 978-1-5386-4920-6
ISBN Print: 978-1-5386-4921-3
Pages: 1 - 6
DOI: 10.1109/GLOCOMW.2018.8644350
Host publication: 2018 IEEE Globecom Workshops (GC Wkshps) : Proceedings
Conference: IEEE Globecom Workshops
Type of Publication: A4 Article in conference proceedings
Field of Science: 213 Electronic, automation and communications engineering, electronics
Funding: This research was supported by the Academy of Finland 6Genesis Flagship under Grant 318927 and, in part, by the U.S. National Science Foundation under Grants AST-1506297 and CNS-1460316.
Academy of Finland Grant Number: 318927
Detailed Information: 318927 (Academy of Finland Funding decision)
Copyright information: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.