University of Oulu

M. V. da Silva, S. Montejo-Sánchez, R. D. Souza, H. Alves and T. Abrão, "D2D Assisted Q-Learning Random Access for NOMA-Based MTC Networks," in IEEE Access, vol. 10, pp. 30694-30706, 2022, doi: 10.1109/ACCESS.2022.3160156

D2D assisted Q-learning random access for NOMA-based MTC networks

Author: da Silva, Matheus V.1; Montejo-Sánchez, Samuel2; Souza, Richard Demo3;
Organizations: 1Centre for Wireless Communications, University of Oulu, 90014 Oulu, Finland
2Programa Institucional de Fomento a la I+D+i, Universidad Tecnológica Metropolitana, Santiago 8940577, Chile
3Department of Electrical and Electronics Engineering, Federal University of Santa Catarina, Florianópolis 88040-900, Brazil
4Department of Electrical Engineering, State University of Londrina (UEL), Londrina 86057-970, Brazil
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 1.8 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2022
Publish Date: 2022-04-26


Machine-type communications (MTC) are expected to account for half of all internet connections by 2030. The massive MTC (mMTC) use case lets applications connect a massive number of low-power, low-complexity devices, which creates challenges in resource allocation. Moreover, because mMTC networks are ultra-dense, they perform poorly under rigid random access schemes. This paper therefore proposes a Q-Learning-based random access method for massive machine-type communications, with device clustering and non-orthogonal multiple access (NOMA). The traditional NOMA implementation increases spectral efficiency but, at the same time, demands a larger Q-Table, which slows down convergence, an effect known to be highly detrimental in massive networks. We use pre-clustering through short-range device-to-device technology to mitigate this drawback, allowing devices to operate with a smaller Q-Table. Furthermore, the prior selection of partner devices allows us to implement a full-feedback-based reward mechanism so that clusters avoid time slots that have already been successfully allocated. Additionally, to cope with the negative impact of system overload, we propose an adaptive frame size algorithm that runs in the base station (BS). It adjusts the frame size to the network load, preventing idle slots in an underloaded scenario and providing extra slots when the network is overloaded. The results show that the proposed method offers substantial throughput gains. The impact of clustering, cluster size, and frame size adaptation is also analyzed.
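As a rough illustration of the general idea behind Q-Learning-based random access, the following minimal Python sketch lets each device keep one Q-value per time slot and reinforce slots in which it transmitted alone. It is not the method of the paper: NOMA, device-to-device clustering, the full-feedback reward, and the adaptive frame size are all left out, and the function name `simulate`, the +1/-1 rewards, and the learning rate `alpha` are assumptions chosen for illustration only.

```python
import random


def simulate(num_devices=8, num_slots=8, frames=500, alpha=0.1, seed=0):
    """Stateless Q-learning slot selection (illustrative assumption only).

    Returns the fraction of slots carrying exactly one transmission
    in the last simulated frame.
    """
    rng = random.Random(seed)
    # One Q-value per (device, slot); all start at zero.
    q = [[0.0] * num_slots for _ in range(num_devices)]
    throughput = 0.0
    for _ in range(frames):
        # Each device picks a slot with its highest Q-value, ties broken at random.
        choices = []
        for row in q:
            best = max(row)
            choices.append(rng.choice([s for s, v in enumerate(row) if v == best]))
        # Without NOMA, a slot succeeds only if exactly one device uses it.
        counts = [choices.count(s) for s in range(num_slots)]
        for d, s in enumerate(choices):
            reward = 1.0 if counts[s] == 1 else -1.0  # assumed reward values
            q[d][s] += alpha * (reward - q[d][s])
        throughput = sum(1 for c in counts if c == 1) / num_slots
    return throughput


if __name__ == "__main__":
    print(simulate())
```

Each device needs one Q-Table entry per slot in this sketch, which gives a sense of why a larger action space, such as one that also covers NOMA power levels, would enlarge the table and slow convergence.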


Series: IEEE access
ISSN: 2169-3536
ISSN-E: 2169-3536
ISSN-L: 2169-3536
Volume: 10
Pages: 30694 - 30706
DOI: 10.1109/ACCESS.2022.3160156
Type of Publication: A1 Journal article – refereed
Field of Science: 213 Electronic, automation and communications engineering, electronics
Funding: This work was supported in part by the Brazilian National Council for Scientific and Technological Development (CNPq), Brazil; in part by the Print Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Universidade Federal de Santa Catarina (CAPES-UFSC) "Automation 4.0," Brazil; in part by Rede Nacional de Ensino e Pesquisa/Ministério da Ciência, Tecnologia, Inovações e Comunicações (RNP/MCTIC), Brazil, under Grant 01245.010604/2020-14 (6G Mobile Communications Systems); in part by Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) Iniciación, Chile, under Grant 11200659; and in part by the Academy of Finland 6Genesis Flagship, Finland, under Grant 318927.
Academy of Finland Grant Number: 318927
Detailed Information: 318927 (Academy of Finland Funding decision)
Copyright information: © The Author(s) 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see