Online caching policy with user preferences and time-dependent requests : a reinforcement learning approach
Hatami, Mohammad; Leinonen, Markus; Codreanu, Marian (2020-03-30)
M. Hatami, M. Leinonen and M. Codreanu, "Online Caching Policy with User Preferences and Time-Dependent Requests: A Reinforcement Learning Approach," 2019 53rd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 2019, pp. 1384-1388.
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
https://rightsstatements.org/vocab/InC/1.0/
https://urn.fi/URN:NBN:fi-fe202003319802
Tiivistelmä
Abstract
Content caching is a promising approach to reduce data traffic in the back-haul links. We consider a system where multiple users request items from a cache-enabled base station that is connected to a cloud. The users request items according to the user preferences in a time-dependent fashion, i.e., a user is likely to request the next chunk (item) of the file requested at a previous time slot. Whenever the requested item is not in the cache, the base station downloads it from the cloud and forwards it to the user. In the meanwhile, the base station decides whether to replace one item in the cache by the fetched item, or to discard it. We model the problem as a Markov decision process (MDP) and propose a novel state space that takes advantage of the dynamics of the users’ requests. We use reinforcement learning and propose a Q-learning algorithm to find an optimal cache replacement policy that maximizes the cache hit ratio without knowing the popularity profile distribution, probability distribution of items, and user preference model. Simulation results show that the proposed algorithm improves the cache hit ratio compared to other baseline policies.
Kokoelmat
- Avoin saatavuus [31657]