Reinforcement Q-Learning using OpenAI Gym
University of Oulu, Faculty of Information Technology and Electrical Engineering, Information Processing Science
Online Access: PDF Full Text (PDF, 0.5 MB)
Persistent link: http://urn.fi/URN:NBN:fi:oulu-201903151329
Published: Oulu : J. Laivamaa
Publish Date: 2019-03-20
Thesis type: Bachelor's thesis
Q-Learning is an off-policy reinforcement learning algorithm that can be used to find optimal policies in Markovian domains. This thesis examines how Q-Learning can be applied to a test environment in the OpenAI Gym toolkit. Testing the algorithm on a concrete problem case shows how well it performs and demonstrates its practical utility. The thesis begins with a general overview of reinforcement learning and the Markov decision process, both of which are crucial to understanding the theoretical groundwork on which Q-Learning is based. We then discuss the Q-Learning technique itself and dissect the algorithm in detail. We also go over the OpenAI Gym toolkit and how it can be used to test the algorithm's functionality. Finally, we introduce the problem case, apply the algorithm to solve it, and analyse the results.
The motivation for this thesis is the rise of reinforcement learning and its growing relevance as technological progress enables increasingly complex and sophisticated applications of machine learning and artificial intelligence.
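As a rough illustration of the off-policy update described in the abstract, the following is a minimal sketch of tabular Q-Learning. It does not reproduce the thesis's problem case or use OpenAI Gym: the five-state chain environment, the `step`, `train` and `greedy_policy` helpers, and the hyperparameter values (`alpha`, `gamma`, `epsilon`, episode count) are all assumptions chosen so that the example runs with the Python standard library alone.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the terminal state
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily
            # (ties between equal Q-values are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal state. In a Gym environment the hand-written `step` function would be replaced by the environment's own reset and step calls, whose exact signatures depend on the Gym version in use.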
© Juuso Laivamaa, 2019. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.