Reinforcement Q-Learning using OpenAI Gym
Laivamaa, Juuso (2019-03-14)
Laivamaa, Juuso
J. Laivamaa
14.03.2019
© 2019 Juuso Laivamaa. Tämä Kohde on tekijänoikeuden ja/tai lähioikeuksien suojaama. Voit käyttää Kohdetta käyttöösi sovellettavan tekijänoikeutta ja lähioikeuksia koskevan lainsäädännön sallimilla tavoilla. Muunlaista käyttöä varten tarvitset oikeudenhaltijoiden luvan.
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:oulu-201903151329
https://urn.fi/URN:NBN:fi:oulu-201903151329
Tiivistelmä
Q-Learning is an off-policy algorithm for reinforcement learning, that can be used to find optimal policies in Markovian domains. This thesis is about how Q-Learning can be applied to a test environment in the OpenAI Gym toolkit. The utility of testing the algorithm on a problem case is to find out how well it performs as well proving the practical utility of the algorithm. This thesis starts off with a general overview of reinforcement learning as well as the Markov decision process, both of which are crucial in understanding the theoretical groundwork that Q-Learning is based on. After that we move on to discussing the Q-Learning technique itself and dissect the algorithm in detail. We also go over OpenAI Gym toolkit and how it can be used to test the algorithm’s functionality. Finally, we introduce the problem case and apply the algorithm to solve it and analyse the results.
The reasoning for this thesis is the rise of reinforcement learning and its increasing relevance in the future as technological progress allows for more and more complex and sophisticated applications of machine learning and artificial intelligence.
The reasoning for this thesis is the rise of reinforcement learning and its increasing relevance in the future as technological progress allows for more and more complex and sophisticated applications of machine learning and artificial intelligence.
Kokoelmat
- Avoin saatavuus [31657]