Reinforcement Learning In Real-Time Strategy Games

School of Science | Master's thesis
Date
2011
Major/Subject
Information Technology (Informaatiotekniikka)
Mcode
T-61
Degree programme
Language
en
Pages
132
Series
Abstract
We consider the problem of effective, automated decision-making in modern real-time strategy (RTS) games using reinforcement learning techniques. RTS games are environments with large, high-dimensional, continuous state and action spaces and temporally extended actions. In such environments, value functions must be represented with function approximators, and the resulting approximation errors cause stability issues in temporal-difference methods. This thesis proposes Exlos, a stable, model-based Monte-Carlo method that borrows ideas from several existing algorithms, including prioritized sweeping and upper confidence trees (UCT). Unlike existing model-based algorithms, Exlos assumes that models are imperfect and reduces their influence on the decision-making process accordingly. Experimental results in a testing environment show that Exlos outperforms traditional reinforcement learning methods such as Q-learning and Sarsa in large discrete state spaces. Furthermore, Exlos is shown to be effective and efficient when operating over value functions represented by approximators. Its effectiveness is further improved by including a novel online search procedure in the control policy. As an additional result, we present an improved version of UCT, denoted UCTO, which is experimentally shown to outperform UCT.
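For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of the standard textbook tabular update rules for Q-learning and Sarsa. It is not taken from the thesis and says nothing about how Exlos or UCTO work; the function names, the dictionary-based table `Q`, and the default values of `alpha` (step size) and `gamma` (discount factor) are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical two-action example: the same transition updated by both rules.
actions = ["left", "right"]

Q_ql = defaultdict(float)
Q_ql[("s1", "left")] = 1.0
Q_ql[("s1", "right")] = 2.0
q_learning_update(Q_ql, "s0", "left", 1.0, "s1", actions, alpha=0.5, gamma=0.9)

Q_sarsa = defaultdict(float)
Q_sarsa[("s1", "left")] = 1.0
Q_sarsa[("s1", "right")] = 2.0
sarsa_update(Q_sarsa, "s0", "left", 1.0, "s1", "left", alpha=0.5, gamma=0.9)
```

The only difference between the two rules is the bootstrap target: Q-learning uses the maximum estimated value in the next state, while Sarsa uses the value of the action the behaviour policy actually selected. Both are temporal-difference methods of the kind the abstract contrasts with Exlos.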
Description
Supervisor
Oja, Erkki; Monteiro, José Carlos
Thesis advisor
Raiko, Tapani
Keywords
reinforcement learning, real-time strategy, games, artificial intelligence, UCT, planning, continuous reinforcement learning
Citation