Reinforcement Learning In Real-Time Strategy Games
School of Science | Master's thesis
Abstract
We consider the problem of effective, automated decision-making in modern real-time strategy (RTS) games using reinforcement learning techniques. RTS games are environments with large, high-dimensional, continuous state and action spaces and temporally extended actions. In such environments, value functions must be represented with function approximators, and the resulting approximation errors cause stability issues for temporal-difference methods. This thesis proposes Exlos, a stable, model-based Monte Carlo method that borrows ideas from several existing algorithms, including prioritized sweeping and upper confidence trees (UCT). Unlike existing model-based algorithms, Exlos assumes that its models are imperfect and reduces their influence on the decision-making process accordingly. Experimental results in a testing environment show that Exlos outperforms traditional reinforcement learning methods such as Q-learning and Sarsa in large discrete state spaces. Furthermore, Exlos is shown to be effective and efficient when operating over value functions represented by approximators. Its effectiveness improves further when a novel online search procedure is included in the control policy. As an additional result, we present an improved version of UCT, denoted UCTO, which is experimentally shown to outperform UCT.
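For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of the standard tabular Q-learning and Sarsa updates and of the UCB1 score that UCT uses for action selection. It does not show Exlos or UCTO, whose details are not given here. The function names, the dictionary-based Q-table, and the default values of `alpha`, `gamma` and `c` are illustrative assumptions, not taken from the thesis.

```python
import math

def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best action in s_next."""
    old = q.get((s, a), 0.0)  # unseen pairs default to 0 (assumption)
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    old = q.get((s, a), 0.0)
    target = r + gamma * q.get((s_next, a_next), 0.0)
    q[(s, a)] = old + alpha * (target - old)
    return q[(s, a)]

def ucb1_score(mean_value, parent_visits, child_visits, c=math.sqrt(2)):
    """UCB1 score: mean value plus an exploration bonus.

    The exploration constant c is a tunable assumption; unvisited
    children get an infinite score so they are tried first.
    """
    if child_visits == 0:
        return math.inf
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)
```

The only difference between the two updates is the bootstrap target: Q-learning uses the maximum over next actions, while Sarsa uses the value of the action the policy actually selects.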
Supervisor: Oja, Erkki; Monteiro, José Carlos
Thesis advisor: Raiko, Tapani
Keywords: reinforcement learning, real-time strategy, games, artificial intelligence, UCT, planning, continuous reinforcement learning