Reinforcement Learning with N-tuples on the Game Connect-4

Markus Thill, Patrick Koch, and Wolfgang Konen

Department of Computer Science, Cologne University of Applied Sciences, 51643, Gummersbach, Germany
patrick.koch@fh-koeln.de, wolfgang.konen@fh-koeln.de

Abstract. Learning complex game functions is still a difficult task. We apply temporal difference learning (TDL), a well-known variant of the reinforcement learning approach, in combination with n-tuple networks to the game Connect-4. Our agent is trained just by self-play. It is able, for the first time, to consistently beat the optimal-playing Minimax agent (in game situations where a win is possible). The n-tuple network induces a mighty feature space: It is not necessary to design certain features, but the agent learns to select the right ones. We believe that the n-tuple network is an important ingredient for the overall success and identify several aspects that are relevant for achieving high-quality results. The architecture is sufficiently general to be applied to similar reinforcement learning tasks as well.

Keywords: Machine learning, reinforcement learning, TDL, self-play, n-tuple systems, feature generation, board games

LNCS 7491, p. 184 ff.
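To make the combination of n-tuple networks and temporal difference learning mentioned in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The three-state cell encoding, the randomly sampled tuples, the tanh output, and the single-step update rule are all assumptions chosen for illustration, and the names `NTupleNetwork`, `random_tuples`, and `td_update` are hypothetical.

```python
import math
import random

ROWS, COLS = 6, 7   # standard Connect-4 board
STATES = 3          # assumed cell encoding: 0 empty, 1 first player, 2 second player


class NTupleNetwork:
    """Value function built from lookup tables (LUTs), one per n-tuple of cells."""

    def __init__(self, tuples):
        self.tuples = tuples
        # One weight per possible configuration of the cells in each tuple.
        self.luts = [[0.0] * (STATES ** len(t)) for t in tuples]

    def _indices(self, board):
        # Read each tuple's cells as a base-STATES number to index its LUT.
        indices = []
        for t in self.tuples:
            k = 0
            for cell in t:
                k = k * STATES + board[cell]
            indices.append(k)
        return indices

    def value(self, board):
        # Sum the active weights and squash the result into (-1, 1).
        total = sum(lut[k] for lut, k in zip(self.luts, self._indices(board)))
        return math.tanh(total)

    def td_update(self, board, target, alpha=0.01):
        # Single-step TD-style update (assumed): move V(board) toward `target`,
        # which would be the reward or the value of the successor position.
        v = self.value(board)
        delta = target - v
        grad = 1.0 - v * v  # derivative of tanh at the current output
        for lut, k in zip(self.luts, self._indices(board)):
            lut[k] += alpha * delta * grad
        return delta


def random_tuples(count, length, rng):
    # Assumed tuple generation: sample distinct cells uniformly at random.
    return [rng.sample(range(ROWS * COLS), length) for _ in range(count)]


rng = random.Random(0)
net = NTupleNetwork(random_tuples(count=10, length=4, rng=rng))
board = [0] * (ROWS * COLS)
board[3] = 1  # first player has dropped a piece into the middle column
for _ in range(200):
    net.td_update(board, target=1.0, alpha=0.05)
print(net.value(board))
```

Each tuple contributes one weight per board position, so the network selects among a very large set of implicit features without any of them being designed by hand. In a self-play setting, `target` would come from the game outcome or from the network's own estimate of the next position.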