Research on Learning Algorithms for Robot Soccer Behavior Control

Published: 2018-11-21 14:27

[Abstract]: Robot soccer is a major challenge for artificial intelligence. It brings together many active research areas and has become a standard platform for research on artificial intelligence and multi-agent systems. Robot soccer intelligence can be studied from several directions, including multi-agent cooperation, behavior strategy decision-making for a single robot, and optimization of behavioral actions. This thesis focuses on autonomous learning of soccer robot behaviors: it introduces reinforcement learning algorithms and simulates reinforcement learning for several soccer robot behaviors, verifying that reinforcement learning is feasible for optimizing those behaviors. The thesis first gives an overview of the robot soccer system and describes several directions of research on robot soccer intelligence. It then reviews how soccer robot behaviors are traditionally implemented, explains the shortcomings of that approach, and proposes using reinforcement learning to address the problems faced in soccer robot behavior control. Next, reinforcement learning is described in detail: starting from the Markov decision process, the thesis derives the Q-learning algorithm for discrete state spaces, introduces the use of continuous approximation methods in reinforcement learning for continuous state spaces, and describes the implementation of a reinforcement learning algorithm based on a multilayer feedforward neural network. For the ball-interception behavior of a soccer robot, a reinforcement learning algorithm based on the CMAC network is then presented; the CMAC network has a simple structure and learns quickly. A simulation of ball interception verifies the effectiveness of this algorithm. To address the shortcomings of the CMAC network, the network is improved so that its output is a continuous approximation, and ball interception is simulated again using reinforcement learning based on the improved continuous CMAC network. For avoiding dynamic obstacles, a reinforcement learning algorithm using parallel continuous CMAC networks is proposed, which avoids the curse of dimensionality caused by a high-dimensional input state space. Finally, to realize PID control that drives the soccer robot toward a target point along a specified direction, the Actor-Critic learning algorithm is applied to PID control, and adaptive PID control of this approach behavior is realized in simulation.
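The combination of Q-learning with a CMAC network named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a one-dimensional state, two discrete actions, and a plain (discrete-output) CMAC built from a few offset tilings, and every name and parameter value below (`CMAC`, `q_learning_step`, `train`, the tile counts, learning rate, discount factor, and the toy reward) is hypothetical and chosen only for illustration.

```python
import random


class CMAC:
    """Minimal 1-D CMAC: several offset tilings over [lo, hi], one weight per tile."""

    def __init__(self, lo, hi, n_tiles=8, n_tilings=4, lr=0.2):
        self.lo, self.n_tiles, self.n_tilings = lo, n_tiles, n_tilings
        self.width = (hi - lo) / n_tiles
        self.lr = lr / n_tilings  # split the step evenly over the active tiles
        self.w = [[0.0] * (n_tiles + 1) for _ in range(n_tilings)]

    def _active(self, x):
        # Exactly one tile is active per tiling; tilings are shifted against each other.
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            idx = int((x - self.lo + offset) / self.width)
            yield t, min(max(idx, 0), self.n_tiles)

    def predict(self, x):
        return sum(self.w[t][i] for t, i in self._active(x))

    def update(self, x, target):
        err = target - self.predict(x)
        for t, i in self._active(x):
            self.w[t][i] += self.lr * err


def q_learning_step(q, s, a, r, s_next, done, gamma=0.9):
    """One Q-learning update; q holds one CMAC per action (an assumed layout)."""
    best_next = 0.0 if done else max(net.predict(s_next) for net in q)
    q[a].update(s, r + gamma * best_next)


def train(episodes=200, seed=0):
    """Toy task (assumed): move along [0, 1] in steps of 0.1; reward 1 on reaching 0.9."""
    rng = random.Random(seed)
    q = [CMAC(0.0, 1.0), CMAC(0.0, 1.0)]  # action 0: step left, action 1: step right
    for _ in range(episodes):
        s = rng.random() * 0.8
        for _ in range(50):
            if rng.random() < 0.2:  # epsilon-greedy exploration
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda k: q[k].predict(s))
            s_next = min(max(s + (0.1 if a == 1 else -0.1), 0.0), 1.0)
            done = s_next >= 0.9
            q_learning_step(q, s, a, 1.0 if done else 0.0, s_next, done)
            s = s_next
            if done:
                break
    return q
```

Calling `train()` returns the two per-action approximators, and comparing `q[0].predict(s)` with `q[1].predict(s)` gives the greedy action at state `s`. Because neighbouring states share tiles, one update also shifts the estimates of nearby states, which is the kind of generalization that makes this approach usable on continuous state spaces.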
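[Degree-granting institution]: North China University of Technology (北方工业大学)
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP242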

[Similar literature]