Research on Defensive Actions and Positioning of Simulated Soccer Robots

Published: 2018-08-14 14:31

[Abstract]: The RoboCup 2D simulation platform is a dynamic, adversarial multi-agent system. On this platform, the action each player agent selects in every cycle directly determines the team's attacking and defensive strength, and the ability of players to cooperate and reach their target positions quickly and accurately, whether attacking or defending, is a prerequisite for any effective strategy. Building on a formation design based on triangulation, this thesis focuses on agent action selection in defensive tasks and on player positioning during formation transitions. The work comprises three parts. First, Monte Carlo tree search is introduced into the 2D simulation: the states of the player agents on the field are defined as game-tree nodes, the action choices of both teams' players are treated as state transitions between nodes, and a Monte Carlo tree model is built for the team's defensive task. The field is partitioned into regions using polar coordinates, the team is trained by combining Q-learning with the UCT (Upper Confidence bounds applied to Trees) algorithm from Monte Carlo tree search, and the action evaluation values obtained from training are used to optimize the match code, yielding a reasonably good action selection strategy. Second, for the problem of assigning agents to coordinated movements, a scalable role assignment method that minimizes time is proposed. Different implementations of the method are analyzed and compared in some depth, and the method is applied both to forming up during the team's transitions between attack and defense and to local cooperative positioning while attacking and defending. Modeling group positioning in this way makes player movement more efficient and responsive and reduces unnecessary errors. Finally, the state at the moment of an attack-defense transition is defined as the root node of the Monte Carlo tree and combined with the time-minimizing role assignment method in a joint group-defense experiment; the experimental data are analyzed to tune the code parameters, and match data demonstrate the effectiveness of the approach.
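
The abstract does not spell out how the time-minimizing role assignment is computed, so the following is only an illustrative sketch of the general idea and not the thesis's implementation. It assumes every player moves in a straight line at the same constant speed, and it picks the player-to-target assignment whose longest travel time is smallest, breaking ties by the next-longest time. The function name `min_makespan_assignment`, the `speed` parameter, and the tie-breaking rule are all hypothetical choices made for this example.

```python
from itertools import permutations
from math import hypot


def min_makespan_assignment(players, targets, speed=1.0):
    """Assign one target to each player so the slowest arrival is as early as possible.

    players, targets: lists of (x, y) tuples of equal length.
    speed: assumed common constant speed (hypothetical simplification).
    Returns (assignment, makespan), where assignment[i] is the index of the
    target given to player i and makespan is the longest travel time.
    Brute force over all permutations, so only suitable for a few players.
    """
    if len(players) != len(targets):
        raise ValueError("need exactly one target per player")

    best_perm, best_times = None, None
    for perm in permutations(range(len(targets))):
        # Travel time of each player to its assigned target, longest first.
        times = sorted(
            (
                hypot(players[i][0] - targets[j][0],
                      players[i][1] - targets[j][1]) / speed
                for i, j in enumerate(perm)
            ),
            reverse=True,
        )
        # Lexicographic comparison: minimize the longest time first,
        # then the second longest, and so on.
        if best_times is None or times < best_times:
            best_perm, best_times = perm, times

    makespan = best_times[0] if best_times else 0.0
    return list(best_perm), makespan


if __name__ == "__main__":
    players = [(0.0, 0.0), (1.0, 0.0)]
    targets = [(1.0, 0.0), (3.0, 0.0)]
    print(min_makespan_assignment(players, targets))
```

In this small case both possible assignments cover the same total distance, but sending the nearer player to the far target keeps the longest run at 2 units instead of 3, which is the kind of difference a time-minimizing criterion is meant to capture. A practical version would need a polynomial-time method rather than enumeration once the number of players grows.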
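[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP242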

[References]