
Dexterous Hand Control Algorithms Based on Reinforcement Learning and Their Applications

Published: 2018-08-19 20:44
[Abstract]: Dexterous hand manipulation is one of the most challenging tasks in robot control, and many of its problems remain unsolved. This thesis addresses the grasping task in dexterous robotic manipulation. Using a physical Baxter robot as the platform, it implements a complete grasping control system that efficiently and adaptively controls the manipulator toward two goals, reaching a specified target position and following a specified joint-position trajectory, and validates the system experimentally. The main contributions are as follows.

First, for target tracking in grasping scenarios, a tracking algorithm with structured constraints is designed. The tracker serves as a run-time auxiliary module that supplies the control system with the target's location and shape. The improved visual tracking algorithm performs better under fast motion, illumination change, and deformation in robot manipulation scenes. It builds on the TLD (Tracking-Learning-Detection) framework and uses the proposed cross-skeleton model and soft segmentation model to exploit the structural and appearance information of the tracked target. The tracker was tested on an authoritative visual tracking benchmark and compared with strong recent trackers, achieving good results.

Second, because training a neural network policy directly requires a large number of samples, the policy is instead trained by constructing local controllers and combining them through supervised learning. The manipulation task is divided into several simple states; within each, a local controller is trained with either a traditional model-based control algorithm or a sample-efficient reinforcement learning algorithm. Once the local controllers can each complete the task in their own simple state, mature supervised learning techniques are used to merge them into a single global policy. The global policy lets the robot complete the whole task with one unified controller rather than a different controller in each state.

Third, a real grasping control system is implemented on the physical Baxter robot. Control algorithms are often relatively easy to implement in simulation, but a wide gap separates simulation from the real world: simulation can ignore real-world disturbances such as measurement error during data acquisition, systematic error, actuator precision, model mismatch in the control system, and the cost of sampling. A real robot also needs a complete supporting system on which the algorithm can run. Mishandling any one of these issues is enough to make the whole algorithm fail on the real system.

Finally, two sets of control-task experiments are carried out on the real grasping control system. The local controllers and the global policy are each tested, and their execution errors after different numbers of iterations are measured to confirm that the whole system is usable. The results also demonstrate the system's reliability: satisfactory execution is obtained after only a small number of iterations.
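The idea of training local controllers first and then merging them into one global policy by supervised learning can be illustrated with a toy sketch. This is not the thesis's implementation: the 1-D dynamics `x_next = x + u`, the proportional local controllers, the gain, the goals, and all function names below are hypothetical stand-ins chosen only to keep the example small. Each local controller handles one simple condition (here, one goal), and a linear global policy `u = a*x + c*g` is fitted by least squares to the state-action pairs the local controllers produce.

```python
# Toy sketch only: dynamics, gains, goals and names are assumptions,
# not taken from the thesis.

def make_local_controller(goal, gain=0.5):
    """Hypothetical local controller for one simple condition (one goal)."""
    def controller(x):
        return gain * (goal - x)
    return controller


def rollout(controller, x0, steps=20):
    """Simulate x_next = x + u and record (state, action) pairs."""
    x, samples = x0, []
    for _ in range(steps):
        u = controller(x)
        samples.append((x, u))
        x = x + u
    return samples


def fit_global_policy(dataset):
    """Least-squares fit of u = a*x + c*g from (x, g, u) triples.

    Solves the 2x2 normal equations in closed form.
    """
    sxx = sum(x * x for x, g, u in dataset)
    sxg = sum(x * g for x, g, u in dataset)
    sgg = sum(g * g for x, g, u in dataset)
    sxu = sum(x * u for x, g, u in dataset)
    sgu = sum(g * u for x, g, u in dataset)
    det = sxx * sgg - sxg * sxg
    a = (sxu * sgg - sxg * sgu) / det
    c = (sxx * sgu - sxg * sxu) / det
    return a, c


def distil(conditions):
    """Collect data from one local controller per (goal, start) condition,
    then fit a single global policy by supervised regression."""
    dataset = []
    for goal, x0 in conditions:
        local = make_local_controller(goal)
        dataset += [(x, goal, u) for x, u in rollout(local, x0)]
    return fit_global_policy(dataset)


def run_global_policy(a, c, goal, x0, steps=30):
    """Apply the fitted global policy and return the final state."""
    x = x0
    for _ in range(steps):
        x = x + (a * x + c * goal)
    return x


if __name__ == "__main__":
    a, c = distil([(1.0, 0.0), (-2.0, 3.0)])
    # Goal 4.0 was never seen by any local controller.
    print(a, c, run_global_policy(a, c, goal=4.0, x0=-1.0))
```

In this sketch the global policy takes the goal as an input, so a single fitted controller can be applied to a goal that none of the local controllers was built for. A real system would replace the hand-set proportional controllers with trained ones and the linear fit with a neural network.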
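[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP242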





Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2192781.html

