Robot Assembly Skill Learning for Interference-Fit Micro Components and Its Application (面向过盈配合微器件的机器人装配技能学习及其应用)
马燕芹
2020-05-21
Pages: 156
Degree type: Doctoral
Chinese abstract

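With the rapid development of science and technology, micro-electro-mechanical systems (MEMS), as a revolutionary new technology, are increasingly widely applied in electronics, medicine, physics, aerospace, and other fields. Precision assembly, a key technology for MEMS assembly, has become a research hotspot in recent years. It mainly covers perception, measurement, and control, and its development is of great significance for improving MEMS product quality and shortening production cycles. This dissertation studies assembly technology for interference-fit micro components. On that basis, to make assembly more intelligent, it also studies assembly skill learning. First, for perception and measurement, a high-precision pose measurement method is studied to provide accurate and reliable measurement information for assembly control. Second, for assembly control, grasping, alignment, and insertion control methods are studied, raising the degree of automation of robotic precision assembly. Finally, for assembly skill learning, methods based on reinforcement learning and learning from demonstration are studied, giving the robot the ability to learn assembly skills. The main work and contributions of this dissertation are as follows: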

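(1) For the orientation measurement of micro components with a circular flange, a monocular vision orientation measurement method based on shadow distribution is proposed. First, a shadow distribution model of the circular flange is derived from the flange's shadow in the image. On this basis, a mathematical model relating the shadow thickness difference to the orientation angle is established; its parameters can be learned from a small amount of offline data. Finally, the orientation of the circular component in Cartesian space is solved from the mathematical model and the rotation transformations between coordinate frames. The method uses only a monocular camera equipped with a ring light source. Compared with traditional orientation measurement methods, it effectively improves measurement accuracy without increasing hardware cost.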

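(2) Based on the industrial-robot precision assembly system that was built, control methods for automatic grasping, efficient pose alignment, and insertion are designed. First, an automatic grasping control method guided by monocular vision is proposed, consisting of three phases: aligning, approaching, and grasping. In the aligning phase, an image-based visual servo control law moves the robot end effector so that the component reaches the desired position in the camera frame. In the approaching phase, the end effector moves by a fixed position offset so that the gripper is aligned with the component to be grasped. In the grasping phase, the gripper picks up the component by adsorption (suction). Second, a pose alignment method for two components guided by binocular vision is proposed. The 3D orientation of the component to be aligned is first estimated and the desired alignment pose is computed; image-based servo control then aligns the two components efficiently. In addition, because adjusting the end effector's orientation shifts the position of the held component, the position offset is computed from the principle of differential motion and compensated. Compared with the traditional "orientation first, then position" alignment method, this improves alignment efficiency. Finally, a model-based insertion control method is proposed. It builds a radial adjustment model from Hooke's law for elastic bodies and designs the assembly control strategy on that model. Compared with traditional model-free insertion control methods, it improves assembly efficiency and compliance.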
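The image-based visual servo law used in the aligning phase of item (2) is not spelled out in this abstract. As a minimal sketch, assuming a single point feature in normalized image coordinates, a known constant depth, and camera translation restricted to the image plane, the classic law v = -λ L⁺ e reduces to the form below. The names `ibvs_velocity` and `simulate` and all numeric values are illustrative assumptions, not taken from the dissertation.

```python
def ibvs_velocity(s, s_star, depth, gain):
    """Planar camera velocity (vx, vy) for one normalized image point."""
    ex, ey = s[0] - s_star[0], s[1] - s_star[1]
    # For translation in x/y only, the interaction matrix is -(1/Z) * I,
    # so v = -gain * L^{-1} * e = gain * Z * e.
    return gain * depth * ex, gain * depth * ey


def simulate(s, s_star, depth=0.2, gain=2.0, dt=0.1, steps=50):
    """Integrate the closed loop with explicit Euler steps (assumed values)."""
    x, y = s
    for _ in range(steps):
        vx, vy = ibvs_velocity((x, y), s_star, depth, gain)
        # Image motion induced by camera translation: x_dot = -vx / Z.
        x += dt * (-vx / depth)
        y += dt * (-vy / depth)
    return x, y


# Hypothetical start and goal feature positions.
final = simulate((0.10, -0.05), (0.0, 0.0))
```

With these assumed settings the feature error shrinks by a factor of (1 - gain * dt) per step, which is the exponential decay that the law is designed to produce.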

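(3) Because the dynamic contact in interference-fit assembly of micro components is complex and an accurate assembly model is hard to build, an assembly skill learning method based on deep deterministic policy gradient (DDPG) is proposed. The method divides skill learning into a pre-training phase and a self-learning phase. First, in the pre-training phase, the policy (actor) network and the evaluation (critic) network of DDPG are trained so that the agent reaches the demonstrator's level. A data augmentation method is also designed to expand the pre-training dataset; it is based on a state transition model and demonstrated assembly data, and effectively reduces the number of human demonstrations. Second, in the self-learning phase, the agent explores under the guidance of the designed reward function and thereby learns the optimal assembly policy. Specifically, a hybrid exploration strategy that combines action-space exploration and parameter-space exploration is designed, improving the efficiency and lowering the cost of self-learning. In addition, because insertion must balance safety and efficiency, a hierarchical reward function based on fuzzy control is designed so that the agent learns a safe and efficient assembly policy.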
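The abstract does not give the structure of the fuzzy hierarchical reward in item (3). The sketch below is one hypothetical way to combine a hard safety tier with a fuzzy efficiency tier, using zero-order Sugeno rules over contact force and insertion depth per step. The function name `fuzzy_reward`, the membership shapes, the thresholds, and the rule outputs are all assumptions made for illustration.

```python
def fuzzy_reward(force, depth_step, force_limit=1.0, max_step=10.0):
    """Two-tier reward: a hard safety tier, then a fuzzy efficiency tier.

    All thresholds and rule outputs are hypothetical illustration values.
    """
    # Tier 1 (safety): exceeding the force limit overrides everything else.
    if abs(force) >= force_limit:
        return -1.0

    # Normalize both inputs to [0, 1].
    f = abs(force) / force_limit
    d = min(max(depth_step / max_step, 0.0), 1.0)

    # Complementary linear memberships: "small" and "large".
    force_small, force_large = 1.0 - f, f
    depth_small, depth_large = 1.0 - d, d

    # Tier 2: rule firing strength (product) paired with a crisp output.
    rules = [
        (force_small * depth_large, 1.0),   # gentle and fast: best
        (force_small * depth_small, 0.2),   # gentle but slow
        (force_large * depth_large, -0.2),  # fast but forceful
        (force_large * depth_small, -0.6),  # forceful and slow
    ]
    total = sum(w for w, _ in rules)  # equals 1 for complementary memberships
    return sum(w * r for w, r in rules) / total
```

For example, under these assumed rules a step with low force and large depth scores higher than a step with the same depth but a force close to the limit.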

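(4) The method above depends on multiple demonstrations and an assembly transition model, and it requires the actor and critic networks of the DDPG framework to be trained simultaneously. To address these issues, an assembly skill learning method that combines a knowledge transfer model with the normalized advantage function (NAF) is proposed. It decomposes the assembly policy into an initial policy and a residual policy. In the initial policy learning phase, a knowledge transfer model based on Gaussian mixture model / Gaussian mixture regression (GMM/GMR) is established; its parameters can be learned from only a few demonstrated assemblies, which greatly lowers the cost of learning the initial policy. In the residual policy learning phase, a reinforcement learning framework based on the normalized advantage function is established. Starting from the initial policy, the agent keeps optimizing the assembly policy through exploration, that is, it learns the residual policy, and finally learns the optimal assembly policy. An adaptive action exploration strategy and a prioritized experience replay strategy are designed, which improve learning efficiency and make experience replay more effective by raising the replay probability of better experiences.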
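To illustrate how Gaussian mixture regression turns a fitted mixture into an initial policy, the sketch below computes the conditional mean E[y | x] for a mixture over a scalar input x (for example a state feature) and a scalar output y (for example an action). The component values are invented, the scalar setting is a simplification, and `gmr_predict` is a hypothetical name; the dissertation's actual state and action definitions are not given in this abstract.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_predict(x, components):
    """Conditional mean E[y | x] under a Gaussian mixture over (x, y).

    Each component is a dict with keys: weight, mean_x, mean_y, var_x, cov_xy.
    Assumes x lies where at least one component has non-negligible density.
    """
    # Responsibility of each component for the query input x.
    resp = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"])
            for c in components]
    total = sum(resp)
    y = 0.0
    for h, c in zip(resp, components):
        # Per-component linear regression implied by the joint Gaussian.
        local = c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"])
        y += (h / total) * local
    return y


# Hypothetical mixture, as if fitted from a few demonstrations.
components = [
    {"weight": 0.5, "mean_x": 0.0, "mean_y": 1.0, "var_x": 1.0, "cov_xy": 0.5},
    {"weight": 0.5, "mean_x": 4.0, "mean_y": 3.0, "var_x": 1.0, "cov_xy": 0.5},
]
initial_action = gmr_predict(2.0, components)
```

A residual policy learned by reinforcement learning would then add a correction on top of `initial_action`.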

Finally, the research results of this dissertation are summarized, and future research work is discussed.

English abstract

With the rapid development of science and technology, micro-electro-mechanical systems (MEMS), as a revolutionary new technology, have been increasingly widely used in electronics, medicine, physics, aerospace, and other fields. As one of the key technologies of MEMS assembly, precision assembly has become a research hotspot in recent years. It mainly involves perception, measurement, and control, and its development is of great significance for improving the quality of MEMS products and shortening the production cycle. In this dissertation, classic assembly technology for micro components with interference fit is studied. Furthermore, assembly skill learning is studied to improve the intelligence of precision assembly. Firstly, a high-precision pose measurement method is presented, which provides accurate and reliable measurement information for assembly control. Secondly, grasping, alignment, and insertion control methods are studied to improve the degree of automation of precision assembly. Finally, assembly skill learning methods based on reinforcement learning and learning from demonstration are studied to give the robot the ability to learn assembly skills. The main work and contributions of this dissertation are as follows:

(1) A monocular vision method for measuring the orientation of a circular flange object is proposed based on the flange's shadow distribution. Firstly, the shadow distribution model of the circular flange is derived from the flange's shadow in the image. Secondly, a mathematical model relating the shadow thickness difference to the orientation angle is established, and its parameters can be learned from a small amount of offline data. Lastly, the object's orientation in Cartesian space is calculated from the mathematical model and the rotation transformations between coordinate frames. This method uses only a monocular camera equipped with a ring light source. Compared with traditional orientation measurement methods, it effectively improves measurement accuracy without increasing hardware cost.
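The abstract does not state the functional form of the model that relates shadow thickness difference to orientation angle. Purely as an illustration of learning model parameters from a few offline samples, the sketch below assumes a linear relation and fits it by ordinary least squares. The linear form, the function names, and the calibration values are all assumptions; the calibration pairs are synthetic and are not measurements from the dissertation.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = k * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b


def predict_angle(k, b, thickness_diff):
    """Orientation angle predicted from a shadow thickness difference."""
    return k * thickness_diff + b


# Synthetic offline calibration pairs (invented for illustration):
# shadow thickness difference in pixels, orientation angle in degrees.
calib_diffs = [-4.0, -2.0, 0.0, 2.0, 4.0]
calib_angles = [-2.0, -1.0, 0.0, 1.0, 2.0]
k, b = fit_linear(calib_diffs, calib_angles)
angle = predict_angle(k, b, 3.0)
```

If the true relation is nonlinear, the same offline data could be used to fit a different parametric model in place of `fit_linear`.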

(2) Automatic grasping, efficient pose alignment, and insertion control methods are designed for a precision assembly system built around an industrial robot. Firstly, an automatic grasping control method based on monocular vision guidance is proposed, which includes three phases: aligning, approaching, and grasping. In the aligning phase, an image-based visual servo control law moves the end effector so that the component reaches the desired position in the camera frame. In the approaching phase, the end effector moves by a fixed position offset so that the gripper is aligned with the component. In the grasping phase, the gripper picks up the component via adsorption. Secondly, a method based on binocular vision guidance is proposed for the pose alignment of two components. Specifically, the 3D orientation of the component to be aligned is estimated, and the desired alignment pose is calculated. Image-based servoing is then used to achieve pose alignment. In addition, because adjusting the orientation of the end effector shifts the position of the component held by the gripper, the position offset is calculated and compensated based on the principle of differential motion. The method improves alignment efficiency compared with the traditional "orientation first, then position" approach. Finally, a model-based insertion control method is proposed. It establishes a radial adjustment model based on Hooke's law for elastic bodies, and an assembly control strategy is then designed on that model. Compared with traditional model-free insertion control methods, it improves assembly efficiency and compliance.
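The position offset caused by an orientation adjustment can be approximated with the standard differential motion relation: for a small rotation vector δ (in radians) about the end-effector frame origin, a point at lever arm r moves by approximately δ × r. The sketch below computes the compensating translation under that small-angle assumption. The function names and the numeric values are hypothetical and are not taken from the dissertation.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def offset_compensation(delta_rot, lever_arm):
    """Translation that cancels the drift of a held point.

    delta_rot: small rotation (rad) about the end-effector frame axes.
    lever_arm: position of the held point in the same frame (m).
    """
    drift = cross(delta_rot, lever_arm)  # first-order displacement
    return tuple(-c for c in drift)


# Hypothetical case: 0.01 rad about z with the component 50 mm along x.
comp = offset_compensation((0.0, 0.0, 0.01), (0.05, 0.0, 0.0))
```

Here the held point drifts by about 0.5 mm along y, so the compensation is about 0.5 mm in the opposite direction.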

(3) The dynamic contact in interference assembly is complex, and it is difficult to establish an accurate assembly model to describe it. Therefore, an assembly skill learning method based on deep deterministic policy gradient (DDPG) is designed for interference assembly. This method divides assembly skill learning into a pre-training phase and a self-learning phase. Firstly, in the pre-training phase, the actor network and critic network of DDPG are trained to bring the agent to the demonstrator's level. In addition, a data augmentation method is designed to expand the pre-training dataset. It is based on the state transition model and demonstrated assembly data, and it effectively reduces the number of demonstrations required. Secondly, during the self-learning phase, the agent explores the environment to learn the optimal assembly policy under the guidance of the reward function. Specifically, a hybrid exploration strategy that combines action-space exploration and parameter-space exploration is designed, which improves the efficiency and reduces the cost of self-learning. In addition, considering that insertion requires both safety and efficiency, a hierarchical reward function based on fuzzy control is designed, which enables the agent to learn a safe and efficient assembly policy.
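The two kinds of exploration named above can be sketched on a toy policy. The example below uses a linear policy a = W s as a stand-in for the actor network, perturbs the parameters (parameter-space exploration), then perturbs the resulting action (action-space exploration), and clips to actuator limits. How the dissertation actually mixes or schedules the two noise sources is not stated in this abstract; the function name, the noise scales, and the policy values are assumptions.

```python
import random


def hybrid_explore(weights, state, sigma_param, sigma_action, bound, rng):
    """One exploratory action from a linear policy a = W s.

    Parameter-space noise perturbs W; action-space noise perturbs the output.
    """
    actions = []
    for row in weights:
        # Parameter-space exploration: noisy copy of this row of W.
        noisy_row = [w + rng.gauss(0.0, sigma_param) for w in row]
        a = sum(w * s for w, s in zip(noisy_row, state))
        # Action-space exploration: additive noise on the action itself.
        a += rng.gauss(0.0, sigma_action)
        # Clip to the assumed actuator limits.
        actions.append(max(-bound, min(bound, a)))
    return actions


rng = random.Random(0)
W = [[0.5, -0.2], [0.1, 0.3]]  # hypothetical policy parameters
s = [1.0, 2.0]                 # hypothetical state
greedy = hybrid_explore(W, s, 0.0, 0.0, 1.0, rng)
noisy = hybrid_explore(W, s, 0.05, 0.02, 1.0, rng)
```

Setting both noise scales to zero recovers the deterministic action, which makes it easy to anneal exploration as learning progresses.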

(4) The assembly skill learning method proposed above relies on multiple demonstrations and an assembly transition model, and it needs to train the actor network and the critic network of the deep deterministic policy gradient framework simultaneously. An assembly skill learning method combining a knowledge transfer model and the normalized advantage function (NAF) is designed to overcome these shortcomings. This method decomposes the assembly policy into an initial policy and a residual policy. In the initial policy learning phase, a knowledge transfer model based on Gaussian mixture model / Gaussian mixture regression (GMM/GMR) is established. The model's parameters can be learned from only a few demonstrations, which greatly reduces the cost of learning the initial policy. In the residual policy learning phase, a reinforcement learning framework based on the normalized advantage function is established. Starting from the learned initial policy, the agent continuously optimizes the assembly policy through exploration in the environment, i.e., it learns the residual policy, and finally learns the optimal assembly policy. In addition, an adaptive action exploration strategy and a prioritized experience replay strategy are designed, which improve learning efficiency and make experience replay more effective by increasing the replay probability of better experiences.
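Raising the replay probability of better experiences can be sketched with the usual proportional prioritization rule, P(i) = p_i^α / Σ_k p_k^α. How the dissertation scores an experience as "better" is not stated in this abstract, so the priorities below are placeholder positive scores, and the function names and the value of α are assumptions.

```python
import random


def replay_probabilities(priorities, alpha=0.6):
    """P(i) = p_i**alpha / sum_k p_k**alpha (alpha = 0 gives uniform sampling)."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(buffer, priorities, batch_size, rng, alpha=0.6):
    """Draw a batch with replacement according to the replay probabilities."""
    probs = replay_probabilities(priorities, alpha)
    return rng.choices(buffer, weights=probs, k=batch_size)


buffer = ["t0", "t1", "t2", "t3"]   # placeholder transitions
priorities = [0.1, 0.5, 1.0, 2.0]   # hypothetical positive scores
probs = replay_probabilities(priorities)
batch = sample_batch(buffer, priorities, 8, random.Random(0))
```

The exponent α controls how strongly sampling favors high-priority experiences; smaller values move the distribution back toward uniform replay.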

Finally, the research results of this dissertation are summarized, and future research directions are discussed.

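Keywords: precision assembly; visual measurement; image-based servo control; assembly control; reinforcement learning; learning from demonstration; assembly skill learning
Language: Chinese
Seven major research directions, sub-direction category: Machine learning
Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/39205
Collection: 中科院工业视觉智能装备工程实验室_精密感知与控制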
Recommended citation
GB/T 7714
马燕芹. 面向过盈配合微器件的机器人装配技能学习及其应用[D]. 中科院自动化所. 中国科学院大学,2020.
Files in this item
File name/size: 面向过盈配合微器件的机器人装配技能学习及 (5940KB)
Document type: Dissertation
Access type: Open access
License: CC BY-NC-SA