Research on Interaction Intention Understanding and Tracking Control Methods for Service Robots (面向服务机器人的交互意图理解与跟踪控制方法研究)
Author: 李康 (Li Kang)
Date: 2019-05-23
Pages: 118
Degree type: Doctoral
Chinese Abstract (translated)

With the rapid development of artificial intelligence and robotics, expectations for service robots keep rising, service robots are beginning to enter our daily lives, and human-robot interaction technology has become a focus of attention in both academia and industry. For a service robot, interacting with people naturally and efficiently is the most critical capability for providing services: it requires not only accurately perceiving the information conveyed by users and understanding their interaction intentions, but also responding to users appropriately and making intelligent behavioral decisions.

In recent years, human-robot interaction technology has made progress. For understanding user interaction intentions, many core algorithms originate from pattern recognition, but transferring them directly to service-robot interaction applications raises various problems, and research on interaction intention inference and understanding methods suited to service robots still has a long way to go. For robot behavior decision-making, different service tasks pose different technical challenges, and existing robots are far from the level of intelligence needed to adapt autonomously to various situations. Therefore, studying human-robot interaction technology for service robots, so as to further improve their intelligence and flexibility, has important theoretical significance and application value for promoting the adoption of service robots in everyday life.

This thesis studies user interaction intention understanding for service robots and the target tracking control method used in their behavior decision-making. The main work is summarized as follows:

First, the problem of inferring, before an interaction starts, whether a user intends to interact with the robot is studied. Imitating how humans select interaction partners when initiating interaction, an interactive object selection method based on bimodal information analysis is proposed, which enables the robot to actively perceive and choose whom to interact with, instead of passively waiting to be called as traditional robots do. The method abstracts a model of how humans choose interaction partners and uses visual and laser modalities to achieve 360-degree pedestrian perception around the robot. An intention inference method based on humanoid interaction intention features and a random forest regression model (Humanoid Interaction Intention Feature and Random Forest Regression Model, HIIF-RFRM) is proposed, and priority rules for interaction intention inference are designed. Finally, the effectiveness of the proposed method is verified through interaction partner selection experiments, conducted before interaction begins, in multi-person daily-life scenarios.

Second, the problem of understanding the user's limb interaction intention during interaction is studied. To address the problems of model-input extraction and interaction interruption that arise when traditional limb-motion recognition algorithms are transferred directly to robot limb interaction, a limb interaction intention understanding method based on bidirectional long short-term memory networks and humanoid decision-making task interruption (Bidirectional Long-Short-Term Memory networks and Humanoid Decision-making Task Interruption, BLSTM-HDTI) is proposed. The method designs an interaction start/end detection scheme based on 3D human skeleton points, enabling interactions to be opened and interrupted quickly. Skeleton connectivity vectors that encode the spatial relations among limbs are designed as the model input, and a three-layer BLSTM network capable of temporal analysis of limb motion is built to recognize the user's limb interaction intention. Imitating how humans intelligently interrupt decision-making tasks, an interruption function is added to the designed limb interaction interface. Finally, the effectiveness of the proposed method is verified through limb interaction interruption experiments.

Third, the target tracking problem in the interaction behavior decision-making of service robots is studied. A target tracking control method based on dual-model fusion is proposed. The method first improves the traditional spatio-temporal context tracker and builds a middle-level feature spatio-temporal context (Middle-level Feature Spatio-Temporal Context, MFSTC) model to improve target localization accuracy under illumination and appearance changes. A fusion mechanism is then designed to incorporate a mean-shift model based on 3D information (Mean-Shift based on 3D information, MS3D), improving tracking under deformation and occlusion and exploiting the complementary strengths of the two trackers. In addition, a target tracking controller based on visual servoing is designed so that the robot can follow the user safely and stably. Finally, the effectiveness of the proposed method is verified through comparative experiments against representative methods on a recent RGB-D dataset and through user-following experiments in a variety of real scenarios.

Fourth, a natural and efficient human-service-robot interaction system is designed and developed. The system includes two interaction modes: a limb interaction mode for able-bodied users and an eye-movement interaction mode for users with disabilities. Based on the interaction partner selection, limb intention understanding, and target tracking control methods proposed in this thesis, combined with the designed robot limb-interaction responses, the limb interaction mode between able-bodied users and the service robot is realized. Considering that people with physical impairments may find limb interaction inconvenient, an eye-movement interaction mode based on eye-movement intention understanding and robot obstacle-avoidance and grasping decisions is designed. An eye-movement interaction scheme based on the eTracker gaze-region tracking model is proposed, covering automatic acquisition of model inputs with blink preprocessing, eTracker gaze-region tracking and analysis, and two eye-movement interaction interfaces. Finally, the effectiveness of the interaction system is verified through human-robot interaction experiments in both modes in real scenarios.

English Abstract

With the rapid development of artificial intelligence and robotics, public expectations for service robots are rising steadily. Service robots are beginning to enter our daily lives, and human-robot interaction technology has become a hot topic in both academia and industry. For a service robot, interacting with people naturally and efficiently is the most critical capability for providing services: it requires not only accurately perceiving the information conveyed by users and understanding their interaction intentions, but also responding to users appropriately and making intelligent behavioral decisions.

In recent years, human-robot interaction technology has made progress. For robots to understand user interaction intentions, many core algorithms are borrowed from pattern recognition, but migrating them directly to service-robot interaction applications raises various problems, and research on interaction intention inference and understanding methods suited to service robots still has a long way to go. In terms of behavioral decision-making, different service tasks pose different technical challenges, and existing robots are far from the level of intelligence needed to adapt autonomously to various situations. Therefore, studying human-robot interaction technology for service robots, so as to further improve their intelligence and flexibility, has important theoretical significance and application value for promoting the adoption of service robots in real life.

This thesis studies user interaction intention understanding for service robots and the target tracking control method used in their behavior decision-making. The main work is summarized as follows:

Firstly, the problem of inferring, before an interaction starts, whether a user intends to interact with the robot is studied. Imitating how humans select interaction partners when initiating interaction, this thesis proposes an interactive object selection method based on bimodal information analysis, which enables the robot to actively perceive and choose whom to interact with instead of passively waiting to be called, as traditional robots do. The method abstracts a model of how humans choose interaction partners and uses visual and laser modalities to achieve 360-degree pedestrian perception around the robot. An intention inference method based on humanoid interaction intention features and a random forest regression model (HIIF-RFRM) is proposed, and priority rules for interaction intention inference are designed. Finally, the effectiveness of the proposed method is verified by interaction partner selection experiments in multi-person daily-life scenarios.
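
As a rough illustration of how such a scored partner-selection step could be organized, the sketch below fits a scikit-learn random forest regressor on hand-crafted per-pedestrian features and picks the highest-scoring candidate. The feature list, the 0.5 threshold, and the distance tie-break (a stand-in for the thesis's priority rules) are illustrative assumptions, not the actual design of HIIF-RFRM.

```python
# Minimal sketch of an intention-scoring selection step (assumed details,
# not the thesis's exact feature set or rules).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

FEATURE_NAMES = ["distance_m", "body_orientation_rad", "face_visible",
                 "approach_speed_mps", "gaze_toward_robot"]   # assumed features

def train_intention_regressor(X, y):
    """Fit a forest that maps feature vectors (n_samples, n_features) to
    annotated intention scores in [0, 1]."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, y)
    return model

def select_interaction_partner(model, pedestrians, score_threshold=0.5):
    """Score every perceived pedestrian (dicts holding a 'features' list ordered
    as FEATURE_NAMES) and return the most likely interaction partner, or None."""
    if not pedestrians:
        return None
    scores = model.predict(np.array([p["features"] for p in pedestrians]))
    ranked = sorted(zip(pedestrians, scores),
                    key=lambda ps: (-ps[1], ps[0]["features"][0]))  # score, then proximity
    best, best_score = ranked[0]
    return best if best_score >= score_threshold else None
```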

Secondly, the problem of understanding the user's limb interaction intention during the interaction process is studied. When traditional limb-motion recognition algorithms are transferred directly to robot limb interaction applications, problems of model-input extraction and interaction interruption arise. To address them, a limb interaction intention understanding method based on a bidirectional long short-term memory network and humanoid decision-making task interruption (BLSTM-HDTI) is proposed. The method designs an interaction start/end detection scheme based on 3D human skeleton points, enabling interactions to be opened and interrupted quickly. Skeleton connectivity vectors that encode the spatial relations among limbs are designed as the model input, and a three-layer BLSTM network capable of temporal analysis of limb motion is built to recognize the user's limb interaction intention. Imitating the way humans interrupt decision-making tasks, an interruption function is added to the designed limb interaction interface. Finally, the effectiveness of the proposed method is verified by limb interaction interruption experiments.
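
To make the model structure concrete, here is a minimal PyTorch sketch of a three-layer bidirectional LSTM classifier over sequences of skeleton "connectivity" vectors (concatenated joint-to-joint offsets computed from 3D skeleton points). The bone pairs, the 60-dimensional input, the hidden size, and the number of gesture classes are illustrative assumptions rather than the thesis's configuration.

```python
# Minimal sketch: skeleton connectivity vectors fed to a 3-layer BLSTM
# (dimensions and class count are assumed for illustration).
import torch
import torch.nn as nn

def connectivity_vectors(joints_xyz, bone_pairs):
    """Turn a (time, num_joints, 3) tensor of 3D skeleton points into a
    (time, 3 * len(bone_pairs)) sequence of joint-to-joint offset vectors,
    which encode the spatial relations between connected limbs."""
    offsets = [joints_xyz[:, child] - joints_xyz[:, parent]
               for parent, child in bone_pairs]
    return torch.cat(offsets, dim=-1)

class BLSTMGestureClassifier(nn.Module):
    def __init__(self, input_dim=60, hidden_dim=128, num_classes=10):
        super().__init__()
        self.blstm = nn.LSTM(input_dim, hidden_dim, num_layers=3,
                             bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                 # x: (batch, time, input_dim)
        out, _ = self.blstm(x)            # out: (batch, time, 2 * hidden_dim)
        return self.head(out[:, -1])      # class logits from the last time step
```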

Thirdly, the target tracking problem in the interaction behavior decision-making of service robots is studied. A target tracking control method based on dual-model fusion is proposed. The method first improves the traditional spatio-temporal context tracker and constructs a middle-level feature spatio-temporal context (MFSTC) model to improve target localization accuracy under illumination and appearance changes. A fusion mechanism is then designed to incorporate a mean-shift model based on 3D information (MS3D), improving tracking performance under deformation and occlusion and fully exploiting the complementary strengths of the two trackers. In addition, a target tracking controller based on visual servoing is designed so that the robot can follow the user safely and stably. Finally, the effectiveness of the proposed method is verified by comparative experiments with representative methods on a recent RGB-D dataset and by user-following experiments in a variety of real scenarios.
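
As a generic illustration of the two ingredients, the sketch below shows (a) a confidence-gated fusion rule between an appearance-based tracker (standing in for MFSTC) and a depth-based mean-shift tracker (standing in for MS3D), and (b) a proportional visual-servo following controller. The confidence gate, gains, speed limits, and the 1.2 m following distance are assumptions for the example, not parameters from the thesis.

```python
# Minimal sketch of confidence-gated tracker fusion and a proportional
# visual-servo following controller (all thresholds/gains are assumed).

def fuse_tracker_outputs(stc_box, stc_conf, ms3d_box, ms3d_conf, gate=0.4):
    """Pick one bounding box per frame: prefer the appearance tracker when it
    is confident (robust to illumination/appearance change), otherwise fall
    back to the 3D mean-shift result (robust to deformation/occlusion)."""
    if stc_conf >= gate and stc_conf >= ms3d_conf:
        return stc_box
    if ms3d_conf >= gate:
        return ms3d_box
    return None  # both unreliable: treat the target as lost and re-detect

def following_controller(target_u, image_width, target_depth_m,
                         desired_depth_m=1.2, k_ang=1.5, k_lin=0.8,
                         max_ang=1.0, max_lin=0.6):
    """Return (linear, angular) velocity commands that keep the tracked user
    centred in the image and at the desired following distance."""
    # Horizontal image error, normalised to [-1, 1]; steer to re-centre the target.
    err_u = (target_u - image_width / 2.0) / (image_width / 2.0)
    angular = max(-max_ang, min(max_ang, -k_ang * err_u))

    # Range error from the depth sensor; drive forward/backward to hold distance.
    err_d = target_depth_m - desired_depth_m
    linear = max(-max_lin, min(max_lin, k_lin * err_d)) if target_depth_m > 0 else 0.0
    return linear, angular
```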

Fourthly, a natural and efficient interaction system between humans and a service robot is designed and developed. The system includes two interaction modes: a limb interaction mode for able-bodied users and an eye-movement interaction mode for users with disabilities. Based on the interactive object selection, limb intention understanding, and target tracking control methods proposed in this thesis, combined with the designed robot limb-interaction responses, the limb interaction mode between able-bodied users and the service robot is realized. Considering that people with physical impairments may find limb interaction inconvenient, an eye-movement interaction mode based on eye-movement intention understanding and robot obstacle-avoidance and grasping decisions is designed. An eye-movement interaction scheme based on the eTracker gaze-region tracking model is proposed, covering automatic acquisition of model inputs with blink preprocessing, eTracker gaze-region tracking and analysis, and two eye-movement interaction interfaces. Finally, the effectiveness of the interaction system is verified by human-robot interaction experiments in both modes in real scenes.
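
As an illustration of how a gaze-region interface of this kind might drive a robot, the sketch below turns a per-frame gaze-region prediction (e.g. from a gaze-region classifier such as an eTracker-style model) into a discrete command after a dwell period, skipping blink frames. The 3x3 region layout, the command mapping, and the 15-frame dwell window are hypothetical choices for the example only.

```python
# Minimal sketch of a dwell-based gaze-region command interface
# (region layout, commands and dwell length are assumed).
from collections import deque

REGION_TO_COMMAND = {            # hypothetical mapping for a 3x3 gaze grid
    0: "move_left", 1: "move_forward", 2: "move_right",
    3: "turn_left", 4: "stop",         5: "turn_right",
    6: "grasp_left", 7: "grasp_front", 8: "grasp_right",
}

class GazeCommandInterface:
    def __init__(self, dwell_frames=15):
        self.history = deque(maxlen=dwell_frames)

    def update(self, region, is_blink):
        """Feed one frame of gaze output; return a command only after the same
        region has been fixated for a full dwell window (blinks are ignored)."""
        if is_blink:                      # blink preprocessing: drop blink frames
            return None
        self.history.append(region)
        if len(self.history) == self.history.maxlen and len(set(self.history)) == 1:
            self.history.clear()          # reset so each dwell fires one command
            return REGION_TO_COMMAND.get(region)
        return None
```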

Keywords: coexisting-cooperative robots (共融机器人, service robots); robot vision; interaction intention understanding; behavior decision-making; target tracking control
Language: Chinese
Sub-direction classification: Intelligent Robots
Document type: Dissertation
Identifier: http://ir.ia.ac.cn/handle/173211/23791
Collection: Doctoral Dissertations (Graduates)
Recommended citation (GB/T 7714):
李康. 面向服务机器人的交互意图理解与跟踪控制方法研究[D]. 中国科学院自动化研究所, 2019.
Files in this item
File name/size | Document type | Version | Access | License
Thesis.pdf (13789 KB) | Dissertation | — | Restricted access | CC BY-NC-SA