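Research on Human-like Learning Mechanisms and Methods for Interaction-Oriented Sequential Actions (面向交互的序列动作类人学习机制与方法研究)
赵博程 (Zhao Bocheng)
Subtype: Doctoral
Thesis Advisor: 陶建华 (Tao Jianhua)
2019-12-03
Degree Grantor: Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Name: Doctor of Engineering
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: imitation learning, brain-inspired learning, non-rigid point set registration, stroke order recovery, handwriting style imitation, knowledge decoupling, sequence perceptron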
Abstract

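Interactive human-like learning of sequential actions is one of the key problems in imitation learning. Interactive learning algorithms extract, model, and analyze a user's personalized action sequences so that a robot or other interactive system can imitate those sequences and interact with the user. This in turn allows such systems to imitate human actions from learning samples that the user "demonstrates" on the spot. As an important direction in artificial intelligence, imitation learning is an innovative attempt to let machine learning algorithms interact with the real physical environment. This thesis takes human-like action trajectories as the entry point for interaction and style imitation of online handwritten character sequences as the concrete application, and presents four main parts on interactive imitation: an evaluation method for imitation learning, an augmentation algorithm for online handwriting databases, a handwriting imitation learning method, and a brain-inspired model for learning handwriting style. The handwriting imitation learning method is the core of the thesis's study of human-like action sequences; the evaluation method measures the output of imitation learning algorithms; the database augmentation algorithm aims to generate large amounts of online handwriting data for imitation learning algorithms to use; and the brain-inspired handwriting style learning model is an important exploratory study in handwriting style feature extraction that borrows from the way the brain works in order to reduce the computation and model complexity of imitation learning. Based on these four points, this thesis contributes four original results.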

1. This thesis proposes a handwriting similarity evaluation method based on non-rigid point set registration

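Because keypoint registration for Chinese characters is strongly non-rigid and the keypoints are sparsely distributed, traditional methods perform poorly on registering Chinese character keypoints. The proposed method removes the traditional methods' dependence on geometric image features and local similarity, and substantially improves registration results in experiments.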

2. This thesis proposes an online handwriting database augmentation method based on recovering stroke order from images

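To address the current lack of large online handwriting databases, this thesis proposes a method for converting offline handwriting images into online character sequence information. The algorithm framework consists of a static energy network, a dynamic energy network, and a search traversal framework. Compared with traditional stroke recovery algorithms for English and other alphabetic scripts, the proposed method substantially improves recovery accuracy on Chinese character data, and can to some extent serve to augment Chinese handwriting databases.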

3. This thesis proposes an attention-based stylized handwriting imitation algorithm

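Starting from stylized imitation of human online handwriting, this thesis proposes a stroke generation framework that, by observing a small number of a user's handwritten Chinese character samples, can reproduce 3755 commonly used Chinese characters in that user's handwriting. Previously, stroke generation research mainly aimed at improving character recognition accuracy, while research on handwriting style concentrated on writer identification. This thesis unifies stroke generation and handwriting stylization in a new attention-based framework, and proposes a dual-condition gated recurrent unit network that generates Chinese character stroke sequences in a specific handwriting style. The results were checked by three methods (a Turing test, handwriting feature point registration, and writer identification) and were satisfactory.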

4. This thesis proposes a brain-inspired algorithm for analyzing handwriting style features

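Inspired by how the brain works, this thesis proposes a knowledge decoupling framework for extracting handwriting style. In writer identification, traditional methods usually need about 15 input Chinese characters to ensure that the extracted writer style information is accurate. The proposed algorithm first models the temporal information of strokes in parallel, then uses knowledge decoupling to separate prior knowledge (character content) from posterior knowledge (writer style). This substantially improves on the low-resource identification accuracy of traditional writer identification methods: with only 1 input character it reaches the accuracy that traditional methods achieve with more than 10. The method also outperforms existing methods in handwriting recognition. In addition, because of the nature of the knowledge decoupling framework, the method separates the total number of model parameters from the number of parameters needed for a single training run, which substantially reduces training and prediction time and computing resource consumption.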
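
To make the idea of scoring handwriting similarity through keypoint matching (contribution 1) more concrete, the following is a minimal baseline sketch. It is not the thesis's non-rigid registration method, which the abstract does not specify; it swaps in a plain symmetric nearest-neighbour (Chamfer) distance after normalizing position and scale. The function names `normalize` and `chamfer_distance` are hypothetical.

```python
import math


def normalize(points):
    """Center a 2-D point set on its centroid and scale it to unit RMS radius."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    rms = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    scale = rms if rms > 0 else 1.0  # avoid dividing by zero for a degenerate set
    return [(x / scale, y / scale) for x, y in centered]


def chamfer_distance(a, b):
    """Symmetric mean nearest-neighbour distance between two keypoint sets.

    Assumption: this is a simple stand-in baseline, not a non-rigid
    registration. Lower values mean the two keypoint layouts are more alike.
    """
    a, b = normalize(a), normalize(b)

    def one_way(src, dst):
        # For each source point, distance to its closest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(a, b) + one_way(b, a))
```

A baseline like this only undoes translation and scale. It cannot absorb the local, stroke-level deformations that the abstract describes as strongly non-rigid, which is the gap the proposed registration method is meant to address.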

Other Abstract

Interactive learning of human-like motion sequences is an important part of imitation learning. Interactive learning algorithms enable interaction between a robot or other system and a human by extracting, modeling, and analyzing the user's stylized motion sequences. This can further help an interactive system imitate human actions from the user's "demonstration", which is an important basis for deeper study of machine learning. As a significant branch of artificial intelligence, imitation learning is an innovative exploration of the interaction between machine learning algorithms and the real world. This thesis takes online handwritten character motion sequences as its main object of study and makes four main contributions: an evaluation method for handwriting imitation; online handwriting database enhancement; a handwriting imitation method via deep attention networks; and a brain-inspired model for learning human handwriting calligraphic features. The handwriting imitation method is the core idea of the thesis; the evaluation method is an important tool for analyzing the output of handwriting imitation; and the brain-inspired calligraphic feature learning model is an exploration of writing style extraction. Based on these four points, this thesis contributes four original achievements.

1. This thesis proposes a non-rigid point matching algorithm to evaluate imitation results

Because the point sets are strongly non-rigid and sparse, traditional point matching algorithms perform poorly. The proposed method avoids the heavy reliance of traditional algorithms on geometric features and local similarity, and achieves precise matching results.

2. This thesis proposes an online handwriting database enhancement method based on recovering stroke drawing order

To address the lack of online handwriting databases, this thesis introduces a method for transforming offline handwriting image data into online sequence data. The method recovers stroke order using a deep convolutional neural network. The proposed framework includes a static energy neural network, a dynamic energy neural network, and a novel point searching process. Compared with existing drawing order recovery (DOR) methods for English handwriting, our algorithm significantly improves DOR performance on a Chinese database.

3. This thesis proposes a stylized handwriting imitation method via deep attention networks

This thesis proposes a handwriting imitation method that can generate 3755 different stylized Chinese characters by analyzing a dozen handwriting image samples. Previous character generation work mainly focuses on improving character recognition. In addition, most research on calligraphic style features comes from writer identification algorithms. This thesis builds an attention-based framework that combines stroke generation with calligraphic features. Stroke generation is realized by a dual conditional gated unit, which draws stylized handwriting sequences. The experimental results passed a writer identification (writerID) test, a point matching test, and a subjective test.

4. This thesis proposes a brain-inspired calligraphic feature analysis method

This thesis introduces a knowledge decoupling framework to extract calligraphic features. Traditional writerID methods need more than 15 character samples as input to ensure high accuracy, because online handwriting data contain both character and style information. The proposed method first models the temporal sequence input in parallel, then significantly improves writerID accuracy when few input samples are available by separating prior knowledge (character index) from posterior knowledge (calligraphy style). With 1 input character, our method surpasses other methods given 10 input characters. The proposed algorithm can also be extended to online character recognition, where it performs competitively against the state of the art. In addition, owing to knowledge decoupling, our method significantly reduces the number of parameters in the training process, which drastically lowers training time and computing cost.
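
As a rough illustration of how an attention-based framework (contribution 3) might weight a handful of style samples when generating a character, here is a minimal sketch of scaled dot-product attention for a single query. The abstract does not state which form of attention the thesis uses, so this is a generic stand-in; `softmax`, `attend`, and the roles assigned to `query`, `keys`, and `values` are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Assumed roles (hypothetical, not from the thesis):
      query  - features of the character about to be drawn
      keys   - features of each handwriting sample from the user
      values - style vectors extracted from those samples
    Returns the attention weights and the weighted style context vector.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context
```

In a sketch like this, samples whose features align more closely with the query get larger weights, so the style context is dominated by the most relevant samples rather than being a plain average.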

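Pages: 144
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/28351
Collection: Graduates_Doctoral Dissertations (毕业生_博士学位论文)
Recommended Citation
GB/T 7714
赵博程 (Zhao Bocheng). Research on Human-like Learning Mechanisms and Methods for Interaction-Oriented Sequential Actions (面向交互的序列动作类人学习机制与方法研究) [D]. Institute of Automation, Chinese Academy of Sciences, 2019.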
Files in This Item:
File Name/Size: Thesis.pdf (13658KB)
DocType: Thesis
Access: Not yet open
License: CC BY-NC-SA