Research on Stroke Analysis Methods for Chinese Handwritten Character Images (中文手写字符图像的笔划分析方法研究)


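Title	Research on Stroke Analysis Methods for Chinese Handwritten Character Images
Author	Wang Tieqiang (王铁强)
Date	2021-06
Pages	150
Degree type	Doctoral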
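Abstract (translated from Chinese)	Stroke extraction and analysis of Chinese handwritten character images play an important role in applications such as writing quality evaluation, personalized handwritten font synthesis, and handwriting education. However, although handwritten Chinese character recognition has made great progress, stroke extraction and analysis remain poorly automated and are highly challenging because of scarce data, difficult modeling, and ambiguous evaluation criteria. This dissertation studies stroke analysis methods for Chinese handwritten character images, covering character image skeletonization, stroke extraction, irregular stroke detection, and handwriting trajectory recovery, with the aim of supporting stroke-based handwriting analysis applications. The main contributions are as follows: 1. A skeletonization method for Chinese handwritten character images based on a fully convolutional network is proposed. Skeletonizing the character image simplifies the subsequent stroke extraction. Because earlier skeletonization methods tend to produce distortion in cross regions and spurious branches, this dissertation proposes a skeletonization method based on a fully convolutional network (FCN). To support network training and performance evaluation, it proposes synthesizing images from online Chinese handwritten characters, which overcomes the difficulty of pixel-level annotation, and the model trained on synthetic data generalizes successfully to real images. Experiments show that, when faced with complex stroke shapes, varying stroke widths, and non-smooth stroke edges, the method clearly outperforms existing methods in both quantitative evaluation and visual quality. 2. A stroke extraction and matching method based on query point guidance and heuristic search is proposed. The method first detects the cross regions between strokes with a fully convolutional network and then splits the skeletonized strokes into stroke segments. Next, a query-point-guided path network (PathNet) evaluates the consistency of merging stroke segments, candidate strokes built from a table of stroke segment subsets are dynamically matched against the strokes of a character template, and heuristic search yields the optimal stroke extraction and matching result. Experimental results demonstrate the effectiveness of the method and show that template matching markedly improves stroke extraction performance. The work also provides a standard dataset and a series of benchmark results for related research. 3. An irregular stroke detection method based on the interpretability of character classifiers is proposed. Experimental observation shows that irregular strokes in handwritten characters lower the confidence of a convolutional neural network character classifier on the ground-truth class. Starting from the interpretability of the character classifier, the method first measures, by retaining and removing pixels, how each pixel affects the classification confidence on the correct class. It then aggregates connected pixels that negatively affect the confidence on the ground-truth class to detect irregular strokes or stroke segments. Experimental results demonstrate the effectiveness of the method, and the work also provides a standard dataset and benchmark results for related research. 4. A handwriting trajectory recovery method based on a convolutional pointer network and heuristic search is proposed. The method defines handwriting trajectory recovery as a hybrid task of stroke segment ordering and stroke segment starting point detection. It models the former with a convolutional pointer network (Conv-Ptr-Net) and the latter with an FCN, and unifies the two in a heuristic search framework that obtains the optimal handwriting trajectory through combinatorial optimization. Experiments show that the method achieves good trajectory recovery on Chinese handwritten character data with a large category set and generalizes successfully to other languages.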
English abstract	Stroke extraction and analysis of Chinese handwritten character images play an important role in applications such as writing quality evaluation, personalized font synthesis, and Chinese elementary education. However, despite the great progress in handwritten Chinese character recognition, stroke extraction and analysis remain far less automated because of scarce data, difficulties in modeling, and unspecific evaluation criteria. This dissertation studies stroke analysis methods for Chinese handwritten character images, including character image skeletonization, stroke extraction, irregularly written stroke detection, and handwritten trajectory recovery. The major contributions are as follows: 1. A skeletonization method for Chinese handwritten character images based on a fully convolutional network (FCN) is proposed. Skeletonization reduces the strokes in a character image to a width of one pixel, which supports various stroke analysis tasks. The proposed method uses an FCN as the primary skeletonization module, together with a strategy for synthesizing pixel-level annotated training data from online handwritten Chinese characters. Experimental results show that the proposed skeletonization method significantly outperforms existing methods, especially for complex stroke shapes, variable stroke widths, and heavily non-smooth stroke edges. 2. A stroke extraction and matching method based on query point guidance and heuristic search is proposed. The method first detects the cross regions among strokes with a fully convolutional network and then divides the skeletonized strokes into stroke segments. A query-point-guided path network (PathNet) is then used to evaluate the merging consistency of stroke segments. Finally, the candidate strokes constructed from stroke segments are dynamically matched with the character template strokes, and the optimal stroke extraction and matching results are obtained through heuristic search. Experimental results demonstrate the effectiveness of this method and indicate that template matching improves stroke extraction significantly. This work also provides a standard dataset and a series of benchmark results for related research. 3. An irregularly written stroke detection method based on the interpretability of deep classifiers is proposed. Experimental observations show that irregularly written strokes in handwritten character images lower the confidence that a deep convolutional neural network assigns to the ground-truth category. Based on this, the proposed method first measures the impact of pixels on the classification confidence by retaining and removing pixels. Connected pixels that negatively affect the confidence are then merged into the detected irregular strokes or stroke segments. Experimental results demonstrate the effectiveness of the proposed method, and the work also provides a standard dataset and benchmark results for related research. 4. A handwritten trajectory recovery method based on a convolutional pointer network and heuristic search is proposed. The method defines handwritten trajectory recovery as a hybrid task of stroke segment ordering and starting point detection for each stroke segment. A convolutional pointer network (Conv-Ptr-Net) and an FCN are used to model stroke segment ordering and starting point detection, respectively. The two sub-tasks are integrated into a heuristic search framework to obtain the optimal handwritten trajectory. Experiments demonstrate the effectiveness of the proposed method on a large set of Chinese handwritten character data and its applicability to other languages.
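Keywords	Chinese handwritten character image skeletonization; stroke cross-region detection; stroke extraction; stroke matching; handwriting trajectory recovery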
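Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/45023
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Pattern Analysis and Learning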
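Recommended citation (GB/T 7714)	Wang Tieqiang. Research on Stroke Analysis Methods for Chinese Handwritten Character Images[D]. Fifth Conference Room, 3rd Floor, Intelligent Building (智能化大厦), Institute of Automation, Chinese Academy of Sciences. Institute of Automation, Chinese Academy of Sciences, 2021.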
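
The abstract's third contribution describes measuring how each pixel affects the classifier's confidence on the true class and then grouping connected pixels with a negative effect. The following is a minimal sketch of that general idea, not the dissertation's implementation. It assumes a binary image stored as nested lists and replaces the CNN classifier with a toy template-overlap score. All names here (`harmful_pixels`, `connected_groups`, `toy_confidence`) are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Callable, List, Set, Tuple

Image = List[List[int]]   # 1 = ink, 0 = background
Pixel = Tuple[int, int]   # (row, col)


def harmful_pixels(image: Image, confidence: Callable[[Image], float]) -> Set[Pixel]:
    """Return ink pixels whose removal raises the true-class confidence."""
    base = confidence(image)
    harmful = set()
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value != 1:
                continue
            image[r][c] = 0                 # temporarily remove the pixel
            if confidence(image) > base:    # the score improves without it
                harmful.add((r, c))
            image[r][c] = 1                 # restore the pixel
    return harmful


def connected_groups(pixels: Set[Pixel]) -> List[Set[Pixel]]:
    """Group pixels into 8-connected components with breadth-first search."""
    remaining = set(pixels)
    groups = []
    while remaining:
        seed = remaining.pop()
        group, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        group.add(neighbor)
                        queue.append(neighbor)
        groups.append(group)
    return groups


def toy_confidence(template: Image) -> Callable[[Image], float]:
    """Assumed stand-in for a classifier: fraction of pixels matching a template."""
    total = len(template) * len(template[0])

    def score(image: Image) -> float:
        mismatches = sum(a != b
                         for img_row, tpl_row in zip(image, template)
                         for a, b in zip(img_row, tpl_row))
        return 1.0 - mismatches / total

    return score


# Toy data: the template is a vertical bar; the sample adds a stray stub to it.
template = [[0, 0, 1, 0, 0] for _ in range(5)]
sample = [row[:] for row in template]
sample[1][3] = sample[1][4] = 1
regions = connected_groups(harmful_pixels(sample, toy_confidence(template)))
print(regions)
```

In this toy setting only the stray stub is flagged, because removing a bar pixel lowers the score. With a real classifier the score function would be a forward pass, and removing patches rather than single pixels may be needed for the confidence to change measurably.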

Files in this item
File name/size	Document type	Version type	Access type	License
中文手写字符图像的笔划分析方法研究.pd (6299KB)	Dissertation		Open access	CC BY-NC-SA