Research on Deep Generative Algorithms for Online Handwritten Chinese Data (面向联机手写中文数据的深度生成算法研究)
任敏思
2024-05
Pages: 64
Subtype: Master's
Abstract

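In recent years, with the development and spread of mobile touch devices such as phones, tablets, and electronic blackboards, handwritten data has become increasingly common in daily life, so the analysis and processing of handwritten data is a research problem with broad applications. Reading and writing are also two important abilities of human intelligence; in artificial intelligence they correspond to the recognition and the generation of document data, respectively. From the standpoint of research significance, handwritten data generation therefore simulates an important human ability and is an interesting and creative task. Handwritten data generally has two representations: the first is based on raster images and is called offline handwritten data; the second is based on sequences of trajectory points and is called online handwritten data. This thesis focuses on generating online handwritten Chinese data. The main results are as follows:

1. We propose a method for stylized online handwritten Chinese character generation based on a conditional diffusion model. We apply a non-autoregressive diffusion model to the task of generating trajectory point sequences and, by adding two modules, a character structure encoding dictionary and a writing style encoder, generate characters of a specified class while imitating a specified writing style. A series of experiments on the public handwriting dataset CASIA-OLHWDB verifies the effectiveness of the method, which outperforms autoregressive generation with the same network architecture.

2. We implement text-line-level online handwritten Chinese generation, a task that has rarely been explored. We propose a hierarchical generation method. First, a layout planner module performs in-context learning from the target text content and a given reference sample of handwritten text-line style, and generates a bounding box for each character in the line. A character generator based on a conditional diffusion model then generates the corresponding characters at the box positions one after another, completing the full text line. Decoupling the generation of character position and size from the generation of character structure makes full text-line generation more controllable and robust.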

Other Abstract

In recent years, with the development and popularization of mobile touch devices such as smartphones, tablets, and electronic whiteboards, handwritten data has become increasingly prevalent in our daily lives. Consequently, the analysis and processing of handwritten data has become a research problem with broad application scenarios. Reading and writing are also two crucial abilities of human intelligence, which in artificial intelligence correspond respectively to the recognition and generation of document data. From the perspective of research significance, handwritten data generation therefore simulates an important capability of human intelligence and is an intriguing and creative task. Handwritten data can generally be represented in two ways: offline handwritten data, based on raster images, and online handwritten data, based on sequences of trajectory points. This thesis focuses on the generation of online handwritten Chinese data, and its main contributions are as follows:

1. We propose a method for stylized online handwritten Chinese character generation based on conditional diffusion models. We apply a non-autoregressive diffusion model to the task of generating trajectory point sequences and, by adding two modules, a character structure encoding dictionary and a handwriting style encoder, generate characters of a specified category while imitating a specified writing style. A series of experiments on the publicly available handwriting dataset CASIA-OLHWDB validates the effectiveness of the method and shows better performance than autoregressive generation methods with the same network architecture.

2. We implement text-line-level online handwritten Chinese generation, a rarely explored task. We propose a hierarchical generation method in which a layout planner module performs in-context learning based on the target text content and a given handwritten text-line style reference, generating a bounding box for each character in the text line. A character generator based on conditional diffusion models then generates the corresponding characters at the box positions in sequence, completing the entire text line. Decoupling the generation of character position and size from the generation of character structure makes the generation of complete text lines more controllable and robust.
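The decoupling described in the second contribution can be illustrated with a minimal sketch. It is not the thesis implementation: the layout planner and the diffusion-based character generator are learned models, and here their outputs are replaced by hand-written stand-ins. The names `Box`, `place_in_box`, and `compose_line`, and the `(x, y, pen_up)` point format, are assumptions made only for this illustration. The sketch shows the final composition step, where each character trajectory, drawn in a unit square, is scaled and shifted into its planned box.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed point format: (x, y, pen_up), where pen_up = 1 marks the last point of a stroke.
Point = Tuple[float, float, int]


@dataclass
class Box:
    """Hypothetical layout-planner output for one character."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def place_in_box(char_points: List[Point], box: Box) -> List[Point]:
    """Map a character drawn in the unit square [0, 1] x [0, 1] into `box`."""
    return [(box.x + px * box.w, box.y + py * box.h, pen)
            for px, py, pen in char_points]


def compose_line(chars: List[List[Point]], boxes: List[Box]) -> List[Point]:
    """Concatenate per-character trajectories after placing each in its box."""
    if len(chars) != len(boxes):
        raise ValueError("need exactly one box per character")
    line: List[Point] = []
    for points, box in zip(chars, boxes):
        line.extend(place_in_box(points, box))
    return line


# Stand-in for generated characters: one diagonal stroke per character.
stroke = [(0.0, 0.0, 0), (1.0, 1.0, 1)]
# Stand-in for planned layout: two boxes of different position and size.
boxes = [Box(0.0, 0.0, 10.0, 10.0), Box(12.0, 1.0, 8.0, 8.0)]
line = compose_line([stroke, stroke], boxes)
```

Because position and size live only in the boxes, the same character trajectory can be reused at any location or scale, which is the sense in which layout and character structure are decoupled in this sketch.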

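Keywords: deep generative models; online handwritten Chinese generation; conditional diffusion models
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/56675
Collection: Graduates / Master's Theses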
Recommended Citation
GB/T 7714
任敏思. 面向联机手写中文数据的深度生成算法研究[D],2024.
Files in This Item:
File Name/Size: 任敏思_毕业论文.pdf (7529KB)
DocType: Thesis
Access: Restricted access
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.