Knowledge Commons of Institute of Automation, CAS
Dynamic Layout Analysis of Online Handwritten Documents (联机手写文档的动态版面分析) | |
杨宇婷 | |
2022-05-29 | |
Pages | 87 |
Degree type | Master's |
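Abstract (translated from Chinese) | Online handwritten documents are an important type of media data, widely used in human-computer interaction, online education, office automation, and other fields. Layout analysis of online handwritten documents means dividing strokes into semantic categories such as text, formulas, tables, lists, flowcharts, and sketches. It is a basic task for document analysis systems that support free-form handwriting. Previous methods are essentially static: they rely on global context modeling and must wait for the user to finish the entire document before making predictions. In practice, however, it is more user-friendly to predict in real time while the user is writing. This thesis therefore studies dynamic layout analysis of online handwritten documents that mix text and graphics, with the goal of analyzing document content dynamically and in real time during handwriting input, providing a basis for dynamic recognition. The main research contents and results are as follows: |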
English abstract | Online handwritten documents are an important type of media data, used in human-machine interfaces, education, office automation, and so on. Layout analysis for online handwritten documents aims to divide strokes into several semantic categories, such as text, formula, table, diagram, and graph. It is an essential component of document analysis systems that support free writing. Previous methods rely on global context, so they are essentially static: they have to wait for the user to finish the whole document before making predictions. In practice, however, the more user-friendly approach is to make real-time predictions as the user writes. Therefore, this thesis studies dynamic layout analysis of online handwritten documents that mix images and text. The purpose is to analyze the document content dynamically during handwriting input, so as to provide a basis for dynamic recognition. The main research contents and results are as follows: 3. We propose a chunk-based streaming Transformer model for real-time document object classification of strokes in online handwritten documents. The method is based on the Transformer encoder and uses the attention mechanism to model stroke sequences. The input sequence is divided into chunks, and an effective attention mask strategy is applied to limit context information, so that the model considers only limited context. The method is evaluated on multiple online handwritten document datasets, and the results show that it achieves comparable performance with lower training memory requirements and faster inference than previous methods. |
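Keywords | online handwritten documents; layout analysis; dynamic stroke classification; graph neural network; Transformer |
Language | Chinese |
Document type | Thesis |
Item identifier | http://ir.ia.ac.cn/handle/173211/48860 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Pattern Analysis and Learning |
Recommended citation (GB/T 7714) | 杨宇婷. 联机手写文档的动态版面分析[D]. 自动化所. 自动化所,2022. |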
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | ||
杨宇婷毕业论文《联机手写文档的动态版面分(3190KB) | Thesis | | Open access | CC BY-NC-SA |
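
The English abstract describes dividing the stroke sequence into chunks and applying an attention mask so that the Transformer encoder sees only limited context. The abstract does not spell out the masking rule, so the following is only a minimal sketch of one common chunk-based scheme, not the thesis's actual strategy. It assumes each stroke may attend to every stroke in its own chunk and in a fixed number of preceding chunks. The names `chunk_attention_mask`, `chunk_size`, and `left_chunks` are hypothetical and do not come from the thesis.

```python
from typing import List


def chunk_attention_mask(seq_len: int, chunk_size: int, left_chunks: int) -> List[List[bool]]:
    """Build a boolean attention mask for chunk-based streaming.

    mask[i][j] is True when stroke i may attend to stroke j.
    Assumption (not from the thesis): stroke i sees its own chunk and up to
    `left_chunks` chunks immediately before it, and nothing after its chunk.
    """
    if seq_len < 0 or chunk_size <= 0 or left_chunks < 0:
        raise ValueError("seq_len >= 0, chunk_size > 0 and left_chunks >= 0 are required")

    mask: List[List[bool]] = []
    for i in range(seq_len):
        chunk_i = i // chunk_size  # index of the chunk holding the querying stroke
        row = []
        for j in range(seq_len):
            chunk_j = j // chunk_size  # index of the chunk holding the attended stroke
            # Allowed only if chunk_j lies in the window [chunk_i - left_chunks, chunk_i].
            row.append(chunk_i - left_chunks <= chunk_j <= chunk_i)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Six strokes, chunks of two strokes, one chunk of history.
    for row in chunk_attention_mask(seq_len=6, chunk_size=2, left_chunks=1):
        print("".join("x" if allowed else "." for allowed in row))
    # Prints:
    # xx....
    # xx....
    # xxxx..
    # xxxx..
    # ..xxxx
    # ..xxxx
```

Under this assumed scheme, the number of attended strokes per row is bounded by `chunk_size * (left_chunks + 1)` regardless of document length, which is one plausible reason a chunked mask could reduce memory use and allow prediction before the document is finished.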