Knowledge Commons of Institute of Automation, CAS
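Handwritten Mathematical Expression Recognition Based on Deep Structured Learning (基于深度结构化学习的手写数学公式识别) | |
Wu Jinwen (吴金文) | |
2021-12 | |
Pages | 124 |
Degree type | Doctoral |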
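Chinese abstract (translated) | The recognition of handwritten mathematical expressions is of great significance to fields such as education, science communication, and automation. Compared with general text recognition or image recognition problems, handwritten mathematical expressions have complex layouts and diverse content. Symbol detection, symbol classification, and structural relation reasoning for handwritten mathematical expressions are therefore all highly challenging. This thesis studies the recognition and structural parsing of handwritten mathematical expressions, proposes several effective models and methods based on the ideas of deep learning and structured learning, and achieves excellent performance in handwritten mathematical expression recognition experiments. The main contributions of the thesis are as follows: 2. A handwritten mathematical expression recognition method based on a pre-aware unit is proposed. When handling similar-looking symbols or complex structures, attention-based implicit segmentation models often over-attend or under-attend to a symbol, so that symbols are recognized repeatedly or lost during recognition. To address this problem, the method designs a decoder based on a pre-aware unit that embeds the spatial information of the symbol reading process into the attention mechanism, so that the recognizer can accurately learn, in parallel, the visual and semantic correspondence of each symbol. Experiments show that the method effectively improves the accuracy of handwritten mathematical expression recognition.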
|
English abstract | The recognition of handwritten mathematical expressions is important to the fields of education, science, and office automation. Compared with other visual recognition tasks, such as text recognition and image classification, handwritten mathematical expressions have more complex layouts and more diverse writing styles. Hence, symbol detection and recognition, as well as structure analysis, pose great challenges for handwritten mathematical expressions. This thesis studies the recognition and structure analysis of handwritten mathematical expressions. Taking advantage of deep learning and structured learning, it proposes several effective methods for handwritten mathematical expression recognition (HMER) and achieves superior performance on public datasets. The main contributions are as follows:
1. An HMER method based on paired adversarial learning is proposed. The method parses the visual representation of the input formula into LaTeX markup with an attention-based neural decoder. During training, it uses standard printed images of mathematical formulas as templates and guides the model to attend to the corresponding symbols in the handwritten formulas and the printed templates, so that adversarial learning yields semantic-invariant features of math symbols and improves robustness to variation in writing style. Experimental results show that the method performs competitively on public datasets.
2. An HMER method based on a pre-aware unit is proposed. It addresses the problem that, for similar symbols or complex structures, attention-based implicit symbol segmentation models tend to over-attend or under-attend to some symbols, so that those symbols are duplicated or lost. The proposed method designs a decoder based on a pre-aware unit, which embeds the spatial information of the symbols already read into the attention mechanism, so that the recognizer can accurately learn the visual and semantic correspondence of each symbol in parallel. Experimental results show that the method improves HMER accuracy.
3. A graph-to-graph generation method is proposed for HMER and structure analysis. In this method, both the input handwritten expression and the output markup are formulated as graphs. The model explores the hierarchical structure, which enables symbol segmentation and relation parsing, and it can be learned end-to-end. Experimental results show that the method significantly improves recognition accuracy on several public datasets and explicitly segments the mathematical symbols in online handwritten mathematical formulas. The method also extends to offline mathematical formulas.
4. A graph-to-graph learning method based on weakly supervised symbol prototypes is proposed for online HMER. To reduce the reliance of graph-to-graph generation model learning on large datasets with symbol-level annotations, the proposed method first learns symbol prototypes on a closed set of math symbols, and then learns symbol segmentation and structure from data labeled only at the formula level. Experimental results show that the method achieves competitive results on multiple public datasets. |
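Keywords | handwritten mathematical expression recognition; paired adversarial learning; pre-aware unit; graph-to-graph generation; symbol prototype |
Language | Chinese |
Document type | Thesis |
Item identifier | http://ir.ia.ac.cn/handle/173211/47472 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Pattern Analysis and Learning |
Corresponding author | Wu Jinwen (吴金文) |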
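Recommended citation (GB/T 7714) | Wu Jinwen. Handwritten Mathematical Expression Recognition Based on Deep Structured Learning [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2021. |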
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | ||
132839414497723750.p(4312KB) | Thesis | | Open access | CC BY-NC-SA |