Transformer-Based Geometric Primitive Detection and Analysis | |
周威 | |
2024-07-05 | |
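Pages | 80 |
Degree type | Master's |
Chinese abstract | Geometric primitive analysis has important application value in fields such as flowchart recognition and analysis. However, due to |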
English abstract | Geometric primitive analysis has significant application value in fields such as flowchart recognition and analysis. However, it remains a research challenge because geometric primitives are complex and varied, and their parameters are difficult to represent and optimize. Geometric primitive detection and geometric primitive relationship analysis are the two central tasks. Detection aims to identify the categories and positions of primitives, while relationship analysis recognizes the relationships between primitives, such as connections. Some current detection methods extend classical object detectors, which represent objects with rectangular boxes. Rectangular boxes cannot represent geometric primitives accurately, so the resulting primitive parameters are imprecise. In addition, some existing methods for analyzing relationships between objects use complex model structures, which makes training difficult and requires additional complex post-processing. To address these issues, this thesis studies the detection of geometric primitives in flowcharts and the identification of the connection relationships between them. The main contents and contributions are as follows: (1) A general geometric primitive representation and detection scheme for flowchart analysis is proposed, and a geometric primitive dataset for regular flowcharts is constructed. To address the inaccurate representation of geometric primitives by rectangular boxes, this thesis presents a general representation method based on multiple keypoint sequences, which describes the shapes of various types of primitives more precisely. 
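To illustrate why a rectangular box can misrepresent a primitive, the following minimal sketch compares the area enclosed by an ordered keypoint sequence with the area of its axis-aligned bounding box. The function names and the diamond coordinates are hypothetical and are not taken from the thesis; the sketch only shows the general idea of describing a shape by its keypoints.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(keypoints: List[Point]) -> float:
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    n = len(keypoints)
    for i in range(n):
        x1, y1 = keypoints[i]
        x2, y2 = keypoints[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_box_area(keypoints: List[Point]) -> float:
    """Area of the axis-aligned box that encloses all keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical decision node (diamond) described by four ordered keypoints.
diamond = [(1.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]

# Fraction of the bounding box that the primitive actually covers.
fill_ratio = polygon_area(diamond) / bounding_box_area(diamond)
print(fill_ratio)
```

For this diamond the keypoint polygon covers only half of its bounding box, so a box-based description overstates the extent of the shape.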
Furthermore, an effective detection scheme based on multiple keypoints is built on this representation. Because intersection-over-union (IoU) computed from bounding boxes does not accurately reflect the degree of overlap between primitives, a keypoint-based IoU calculation is introduced. It uses the polar coordinates of the keypoint sequences to compute the positional displacement between primitives, which reflects their overlap more faithfully. In addition, to address the lack of datasets for geometric primitive detection and relationship analysis in regular flowcharts, a geometric primitive dataset for flowcharts is constructed. The dataset comprises 8,000 machine-generated flowchart images covering nine categories of geometric primitives, with over 240,000 annotations, including keypoint positions and relationships between primitives. Experimental results show that the proposed general detection scheme based on multiple keypoint sequences improves geometric primitive detection performance. (2) A geometric primitive relationship analysis method based on adjacency matrix prediction is proposed. Because existing methods for analyzing relationships between objects are overly complex and require post-processing, this thesis presents a one-stage relationship analysis method based on adjacency matrix prediction. The analysis is modeled as a directed graph recognition problem in which nodes represent geometric primitives and directed edges represent the connection relationships between them. Analyzing the relationships between primitives thus becomes a problem of predicting the adjacency matrix of the graph. 
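The directed-graph formulation can be made concrete with a small sketch that turns a matrix of predicted connection scores into a list of directed edges. The function name, the 0.5 threshold, and the choice to ignore self-connections are assumptions made for illustration; the abstract does not describe how the thesis decodes its predictions.

```python
from typing import List, Tuple


def decode_edges(
    scores: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Read directed edges (source, target) from an N x N score matrix.

    scores[i][j] is the predicted confidence that primitive i connects to
    primitive j. Self-connections are skipped (an assumption of this sketch).
    """
    edges = []
    for src, row in enumerate(scores):
        for dst, score in enumerate(row):
            if src != dst and score >= threshold:
                edges.append((src, dst))
    return edges


# Hypothetical scores for three detected primitives.
scores = [
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.8],
    [0.1, 0.3, 0.0],
]

edges = decode_edges(scores)
print(edges)
```

Because the matrix is not required to be symmetric, an edge from primitive 0 to primitive 1 does not imply the reverse edge, which matches the directed connections found in flowcharts.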
Additionally, following the idea of task decoupling, this thesis introduces a dynamic relationship adjacency matrix prediction loss, which lets the model focus on geometric primitive detection early in training and shift its attention to relationship analysis later. Experimental results show that the proposed method effectively identifies the connection relationships between primitives. (3) A flowchart detection and reconstruction system is built. The system runs in mainstream browsers and offers cross-platform compatibility and user-friendly interaction. Specifically, this thesis builds a system with separated front end and back end on a browser/server architecture. The user interface is presented in the browser, while flowchart recognition and reconstruction run on the server, which reduces the performance requirements on user devices and improves the user experience. The system's demonstration results indicate that the proposed method has practical value in real-world scenarios. |
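The abstract does not specify how the dynamic relationship loss in contribution (2) is scheduled. One simple way to obtain the described behaviour, shown here purely as an assumption, is to scale the relationship loss by a weight that ramps linearly from 0 to 1 during the first part of training. The names `relation_loss_weight`, `total_loss`, and `warmup_fraction` are hypothetical.

```python
def relation_loss_weight(
    epoch: int, total_epochs: int, warmup_fraction: float = 0.5
) -> float:
    """Weight of the relationship loss at a given epoch.

    Ramps linearly from 0 to 1 over the first `warmup_fraction` of training
    and stays at 1 afterwards. This schedule is an illustrative assumption,
    not the schedule used in the thesis.
    """
    warmup_epochs = max(1, int(total_epochs * warmup_fraction))
    return min(1.0, epoch / warmup_epochs)


def total_loss(
    detection_loss: float, relation_loss: float, epoch: int, total_epochs: int
) -> float:
    """Detection loss plus the dynamically weighted relationship loss."""
    weight = relation_loss_weight(epoch, total_epochs)
    return detection_loss + weight * relation_loss


# Early epochs are dominated by detection; later epochs include the full
# relationship term.
for epoch in (0, 25, 50, 80):
    print(epoch, relation_loss_weight(epoch, total_epochs=100))
```

With this kind of schedule the relationship term contributes nothing at the first epoch, so gradients initially come only from detection, and it reaches full strength once the ramp ends.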
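Keywords | primitive detection; relationship analysis; keypoints; Transformer |
Language | Chinese |
Document type | Thesis |
Item identifier | http://ir.ia.ac.cn/handle/173211/58574 |
Collection | 毕业生_硕士学位论文 (Graduates: Master's Theses) |
Recommended citation (GB/T 7714) | 周威. 基于Transformer的几何基元检测与分析[D],2024. |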
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | ||
周威-学位论文-最终版.pdf(10295KB) | Thesis | | Restricted access | CC BY-NC-SA |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.