Layout-Aware Single-Image Document Flattening
Li, Pu (1,2); Quan, Weize (1,2); Guo, Jianwei (1,2); Yan, Dong-Ming (1,2)
Journal: ACM Transactions on Graphics
2023-12-02
Volume: 43, Issue: 1, Pages: 9:1-17
Abstract

Single image rectification of document deformation is a challenging task. Although some recent deep learning-based methods have attempted to solve this problem, they cannot achieve satisfactory results when dealing with document images with complex deformations. In this article, we propose a new efficient framework for document flattening. Our main insight is that most layout primitives in a document have rectangular outline shapes, making unwarping local layout primitives essentially homogeneous with unwarping the entire document. The former task is clearly more straightforward to solve than the latter due to the more consistent texture and relatively smooth deformation. On this basis, we propose a layout-aware deep model working in a divide-and-conquer manner. First, we employ a transformer-based segmentation module to obtain the layout information of the input document. Then a new regression module is applied to predict the global and local UV maps. Finally, we design an effective merging algorithm to correct the global prediction with local details. Both quantitative and qualitative experimental results demonstrate that our framework achieves favorable performance against state-of-the-art methods. In addition, the current publicly available document flattening datasets have limited 3D paper shapes without layout annotation and also lack a general geometric correction metric. Therefore, we build a new large-scale synthetic dataset by utilizing a fully automatic rendering method to generate deformed documents with diverse shapes and exact layout segmentation labels. We also propose a new geometric correction metric based on our paired document UV maps. Code and dataset will be released at https://github.com/BunnySoCrazy/LA-DocFlatten.
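The role a UV map plays in flattening can be illustrated with a small sketch. The abstract does not spell out the UV convention or the resampling scheme, so the following assumes (hypothetically) that the UV map stores, for each pixel of the warped photo, the normalized (u, v) position of that point on the flat page, with `None` marking background. The function name `flatten_with_uv` and the nearest-neighbour scatter are illustrative choices, not the authors' implementation.

```python
def flatten_with_uv(image, uv_map, out_h, out_w, fill=0):
    """Scatter warped pixels to their flat-page positions.

    image  : list of rows of pixel values (the warped photo)
    uv_map : same shape as image; each entry is (u, v) in [0, 1]
             or None for background pixels (assumed convention)
    Returns an out_h x out_w grid; cells no pixel maps to keep `fill`.
    """
    flat = [[fill] * out_w for _ in range(out_h)]
    for y, pixels in enumerate(image):
        for x, value in enumerate(pixels):
            uv = uv_map[y][x]
            if uv is None:  # background: not part of the document
                continue
            u, v = uv
            # Convert normalized coordinates to the nearest output cell,
            # clamped to the grid bounds.
            col = min(out_w - 1, max(0, round(u * (out_w - 1))))
            line = min(out_h - 1, max(0, round(v * (out_h - 1))))
            flat[line][col] = value
    return flat


if __name__ == "__main__":
    # Toy "deformation": the photo shows the page mirrored left to right.
    warped = [[1, 2],
              [3, 4]]
    uv = [[(1.0, 0.0), (0.0, 0.0)],
          [(1.0, 1.0), (0.0, 1.0)]]
    print(flatten_with_uv(warped, uv, 2, 2))  # [[2, 1], [4, 3]]
```

A scatter like this can leave holes where no source pixel lands; practical pipelines typically resample through an inverse (backward) map with interpolation instead.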

Keywords: Document Image Rectification; Document Layout Analysis; Deep Neural Networks; Geometric Models
Discipline: Engineering::Computer Science and Technology (degrees awardable in Engineering or Science)
DOI: https://doi.org/10.1145/3627818
Indexed by: SCIE
Language: English
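Seven major directions, sub-direction classification: Computer Graphics and Virtual Reality
State Key Laboratory planned direction classification: Visual Information Processing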
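Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/57111
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_3D Visual Computing
Corresponding author: Guo, Jianwei
Author affiliations: 1. MAIS, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, UCAS
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation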
GB/T 7714
Li, Pu,Quan, Weize,Guo, Jianwei,et al. Layout-Aware Single-Image Document Flattening[J]. ACM Transactions on Graphics,2023,43(1):9: 1-17.
APA Li, Pu,Quan, Weize,Guo, Jianwei,&Yan, Dong-Ming.(2023).Layout-Aware Single-Image Document Flattening.ACM Transactions on Graphics,43(1),9: 1-17.
MLA Li, Pu,et al."Layout-Aware Single-Image Document Flattening".ACM Transactions on Graphics 43.1(2023):9: 1-17.
Files in this item
File name/size: 2023-TOG-Layout-Awar (3423KB); Document type: Journal article; Version type: Author accepted manuscript; Access type: Open access; License: CC BY-NC-SA
File name: 2023-TOG-Layout-Aware Single-Image Document Flatening.pdf
Format: Adobe PDF