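Thesis Advisor: 张树武
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Beijing
Degree Discipline: Computer Technology
Keywords: page skew detection, page segmentation, region classification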
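The goal of this thesis is to study and implement a layout analysis system for documents. When a user inputs a document image, the system analyzes it fully automatically, without human intervention, and produces a good layout analysis result. The system must balance accuracy and speed, and must adapt to document images of different sizes and resolutions.
1. For heavily soiled or poorly lit documents, where global thresholding gives poor results, a local adaptive thresholding binarization algorithm is applied. Experiments show that local adaptive thresholding binarizes well and handles this type of document effectively;
2. For skew in different types of pages, where angle detection with the traditional Hough transform is not precise enough, this thesis improves a method that combines LSD (line segment detection), KNN (K-nearest neighbor) clustering and DFT (discrete Fourier transform) to detect page skew. Experiments show that the method is robust, fast and precise;
3. Segmentation accuracy is low on complex non-Manhattan layouts, and most segmentation algorithms pay too little attention to classifying non-text elements. To address this, for the page segmentation and region type recognition stages, this thesis improves the smallest homogeneous region algorithm (currently a very good layout analysis algorithm, which uses multi-level classification and performs classification and segmentation simultaneously). Experimental results show that for simple academic paper layouts, and especially for complex magazine layouts, the system separates independent regions of different types and identifies them fairly accurately and quickly.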
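The local adaptive thresholding in item 1 can be illustrated with a minimal sketch. The abstract does not say which local method the thesis uses, so the version below is an assumption: a plain local-mean threshold, computed with an integral image. The function name `adaptive_threshold` and the parameters `window` and `offset` are hypothetical and do not come from the thesis.

```python
def adaptive_threshold(gray, window=15, offset=10):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    Assumed rule (not taken from the thesis): a pixel is ink (1) when it is
    darker than the mean of its local window by more than `offset`.
    """
    h, w = len(gray), len(gray[0])

    # Integral image with a zero border:
    # integral[y][x] holds the sum of gray[0:y][0:x].
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            # Sum of the window, clipped at the image border.
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Same test as gray < local_mean - offset, kept in integers.
            if gray[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out
```

Because each pixel is compared against its own neighborhood, a page whose background brightness drifts from one side to the other can still be binarized, which is the case where a single global threshold tends to fail.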
Other Abstract: Digitization of paper documents is widely used in office automation, digital libraries, industrial automation and other fields. Because manual processing is time-consuming, an automatic approach is needed. This thesis aims to study and implement an automated document analysis system.
The purpose of this thesis is to study and implement a document layout analysis system. When the user inputs a document image, the system should analyze it automatically, without any human intervention, and return a good layout analysis result. The system should handle document images of various sizes, types and resolutions, while balancing accuracy and speed.
The main steps of the system are: page skew detection and correction, image binarization, edge noise removal, page segmentation, and region type recognition.
In summary, the key proposed algorithms are:
1. For heavily soiled or poorly illuminated documents, where global thresholding gives poor results, a local adaptive threshold binarization algorithm is applied. Experiments show that local adaptive threshold binarization produces good results;
2. For skew in different types of pages, where angle detection with the traditional Hough transform is not accurate enough, this thesis improves a method that combines LSD (line segment detection), KNN (K-nearest neighbor) clustering and DFT (discrete Fourier transform) to detect the skew of the page. Experiments show that this method is robust, fast and precise;
3. Segmentation accuracy is low on complex non-Manhattan layouts, and most segmentation methods do not pay enough attention to classifying non-text elements. To address this, for page segmentation and region classification, this thesis improves the smallest homogeneous region algorithm (currently a very good page analysis algorithm, which uses multi-level classification and performs classification and segmentation simultaneously). The results show that for simple academic paper layouts, and especially for complex magazine pages, the method is accurate, robust and fast.
Keywords: skew detection, page segmentation, region classification
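As a rough illustration of the skew detection idea in item 2, the sketch below estimates a page's skew angle from a set of line segments. It is not the thesis's method: the segments are assumed to come from a line segment detector such as LSD, a simple length-weighted histogram vote stands in for the KNN clustering (whose details the abstract does not give), and the DFT step is omitted. The names `segment_angle`, `estimate_skew` and `bin_width` are hypothetical.

```python
import math


def segment_angle(x1, y1, x2, y2):
    """Angle of a segment in degrees, folded into [-45, 45).

    Folding makes near-horizontal and near-vertical segments (text lines,
    rules, column borders) vote for the same page skew.
    """
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return (angle + 45.0) % 90.0 - 45.0


def estimate_skew(segments, bin_width=1.0):
    """Estimate skew in degrees from (x1, y1, x2, y2) segments.

    Segments vote into angle bins weighted by their length; the result is
    the length-weighted mean angle of the heaviest bin. The sign follows
    the coordinate system of the input segments.
    """
    votes = {}
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # a degenerate segment has no direction
        angle = segment_angle(x1, y1, x2, y2)
        key = math.floor(angle / bin_width)
        total_len, weighted = votes.get(key, (0.0, 0.0))
        votes[key] = (total_len + length, weighted + length * angle)
    if not votes:
        raise ValueError("no usable segments")
    # Pick the bin with the greatest accumulated segment length.
    total_len, weighted = max(votes.values())
    return weighted / total_len
```

The sketch assumes a small skew: angles close to ±45 degrees wrap around under the folding, and angles that fall on a bin boundary may split their votes between two bins.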
Document Type: Thesis
Recommended Citation
GB/T 7714
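郜巍 (Gao Wei). 文档版面分析技术研究与系统实现 [Research and System Implementation of Document Layout Analysis Technology] [D]. Beijing: University of Chinese Academy of Sciences, 2016.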
Files in This Item:
File Name/Size: 2013E8014661090-郜巍-工 (5007KB)
DocType: Thesis
Access: Not yet open
License: CC BY-NC-SA
