Research on Photo-Based Recognition of Air Ticket Itineraries

	Research on Photo-Based Recognition of Air Ticket Itineraries
	蒋经中 (Jiang Jingzhong)
	2021-11-26
Pages	83
Degree type	Master's
Chinese abstract	随着科技的发展进步以及综合国力的不断提升，我国民用航空运输行业获得了巨大的发展机遇和发展空间，也取得了重大的发展成就。改革开放以来，人民生活水平日渐提高，飞机成为人们出行越来越重要的交通工具。航空运输电子客票行程单（本文称为“机票行程单”）作为一种运输合同，是旅客购买航空运输电子客票的付款及报销凭证。2008年7月1日新版行程单正式启用，然而由于某些方面的原因，目前机票行程单的报销业务仍然处于从纸质化向电子化的过渡阶段。本文旨在通过拍照技术获取的机票行程单照片，利用深度学习技术实现机票行程单关键信息识别及结构化输出，从而实现纸质版机票行程单自动化、智能化的财务报销，进一步推动报销业务的电子化。深度学习技术的发展为我们实现机票行程单拍照图片的识别提供了有力的研究工具，然而由于机票行程单自身所具有的复杂性，以及拍照取像过程中所引入的复杂性，使得对票面关键文字信息的检测与识别，以及后续的识别结果结构化输出均面临较大的挑战。同时，本文所提出的模型是否足够轻量，从而使得其能较好地部署在移动端及嵌入式设备，也是我们需要重点考虑的问题。针对上述问题，本文进行了一系列的实验研究，主要内容如下： 1. 为了避免票面套打内容及票面以外的背景文字信息的干扰，本文设计了定位块检测模块，实现对票据的框取，为票面关键文字信息的检测和关键信息识别结果的结构化输出提供坐标约束，提升模型的检测和识别性能。该方法基于性能优秀的YOLOv5算法实现。 2. 针对机票行程单底版信息的复杂性，本文实现了一种关注内容的轻量化文字检测方法。该方法基于EAST算法，采用了轻量级的FairNAS-B作为主干网络，并在特征融合层进行了相应的改进。同时，为了消除小样本、样本不均衡问题，本文采用了数据增广策略来增加训练样本的多样性。实验证明该方法在模型大小和检测性能上都具有较好的优势。 3. 为了实现机票行程单票面关键文字信息的识别，同时考虑到模型的轻量化要求，本文在文字检测的基础上，设计并实现了一种基于CNN的单字识别模型。实验通过单字生成的方式进行数据增广，并用于模型训练。实验证明，该模型在识别精度、识别速度和模型大小方面都具有一定的优势。 4. 为了将票面关键文字信息对应到机票行程单上的相关条目，本文结合文字识别的结果，利用定位块提供区域约束，减少了关键信息与条目对应错误的情况发生。并将前文的定位块检测模型、单字检测模型、单字识别模型进行集成，实现了一种用于机票行程单拍照图片关键信息结构化输出的应用程序。该应用程序能够有效实现机票行程单从拍照获取到关键信息结构化输出的功能。
English abstract	With advances in science and technology and the steady growth of China's comprehensive national strength, the country's civil air transportation industry has gained substantial opportunities and room for development and has achieved major results. Since the reform and opening up, living standards have risen and air travel has become an increasingly important mode of transportation. The air transport e-ticket itinerary (referred to as the "ticket itinerary" in this thesis) is a form of transportation contract and serves as the passenger's proof of payment and reimbursement voucher for an air transport e-ticket. A new version of the itinerary came into use on July 1, 2008; however, for various reasons, reimbursement of ticket itineraries is still in transition from paper to electronic processing. This thesis takes photographs of ticket itineraries and applies deep learning to recognize their key information and output it in structured form, so that paper itineraries can be reimbursed automatically and intelligently, further promoting electronic reimbursement. Deep learning provides powerful tools for recognizing photographed itineraries. However, the complexity of the itinerary itself, together with the complexity introduced by photographing it, makes it challenging to detect and recognize the key text on the ticket face and to structure the recognition results afterwards. Whether the proposed models are lightweight enough to deploy well on mobile and embedded devices is a further key concern.
To address these problems, this thesis carries out a series of experimental studies. The main contributions are as follows:
1. To avoid interference from the pre-printed content on the ticket face and from background text outside the ticket, this thesis designs a positioning-block detection module that frames the ticket. It provides coordinate constraints both for detecting the key text on the ticket face and for structuring the recognition results, which improves the detection and recognition performance of the model. The method is built on the YOLOv5 algorithm.
2. To handle the complexity of the itinerary's background template, this thesis implements a lightweight, content-focused text detection method. The method is based on the EAST algorithm, uses the lightweight FairNAS-B as its backbone network, and modifies the feature fusion layer accordingly. To counter the small-sample and sample-imbalance problems, it adopts a data augmentation strategy that increases the diversity of the training samples. Experiments show that the method compares favorably in both model size and detection performance.
3. To recognize the key text on the ticket face while meeting the lightweight requirement, this thesis designs and implements a CNN-based single-character recognition model on top of the text detection stage. Training data are augmented by generating single-character samples. Experiments show that the model has advantages in recognition accuracy, recognition speed, and model size.
4. To map the key text on the ticket face to the corresponding items of the ticket itinerary, this thesis combines the text recognition results with the regional constraints provided by the positioning blocks, which reduces mismatches between key information and items. The positioning-block detection model, the single-character detection model, and the single-character recognition model described above are then integrated into an application that outputs structured key information from photographs of ticket itineraries. The application covers the full workflow from photo capture to structured output of the key information.
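Keywords	Air ticket itinerary + text detection + text recognition + data augmentation + lightweight model + structured data output
Language	Chinese
Seven major directions: sub-direction classification	Text recognition and document analysis
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/46630
Collection	State Key Laboratory of Management and Control for Complex Systems_Image Analysis and Machine Vision
Recommended citation (GB/T 7714)	蒋经中. 机票行程单拍照识别技术研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.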

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis_蒋经中-签名版论文.pdf (4299 KB)	Thesis		Open access	CC BY-NC-SA
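
Illustrative sketch (not taken from the thesis): the fourth contribution in the abstract uses the detected ticket frame as a regional constraint when assigning recognized text to itinerary items. One simple way to picture this is to express each recognized text box in coordinates normalized to the ticket frame and to assign it to the item whose region contains the box center. The `Box` class, the `structure_fields` function, the field names, and every region coordinate below are hypothetical values chosen for illustration; the thesis's actual layout and matching rules may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (x0, y0) to (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)

    def contains(self, point):
        x, y = point
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def to_ticket_coords(box, ticket):
    """Re-express a pixel-space box relative to the ticket frame, scaled to [0, 1]."""
    w = ticket.x1 - ticket.x0
    h = ticket.y1 - ticket.y0
    return Box((box.x0 - ticket.x0) / w, (box.y0 - ticket.y0) / h,
               (box.x1 - ticket.x0) / w, (box.y1 - ticket.y0) / h)


def structure_fields(ticket, detections, field_regions):
    """Assign recognized text to fields using the ticket frame as a constraint.

    ticket:        Box of the detected ticket frame, in image pixels.
    detections:    list of (Box, text) pairs, in image pixels.
    field_regions: dict of field name -> Box in normalized ticket coordinates.
    Assumes each field is a single text line read left to right.
    """
    collected = {name: [] for name in field_regions}
    for box, text in detections:
        rel = to_ticket_coords(box, ticket)
        for name, region in field_regions.items():
            if region.contains(rel.center()):
                collected[name].append((rel.x0, text))
                break
        # Text whose center lies in no region (background, pre-printed
        # labels) is simply dropped.
    return {name: "".join(text for _, text in sorted(items))
            for name, items in collected.items()}


# Hypothetical layout: regions are made-up fractions of the ticket frame.
FIELD_REGIONS = {
    "passenger_name": Box(0.05, 0.10, 0.40, 0.20),
    "total": Box(0.60, 0.70, 0.95, 0.80),
}

ticket = Box(100, 50, 1100, 550)  # hypothetical detected ticket frame
detections = [
    (Box(240, 110, 270, 140), "三"),       # second character of the name
    (Box(200, 110, 230, 140), "张"),       # first character of the name
    (Box(800, 410, 900, 440), "1280.00"),
    (Box(20, 10, 80, 40), "noise"),        # text outside the ticket frame
]
fields = structure_fields(ticket, detections, FIELD_REGIONS)
print(fields)
```

Normalizing to the ticket frame is what makes the regions independent of where the ticket sits in the photo and how large it appears; a real system would also need to correct rotation and perspective before this step, which the sketch leaves out.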