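Research on Pedestrian Segmentation and Recognition (行人分割与识别研究)
Song Chunfeng (宋纯锋)
Subtype: Doctoral
Thesis Advisor: Wang Liang (王亮)
2019-12
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Name: Doctor of Engineering
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: pedestrian image segmentation, pedestrian recognition, gait recognition, weakly supervised learning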
Abstract

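With the growing national and societal demand for large-scale intelligent video analysis and individual identification, long-range identity recognition has attracted many researchers. The identification technologies in common use today are mainly fingerprint, face, and iris recognition. Thanks to their high accuracy, they have been widely deployed in tasks such as financial identity verification, human-computer interaction, and access control and attendance. However, they all require the active cooperation of the subject and usually place strict limits on recognition distance, viewing angle, and environment, so they struggle to identify pedestrians at long range, in arbitrary poses, from arbitrary viewpoints, and without cooperation. The algorithms proposed in this thesis, based on pedestrian image segmentation and pedestrian identification, offer an approach to these problems; the thesis studies key issues including image segmentation under weak supervision and pedestrian recognition based on pedestrian image segmentation. The work emphasizes practical usage scenarios, taking into account the difficulty of obtaining data, algorithm efficiency, and environmental diversity, and designs and develops a pedestrian recognition scheme that can be applied in real surveillance environments. Specifically, the thesis investigates the following problems: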

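(1) Given the dependence of image segmentation models on annotated data, a weakly supervised pedestrian image segmentation solution is explored for data-limited conditions. Pedestrian image segmentation has made a series of advances with the help of fully convolutional segmentation networks. These models usually rely on costly and time-consuming pixel-level manual annotations for training, which prevents rapid deployment. To address this, using cheap detection boxes to guide segmentation learning in a weakly supervised manner is a good option. The thesis first proposes using detection boxes to build class-sensitive object masks that filter out background irrelevant to the class, which helps the subsequent weakly supervised segmentation learning. Building on this, the thesis finds that the pixel filling rate of the foreground object within the detection boxes of each class is statistically consistent, and uses it as prior information to guide the segmentation model to dynamically select the most confident regions. Extensive experiments on the PASCAL VOC 2012 dataset show that the method is effective and achieves state-of-the-art performance. With this method, pedestrian image segmentation in data-limited scenarios can approach the performance of fully supervised models.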
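The box-driven, class-sensitive masking step can be illustrated with a minimal sketch. The function names `class_masks_from_boxes` and `apply_mask`, the `(class_id, x0, y0, x1, y1)` box layout, and the exclusive upper bounds are assumptions made for this illustration and are not taken from the thesis; the thesis applies such masks inside a segmentation network, which this sketch does not model.

```python
def class_masks_from_boxes(height, width, boxes, num_classes):
    """Build one binary mask per class from detection boxes.

    boxes: iterable of (class_id, x0, y0, x1, y1), with x1 and y1 exclusive
    (a hypothetical layout chosen for this sketch).
    """
    masks = [[[0] * width for _ in range(height)] for _ in range(num_classes)]
    for class_id, x0, y0, x1, y1 in boxes:
        # Clip the box to the image so out-of-range boxes are handled safely.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                masks[class_id][y][x] = 1
    return masks


def apply_mask(feature_map, mask):
    """Zero out feature values that fall outside the class mask."""
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature_map, mask)]


# One box of class 1 covering columns 1-2 of rows 0-1 in a 3x4 image.
masks = class_masks_from_boxes(3, 4, [(1, 1, 0, 3, 2)], num_classes=2)
filtered = apply_mask([[5, 5, 5, 5]] * 3, masks[1])
```

Pixels outside every box of a class are set to zero for that class, so background that cannot belong to the class no longer contributes to its response.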

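(2) To reduce the influence of background noise on pedestrian recognition, a person re-identification method based on pedestrian segmentation is studied. Person re-identification is a challenging, classic computer vision task. Pedestrian images captured by cameras usually contain cluttered backgrounds, and the pedestrians appear in widely varying poses and viewpoints; the difficulties caused by this diversity had not been well resolved in previous work. The thesis introduces binary pedestrian segmentation silhouettes as an additional input, combines them with the color image into a new four-channel input, and then designs a contrastive attention model based on the segmentation silhouettes to learn background-independent pedestrian features. Building on this, the thesis proposes a region-level triplet loss that separately constrains the features from the full image, the body region, and the background region, ultimately removing the effect of the background. The proposed method is validated on three person re-identification datasets, MARS, Market-1501, and CUHK03, and achieves state-of-the-art performance.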
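As a rough illustration of the four-channel input, the sketch below appends a binary silhouette value to every RGB pixel and also derives body-only and background-only images from the same mask. The helper names `stack_rgb_and_mask` and `split_regions` and the nested-list image layout are assumptions made for this example; the thesis does not specify how the channels are assembled.

```python
def stack_rgb_and_mask(rgb, mask):
    """Combine an H x W image of (r, g, b) tuples with an H x W 0/1 mask
    into an H x W image of (r, g, b, m) tuples."""
    if len(rgb) != len(mask) or any(len(a) != len(b) for a, b in zip(rgb, mask)):
        raise ValueError("image and mask must have the same height and width")
    return [[(r, g, b, m) for (r, g, b), m in zip(rgb_row, mask_row)]
            for rgb_row, mask_row in zip(rgb, mask)]


def split_regions(rgb, mask):
    """Return (body, background) images by keeping or suppressing masked pixels."""
    body = [[tuple(v * m for v in px) for px, m in zip(row, m_row)]
            for row, m_row in zip(rgb, mask)]
    background = [[tuple(v * (1 - m) for v in px) for px, m in zip(row, m_row)]
                  for row, m_row in zip(rgb, mask)]
    return body, background


image = [[(10, 20, 30), (40, 50, 60)]]
silhouette = [[1, 0]]
four_channel = stack_rgb_and_mask(image, silhouette)
body, background = split_regions(image, silhouette)
```

The body and background images are the kind of region inputs from which region-specific features could then be extracted.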

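(3) Combining mature pedestrian segmentation methods, an integrated method for gait segmentation and recognition is proposed. Current gait recognition methods usually consist of several manually configured steps, including image segmentation, gait template generation, feature extraction, and metric learning. On the one hand, these rigid operations may discard useful features such as texture and temporal information in gait; on the other hand, the chain of steps carries redundant information and accumulates errors from the intermediate steps. An automatic, carefully designed end-to-end learning framework is therefore needed to learn gait features from raw gait images. The thesis proposes an integrated gait segmentation and recognition model based on convolutional neural networks, in which the gait segmentation part contains multiple channels, each responsible for segmenting one gait frame; the outputs of these channels are fed directly to gait recognition, and the recognition and segmentation constraints act jointly on the whole model. On three gait databases (CASIA-B, SZU RGB-D, and the newly built Outdoor-Gait), the proposed method outperforms current mainstream gait recognition methods, which verifies its effectiveness.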
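One way to picture the segmentation and recognition constraints acting on the model together is a single objective that sums a per-frame segmentation loss and an identity loss. The sketch below is only an assumption-laden illustration: the choice of binary cross-entropy for segmentation, cross-entropy for recognition, the `seg_weight` balancing factor, and the function names are all hypothetical and are not stated in the thesis.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over the pixels of one frame (flat lists)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def joint_loss(seg_preds, seg_targets, id_probs, id_label, seg_weight=1.0):
    """Recognition loss plus weighted segmentation loss averaged over frames.

    seg_preds / seg_targets: one flat list of pixel values per gait frame.
    id_probs: predicted identity probabilities; id_label: true identity index.
    """
    seg = sum(binary_cross_entropy(p, t)
              for p, t in zip(seg_preds, seg_targets)) / len(seg_preds)
    rec = -math.log(max(id_probs[id_label], 1e-7))
    return rec + seg_weight * seg


# Two frames with two pixels each, and a two-identity classifier output.
loss = joint_loss([[0.5, 0.5], [0.5, 0.5]], [[1, 0], [0, 1]], [0.5, 0.5], 0)
```

Because both terms depend on the same network outputs, minimizing the sum lets recognition errors influence how each frame is segmented, rather than treating segmentation as a fixed preprocessing step.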

Other Abstract

With the increasing demand from the state and society for large-scale intelligent video analysis and individual identification, research on long-distance human identification has attracted many researchers. At present, the most commonly used identification technologies are fingerprint recognition, face recognition, and iris recognition. Owing to their high accuracy, these technologies have been widely used in financial authentication, human-computer interaction, access control, and other tasks. However, they require the active cooperation of the person being identified and generally impose strict restrictions on recognition distance, viewpoint, and environment. It is therefore difficult for them to identify pedestrians at long distance, in arbitrary poses, from arbitrary viewpoints, and without the subject's cooperation. To address these problems, this thesis proposes a pedestrian identification method based on human image segmentation and studies key problems such as weakly supervised image segmentation and pedestrian recognition based on human image segmentation. The work focuses on practical scenarios for pedestrian recognition, taking into account the difficulty of data supply, algorithm efficiency, and environmental diversity, and designs and develops a pedestrian recognition scheme that can be applied in real surveillance environments. In summary, this thesis focuses on the following aspects:

1) Considering the dependence of image segmentation models on annotated data, a weakly supervised human image segmentation method is explored under limited-data conditions. With the help of Fully Convolutional Networks (FCNs), human image segmentation has made considerable progress. These fully convolutional segmentation models usually rely on expensive and time-consuming pixel-level manual annotation, which prevents them from being deployed quickly. To overcome this problem, cheap detection boxes can be used to guide segmentation learning under weak supervision. This thesis first proposes a box-driven class-wise masking module that filters out background irrelevant to the category, which helps the subsequent weakly supervised segmentation learning. In addition, the thesis finds that the pixel filling rate of the foreground target within the detection boxes of each category is statistically consistent, and this can serve as prior information to guide the segmentation model to dynamically select the regions with the highest confidence. Extensive experiments on the PASCAL VOC 2012 dataset show that the proposed method is effective and achieves state-of-the-art performance. The method can be used to segment pedestrian images even when only limited data is available.
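The filling-rate prior can be pictured as follows: if roughly a fixed fraction of the pixels inside a box of a given category belong to the foreground, then only that fraction of the most confident pixels needs to be trusted. The sketch below is a simplified illustration under that reading; the function name `select_confident_pixels`, the nested-list score layout, and passing the filling rate in as a ready-made number are assumptions, and the thesis's actual selection rule may differ.

```python
def select_confident_pixels(scores, fill_rate):
    """Keep the top `fill_rate` fraction of pixels in one detection box.

    scores: 2D list of foreground confidences for the pixels inside the box.
    fill_rate: assumed class-level prior in [0, 1] (hypothetical input).
    Returns a binary mask of the same shape as `scores`.
    """
    if not 0.0 <= fill_rate <= 1.0:
        raise ValueError("fill_rate must lie in [0, 1]")
    ranked = sorted(
        ((s, r, c) for r, row in enumerate(scores) for c, s in enumerate(row)),
        key=lambda item: item[0],
        reverse=True,
    )
    keep = int(fill_rate * len(ranked))  # number of pixels to trust
    mask = [[0] * len(row) for row in scores]
    for _, r, c in ranked[:keep]:
        mask[r][c] = 1
    return mask


# A 2x2 box where half of the pixels are expected to be foreground.
confident = select_confident_pixels([[0.9, 0.2], [0.6, 0.1]], fill_rate=0.5)
```

Only the two highest-scoring pixels are kept here, so the lower-confidence pixels would not be used as foreground supervision.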

2) To reduce the influence of background noise in pedestrian images, a segmentation-based pedestrian recognition method is proposed. Person re-identification is a challenging, classic computer vision task. Person images captured by cameras usually have cluttered backgrounds, and the persons in them appear in a variety of poses and viewpoints. These difficulties have not been well addressed in previous studies. In this thesis, the binary human segmentation mask is introduced as an additional input and combined with the RGB image to form a new 4-channel input. A mask-guided contrastive attention model is then proposed to learn background-invariant person features. In addition, a region-level triplet loss is proposed to constrain the features of the full image, the body region, and the background region. The proposed method is evaluated on MARS, Market-1501, and CUHK03, achieving state-of-the-art performance.
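A region-level triplet loss over full-image, body, and background features might look like the sketch below. It assumes the full-image feature acts as the anchor, the body feature as the positive, and the background feature as the negative, with a squared Euclidean distance and a margin of 1.0; these roles, the distance, the margin, and the function names are assumptions made for illustration, since the abstract does not spell them out.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def region_triplet_loss(full, body, background, margin=1.0):
    """Hinge-style triplet loss: pull the full-image feature toward the body
    feature and push it away from the background feature."""
    positive = squared_distance(full, body)
    negative = squared_distance(full, background)
    return max(0.0, positive - negative + margin)


# Full-image feature already matches the body feature: no penalty.
good = region_triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Full-image feature matches the background instead: large penalty.
bad = region_triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

Under these assumptions the loss is zero once the full-image feature is closer to the body feature than to the background feature by at least the margin, which is one way to encourage background-invariant features.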

3) Building on the human segmentation method, this thesis proposes a unified framework that jointly learns gait segmentation and recognition. Current gait recognition pipelines usually consist of several separate steps, such as image segmentation, gait template generation, feature extraction, and metric learning. On the one hand, these rigid operations may lose useful features, such as the texture and temporal information in gait. On the other hand, errors from the intermediate steps accumulate along the pipeline. An end-to-end learning framework is therefore needed to learn gait features directly from the original gait images. To this end, the thesis proposes a unified framework with gait segmentation and recognition modules. The proposed method is evaluated on three gait databases: CASIA-B, SZU RGB-D Gait, and the newly established Outdoor-Gait. The experimental results show that the proposed method is effective and outperforms the compared methods.

Pages: 124
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/28371
Collection: Graduates_Doctoral Theses (毕业生_博士学位论文)
Recommended Citation
GB/T 7714
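Song Chunfeng. Research on Pedestrian Segmentation and Recognition [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2019.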
Files in This Item:
File Name/Size | DocType | Version | Access | License
Thesis_宋纯锋_2019.pdf (9416KB) | Thesis | | Not yet open | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.