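Research on Key Problems of Pedestrian Attribute Recognition and Re-identification in Surveillance Scenarios (监控场景下行人属性识别与再辨识关键问题研究)
Alternative title: Pedestrian Attribute Recognition and Re-identification in Surveillance Scenarios
Author: Li Dangwei (李党伟)
Degree type: Doctoral
Supervisor: Huang Kaiqi (黄凯奇)
2018-12
Degree-granting institution: University of Chinese Academy of Sciences
Place of degree conferral: Institute of Automation, Chinese Academy of Sciences
Discipline: Pattern Recognition and Intelligent Systems
Keywords: pedestrian attribute recognition; person re-identification; multi-label classification; intelligent video surveillance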
Abstract (translated from the Chinese)

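With society's growing demand for public safety and the deployment of large-scale surveillance camera networks, quickly finding pedestrians of interest in massive volumes of video has become an important and urgent problem for modern intelligent video analysis systems. As important research topics in computer vision, pedestrian attribute recognition and person re-identification complement each other, addressing attribute-based and image-based person retrieval respectively, and provide effective technical support for intelligent video analysis systems. Person retrieval research has advanced rapidly in recent years but still faces many challenges, such as imbalanced attribute data distributions, pose variation, background noise, scene differences, and occlusion, all of which make it difficult to learn pedestrian feature representations and attribute classifiers. This thesis focuses on two key problems in person retrieval, pedestrian attribute recognition and person re-identification, and carries out the following research in response to the main challenges they face: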

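  1. Studied a deep multi-attribute recognition model based on convolutional neural networks. In traditional pedestrian attribute recognition, pedestrian representations are usually built from hand-crafted features, and the relationships among different attributes are usually ignored. Because hand-crafted features cope poorly with real, complex surveillance scenes, this thesis proposes an end-to-end pedestrian attribute recognition model based on deep convolutional neural networks that jointly learns the pedestrian feature representation and a classifier for each attribute. In addition, to reduce model complexity and to better exploit the relationships among attributes so as to improve recognition of multiple attributes, this thesis proposes a deep multi-attribute recognition model in which different attributes share a feature representation and can reinforce one another during learning. Finally, this thesis proposes an improved cross-entropy loss to address the imbalanced distribution of pedestrian attributes during training. Experiments show that the proposed method has some advantage over traditional hand-crafted features, and that by exploiting the relationships among attributes and overcoming sample imbalance, the deep multi-attribute recognition model also achieves good results.
  2. Studied a pose-guided deep pedestrian attribute recognition model. Existing deep attribute recognition models usually formulate attribute recognition as an end-to-end multi-label classification problem, and few consider the relationship between pedestrian attributes and body structure. Yet the two are strongly correlated: wearing glasses, for example, can only appear on the head, so how to use prior knowledge of body structure to improve recognition of such local attributes is an important research question. To better exploit body structure information, this thesis proposes a pose-guided deep model for attribute recognition. The model finds the part regions associated with pose keypoints, learns a feature representation for each part region, and finally combines the representations of all part regions to recognize attributes. At test time, the attribute scores from the global body and from the pose-guided part regions are fused into the final recognition result. Experiments show that the pose-guided model recognizes local attributes well, and that the fused result performs well on mainstream datasets.
  3. Studied person re-identification based on deep context-aware features over the full body and latent parts. How to extract an effective pedestrian feature representation that is robust to pose variation, background noise, and similar factors is an important and actively studied problem in person re-identification.
    First, this thesis proposes a multi-scale context-aware deep network that strengthens the model's context awareness and, through layer-by-layer multi-scale fusion, learns fine-grained feature representations more effectively. Second, to cope with pose variation and background noise, this thesis proposes a latent part localization model that adaptively learns different body parts and then learns part-specific pedestrian representations. Finally, this thesis fuses the full-body and latent-part representations into the final feature representation for person re-identification. Experiments show that the latent-part representation can overcome pose variation and background noise to some extent, and that the fused representation achieves good results on mainstream datasets.
  4. Built a richly annotated person retrieval dataset. Pedestrian attribute recognition and person re-identification are two important and complementary branches of person retrieval, yet existing public datasets usually address only one of these subtasks. To better study person retrieval, this thesis builds a new dataset that simultaneously supports three tasks: pedestrian attribute recognition, attribute-based person retrieval, and image-based person retrieval; experimental baselines are provided for each subtask. In addition, for attribute recognition, this thesis proposes a retrieval-oriented evaluation metric that better measures the consistency of the predictions for different attributes on the same image. Using the dataset, this thesis explores several new research questions, including the relationship between pedestrian attribute recognition and person re-identification, and cross-day person re-identification.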
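
Item 1 mentions an improved cross-entropy loss for imbalanced attributes but the abstract does not give its form. The sketch below shows one plausible way to reweight a per-attribute binary cross-entropy by the positive ratio of each attribute, so that errors on rare labels cost more. The function name `weighted_bce` and the weights `exp(1 - r)` and `exp(r)` are illustrative assumptions, not the thesis's formulation.

```python
import math

def weighted_bce(probs, labels, pos_ratios):
    """Mean weighted binary cross-entropy over the attributes of one image.

    probs: predicted probability per attribute, each in (0, 1).
    labels: ground-truth 0/1 label per attribute.
    pos_ratios: fraction of positive training samples per attribute.
    Weights exp(1 - r) / exp(r) are an assumption made for illustration.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y, r in zip(probs, labels, pos_ratios):
        if y == 1:
            # Rare positives (small r) receive a larger weight.
            total += -math.exp(1.0 - r) * math.log(p + eps)
        else:
            # Rare negatives (large r) receive a larger weight.
            total += -math.exp(r) * math.log(1.0 - p + eps)
    return total / len(probs)

# Same prediction error, but the rare attribute is penalized more heavily.
rare = weighted_bce([0.3], [1], [0.1])
common = weighted_bce([0.3], [1], [0.9])
print(rare > common)
```

With a positive ratio of 0.5 the two weights coincide, so the loss reduces to ordinary cross-entropy scaled by a constant.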
English abstract

With the rapidly growing demand for public safety and the construction of large-scale surveillance camera networks, how to retrieve a person of interest from large volumes of video has become a key problem in intelligent video analysis systems. As important research topics in computer vision, pedestrian attribute recognition and person Re-IDentification (ReID) are complementary to each other: they address attribute-based and image-based person retrieval, respectively, and have become indispensable technologies in modern intelligent video analysis systems. Although person retrieval has made great progress in recent years, it still faces many challenges, such as imbalanced attribute distributions, pose variation, background clutter, and occlusion, all of which make it difficult to learn pedestrian representations and attribute classifiers. In this thesis, we focus on pedestrian attribute recognition and person ReID in person retrieval, and the contributions are as follows:

  1. Deep multi-attribute recognition model. Typical pedestrian attribute recognition methods adopt hand-crafted features and ignore the relationships among different attributes. Because hand-crafted features struggle with complex surveillance scenarios, we propose to use convolutional neural networks to learn the pedestrian features and the classifier for each attribute jointly. Furthermore, to better utilize the relationships among attributes, we propose the deep multi-attribute recognition model, in which the features are shared by all attributes so that one attribute can assist the learning of another. Finally, an improved cross-entropy loss is proposed to handle the imbalanced attribute distribution. Experimental results show that the proposed end-to-end models obtain better representations than hand-crafted features, and that by exploiting the relationships among attributes, the proposed deep multi-attribute recognition model achieves state-of-the-art results.
  2. Pose-guided deep multi-attribute recognition model. Existing methods typically treat pedestrian attribute recognition as an end-to-end multi-label classification problem and usually ignore the relationships between pedestrian attributes and body structure. In this thesis, we propose a pose-guided deep model to recognize pedestrian attributes, which better utilizes these relationships, for example the fact that sunglasses appear only at the head. The proposed model first discovers pose-related body regions, then learns region-based features, and finally combines the features of all regions for attribute recognition. In the test stage, we fuse the scores from the full body and from the body part regions as the final results. Experimental results show that the proposed model recognizes local attributes much better and that the fused results are near state of the art on several popular datasets.
  3. Deep context-aware features over body and latent parts for person ReID. Although person ReID has made great progress in recent years, how to extract a powerful feature representation that handles pose variation and background clutter is still an important problem. In this thesis, we propose a deep multi-scale context-aware network to better learn fine-grained feature representations. To handle pose variation and cluttered backgrounds, we propose a latent part localization model, which learns body parts adaptively without ground-truth part supervision; the localized body parts are then used to obtain a part-based pedestrian representation. Finally, the feature representations from the full body and the body parts are fused for person ReID. Experimental results show that the latent-part representation can partially handle pose variation and cluttered backgrounds, and that the fused representation obtains much better results on popular person ReID datasets.
  4. A richly annotated pedestrian dataset. Pedestrian attribute recognition and person ReID, which are complementary to each other, are two core components of person retrieval. Existing public datasets typically focus on only one of these two problems. To further research on person retrieval, we propose a new richly annotated dataset that supports pedestrian attribute recognition, attribute-based person retrieval, and person ReID simultaneously, and we report baseline results on these three tasks. Furthermore, an instance-based metric for attribute recognition is introduced to better measure the consistency of the predictions of multiple attributes on the same image. Finally, some interesting problems, e.g., the joint feature learning of attribute recognition and ReID and cross-day person ReID, are explored to show the challenges and future directions in person retrieval.
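
Item 4 refers to an instance-based metric for attribute recognition without defining it. As a rough illustration, the sketch below computes the standard example-based multi-label accuracy, precision, recall, and F1, where each image is scored on its whole set of predicted attributes rather than one attribute at a time. The function name `instance_based_metrics` and the set-based input format are assumptions, and the thesis may define its metric differently.

```python
def instance_based_metrics(y_true, y_pred):
    """Example-based multi-label metrics averaged over images.

    y_true, y_pred: lists of sets of attribute names, one set per image.
    This follows the common example-based definitions; it is a sketch,
    not necessarily the metric defined in the thesis.
    """
    n = len(y_true)
    acc = prec = rec = 0.0
    for truth, pred in zip(y_true, y_pred):
        inter = len(truth & pred)
        union = len(truth | pred)
        acc += inter / union if union else 1.0   # both empty counts as correct
        prec += inter / len(pred) if pred else 0.0
        rec += inter / len(truth) if truth else 0.0
    acc, prec, rec = acc / n, prec / n, rec / n
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}

# Hypothetical annotations for two images.
truth = [{"female", "backpack"}, {"male"}]
pred = [{"female"}, {"male", "hat"}]
print(instance_based_metrics(truth, pred))
```

Because every image contributes a single score over all of its attributes, a prediction that gets most attributes of one person right is rewarded more than the same number of correct labels scattered across many images, which is the property that matters for retrieval.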
Pages: 1-124
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/22390
Collection: Graduates_Doctoral Theses
Corresponding author: Li Dangwei (李党伟)
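Recommended citation
GB/T 7714
Li Dangwei. Research on Key Problems of Pedestrian Attribute Recognition and Re-identification in Surveillance Scenarios [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2018.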
Files in this item
File name/size: Thesis_sig.pdf (8945 KB)
Document type: Thesis
Access type: Not yet open access
License: CC BY-NC-SA