Research on Person Re-identification Methods in Complex Scenarios

Title	Research on Person Re-identification Methods in Complex Scenarios
Author	徐博强
Date	2023-05-19
Pages	126
Degree type	Doctoral
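Chinese abstract (translated)	As smart-city construction advances, large numbers of surveillance devices have been deployed in public places, forming a vast distributed surveillance network that generates large volumes of video data. Person re-identification combines methods from computer vision, machine learning, and pattern recognition: it extracts appearance features from pedestrian images in surveillance footage and compares feature similarity in order to associate trajectory images of the same pedestrian across different cameras. The technology has broad application prospects in criminal investigation, person retrieval, and human-computer interaction. Compared with handcrafted features, deep-learning-based person re-identification methods use deep neural networks to extract more discriminative pedestrian features and achieve excellent performance on public academic datasets. In practice, however, person re-identification models often face complex conditions such as domain shift, low illumination, and occlusion. Specifically, the following problems may arise:
(1) Conventional person re-identification models usually extract features from a pedestrian's clothing and judge identity on that basis. In schools, factories, banks, and similar settings, people wear similar clothes, so identity can no longer be judged from clothing features and the accuracy of conventional models drops sharply. Re-identifying pedestrians accurately in this setting challenges the discriminative ability of the model.
(2) In practice, person re-identification models are applied across many scenes, including schools, shopping malls, stations, and airports. When a model is deployed in a domain it was not trained on, the large differences between domains cause its performance to degrade significantly, sometimes to the point of being unusable. This challenges the generalization and domain adaptability of the model.
(3) In real application scenarios, pedestrians are frequently occluded by other people or by objects. Occluders introduce noise into feature extraction and feature matching and cause part of the pedestrian's features to be missing, which lowers the accuracy of occluded person re-identification. This challenges the robustness of the model.
This thesis studies person re-identification with respect to these challenges. Its main work and contributions are as follows.
For fine-grained person re-identification, where people wear similar clothes and clothing features are no longer reliable, the thesis proposes a person re-identification algorithm that uses both head-shoulder features and color-independent features to help determine identity. The algorithm locates the head-shoulder region with a lightweight head-shoulder segmentation layer and extracts the corresponding features. It also uses instance normalization to learn color-independent features and an attention mechanism to mine richer discriminative cues, so that the extracted color-independent features remain strongly representative. The thesis builds FG-reID, a dataset for studying fine-grained person re-identification, and verifies the effectiveness of the algorithm through comparison with other methods.
For cross-domain person re-identification, the thesis proposes a person re-identification algorithm based on a multi-expert network. Conventional multi-expert networks face two problems. First, because each source domain requires its own expert branch, the number of model parameters grows substantially as the number of source domains increases. Second, most multi-expert networks use only domain-specific features and ignore domain-invariant features. To address these two problems, all expert networks in the proposed algorithm share every parameter except those of the batch normalization layers, so the parameter count stays relatively stable even as the number of source domains grows. The algorithm also uses instance normalization to extract domain-invariant features, and it measures the relevance of a target sample to each source domain by the distance between that source domain's batch normalization statistics and the test sample's instance normalization statistics, which allows it to fuse the features of the expert networks adaptively. In addition, the thesis designs a consistency loss function and an episodic training algorithm to improve generalization. On the three test protocols designed in the thesis, the algorithm outperforms existing baselines and other related work, which verifies its effectiveness.
For occluded person re-identification, the thesis identifies two main challenges. First, occlusion introduces noise during feature matching and lowers matching accuracy. Second, occlusion causes part of the pedestrian's features to be missing, which weakens the representational power of the extracted features. To address these two problems, the thesis proposes a person re-identification algorithm composed mainly of a graph matching module and an occluded-feature recovery module. To reduce interference from occlusion during feature matching, the graph matching module focuses on the regions that are visible in both images. To address the missing features, the occluded-feature recovery module reconstructs the features of the occluded parts of the query image from the features of the query image's k-nearest neighbors. On four occlusion datasets, the algorithm outperforms existing baselines and other related work, which verifies its effectiveness.
In summary, the thesis studies in depth the problems of similar clothing, domain shift, and occlusion that person re-identification models may face in real, complex scenes, and proposes several effective methods that improve the discriminative ability, generalization ability, and robustness of these models, strengthening their practical applicability and advancing research on person re-identification.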
English abstract	With the continued development of smart cities, large numbers of surveillance devices have been deployed in public places, forming a massive distributed surveillance network that generates large amounts of video data. Person re-identification combines methods from computer vision, machine learning, and pattern recognition: it extracts appearance features from pedestrian images in surveillance footage and compares feature similarity to associate trajectory images of the same pedestrian across different cameras. It has a wide range of application prospects in scenarios such as criminal investigation, person retrieval, and human-computer interaction. Compared with handcrafted features, person re-identification methods based on deep learning use deep neural networks to extract more discriminative pedestrian features and have demonstrated outstanding performance on publicly available academic datasets. In practical applications, however, person re-identification models often face complex conditions such as domain shift, low lighting, and occlusion, and may encounter the following issues:
(1) Conventional person re-identification models usually extract features based on a pedestrian's clothing and then judge the pedestrian's identity. In scenarios such as schools, factories, and banks, people may wear similar clothing, so the accuracy of conventional models that rely on clothing features decreases significantly. Recognizing pedestrians accurately under such circumstances poses a challenge to the discriminative ability of person re-identification models.
(2) In practical applications, person re-identification models are used across multiple scenarios such as schools, shopping malls, stations, and airports. When a model is deployed in a domain it was not trained on, the large differences between domains may cause a significant decline in performance or even make the model unusable, which poses a challenge to the generalization and domain adaptation of person re-identification models.
(3) In real-world applications, pedestrians are often occluded by other people or by objects. Occlusion introduces noise into the feature extraction and matching processes and causes some of the pedestrian's features to be missing, which reduces the accuracy of occluded person re-identification and poses a challenge to the robustness of person re-identification models.
This thesis addresses these challenges in person re-identification. Its main contributions and innovations are as follows.
To address the fine-grained person re-identification problem caused by people wearing similar clothing, where clothing features are no longer reliable, this thesis proposes a person re-identification framework that uses both head-shoulder features and color-independent features to assist in identifying pedestrians. The framework uses a lightweight head-shoulder segmentation layer to locate the head-shoulder region and extract the corresponding features. It also employs instance normalization to learn color-independent features and an attention mechanism to mine richer discriminative cues, ensuring that the extracted color-independent features have strong representational power. The thesis establishes a dataset called FG-reID for studying fine-grained person re-identification and validates the effectiveness of the proposed framework through comparison with other methods.
To address the cross-domain person re-identification problem, this thesis proposes a person re-identification framework based on multiple expert networks. Traditional multi-expert networks face two issues. The first is that the parameter count of the entire model increases significantly with the number of source domains, because each source domain requires a corresponding expert network branch. The second is that most multi-expert networks use only domain-specific features and ignore domain-invariant features. To address these two issues, all expert networks in the proposed framework share all parameters except those of the batch normalization layers, which keeps the parameter count relatively stable even when the number of source domains increases. The framework also uses instance normalization to extract domain-invariant features, and it measures the correlation between a target sample and each source domain by the distance between the batch normalization statistics of that source domain and the instance normalization statistics of the test sample, thereby adaptively fusing features from the expert networks. Additionally, this thesis designs a consistency loss function and an episodic training algorithm to increase the model's generalization ability. The proposed framework outperforms existing baseline methods and other related work on the three test protocols designed in this thesis, validating its effectiveness.
To address the occluded person re-identification problem, this thesis identifies two main challenges. The first is that occlusion introduces noise into the feature matching process, which reduces matching accuracy. The second is that occlusion causes some of the pedestrian's features to be missing, which reduces the representational power of the extracted features. To address these two issues, this thesis proposes a person re-identification framework composed mainly of a graph matching module and an occluded-feature recovery module. To reduce the interference of occlusion during feature matching, the graph matching module focuses mainly on the regions that are visible in both images. To address the missing features, the occluded-feature recovery module uses the features of the query image's k-nearest neighbors to reconstruct the features of the occluded parts of the query image. The proposed framework outperforms existing baseline methods and other related work on four occlusion datasets, validating its effectiveness.
In summary, this thesis conducts in-depth research on the problems of similar clothing, domain shift, and occlusion that person re-identification models may face in complex real-world scenarios, and proposes several effective methods that improve the discriminative ability, generalization ability, and robustness of person re-identification models, strengthening their practical applicability and advancing research on person re-identification.
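The adaptive fusion step described in the abstract weights each expert by how close the test sample's instance normalization statistics are to that source domain's batch normalization statistics. It could look roughly like the NumPy sketch below. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the distance measure or the weighting function, so the Euclidean distance over concatenated mean and variance vectors, the softmax weighting, and all names (`instance_stats`, `fusion_weights`, `fuse_expert_features`) are hypothetical.

```python
import numpy as np


def instance_stats(feature_map):
    """Per-channel mean and variance of one sample's (C, H, W) feature map."""
    flat = feature_map.reshape(feature_map.shape[0], -1)
    return flat.mean(axis=1), flat.var(axis=1)


def fusion_weights(feature_map, bn_means, bn_vars):
    """Weight each source-domain expert by closeness of statistics.

    bn_means, bn_vars: (K, C) arrays holding the batch normalization
    running statistics of K source domains.
    Assumption: Euclidean distance on concatenated (mean, variance)
    vectors, turned into weights with a softmax over negative distances.
    """
    mu, var = instance_stats(feature_map)
    sample = np.concatenate([mu, var])                      # (2C,)
    domains = np.concatenate([bn_means, bn_vars], axis=1)   # (K, 2C)
    dist = np.linalg.norm(domains - sample, axis=1)         # (K,)
    logits = -dist
    logits = logits - logits.max()                          # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def fuse_expert_features(expert_features, weights):
    """Weighted sum of K expert embeddings: (K, D) and (K,) -> (D,)."""
    return weights @ expert_features
```

With this weighting, a test sample whose statistics resemble one source domain draws most of its fused feature from that domain's expert, while a sample far from every source domain receives a flatter mixture.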
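Keywords	person re-identification; fine-grained retrieval; cross-domain retrieval; occluded person re-identification
Language	Chinese
Seven major research directions: sub-direction	Object detection, tracking, and recognition
State Key Laboratory planned research direction	Visual information processing
Associated dataset to be deposited	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52316
Collection	Graduates_Doctoral theses
Recommended citation (GB/T 7714)	徐博强. 复杂场景下的行人再识别方法研究[D],2023.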

Files in this item
File name/size	Document type	Version type	Access type	License
博士论文.pdf (8990KB)	Thesis		Restricted access	CC BY-NC-SA