This thesis focuses on incorporating target prior knowledge into structured output learning models. The output of a structured output learning model is a structure of several interdependent output variables, so the interdependencies among the output variables play a key role in the whole model. Target prior knowledge describing the output dependencies helps the structured model construct more reliable dependencies and thus enhances the model's performance. Since prior knowledge is always dependent on the specific problem, this thesis studies three specific structured output learning problems, to reveal the correlation between target priors and structured output learning models: video processing, multi-label learning, and a coupled conditional random field model. The main contributions of this thesis are highlighted in three aspects:

1. We find that two related structured output learning models can provide additional target prior knowledge to each other, so that if they are jointly learned, the overall performance of both models can be enhanced. To formulate this idea, we present a coupled hidden Markov random field model (CHMRF), which couples two different HMRF models based on the dependency between them. Specifically, we study two related problems in video processing: face clustering and face tracking. Several types of target prior knowledge in videos are explored, including spatiotemporal knowledge, the example/constraint smoothness assumption, the dependencies between cluster labels and tracklet associations, and the constraints among tracklet associations. This target prior knowledge is systematically incorporated into the CHMRF model. Furthermore, based on CHMRF, we formulate the joint problem of simultaneous clustering and tracklet linking as a Bayesian inference problem, which can be solved effectively by a coordinate descent algorithm.
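The alternating structure of such a coordinate descent can be illustrated with a minimal sketch. This is not the thesis's actual CHMRF inference (which alternates MAP estimates of the two coupled HMRFs); here a k-means-style alternation stands in for the two sub-problems, and all names and data are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 20 face examples with 5-dim appearance features.
X = rng.normal(size=(20, 5))
K = 3  # assumed number of identity clusters

def update_labels(X, centers):
    """Fix one block of variables (the 'tracklet' side) and solve the
    clustering sub-problem: assign each example to its nearest center."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def update_centers(X, labels, K):
    """Fix the cluster labels and re-estimate the other block of
    variables (here simply the per-cluster mean parameters)."""
    return np.stack([
        X[labels == k].mean(axis=0) if (labels == k).any()
        else X[rng.integers(len(X))]          # reseed an empty cluster
        for k in range(K)
    ])

centers = X[rng.choice(len(X), K, replace=False)]
for _ in range(10):  # alternate the two sub-problems
    labels = update_labels(X, centers)
    centers = update_centers(X, labels, K)
```

Each pass holds one block of variables fixed while optimizing the other, which is the same descent pattern the joint clustering-and-linking formulation relies on.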
2. In multi-label learning, we study a general setting: multi-label learning with missing labels (MLML). MLML assumes that the class labels of training examples are only partially provided, while the other labels are missing. Positive, negative, and missing labels are explicitly distinguished in our MLML setting, so the label bias of treating missing labels as negative, which often occurs in existing works, is avoided. Furthermore, based on label consistency and example-/class-level label smoothness, we formu...
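The three-state label encoding described above can be sketched as follows. This is an illustrative assumption about the encoding (a signed label matrix with 0 for missing entries), not the thesis's formulation; the loss function and all names are hypothetical:

```python
import numpy as np

# Hypothetical MLML label matrix for 4 examples and 3 classes:
# +1 = observed positive, -1 = observed negative, 0 = missing.
Y = np.array([[ 1, -1,  0],
              [ 0,  1, -1],
              [-1,  0,  0],
              [ 1,  1, -1]])

observed = Y != 0   # mask of provided labels
missing = ~observed # entries the model must infer, not assume negative

def observed_only_loss(scores, Y):
    """Squared loss restricted to observed (nonzero) labels, so missing
    entries exert no 'negative' pressure on the model."""
    mask = (Y != 0)
    return ((scores - Y)[mask] ** 2).sum() / mask.sum()

scores = np.zeros_like(Y, dtype=float)  # dummy predictor output
loss = observed_only_loss(scores, Y)
```

Collapsing the 0 and -1 states would reintroduce exactly the missing-as-negative bias that the explicit three-state setting is designed to avoid.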