面向大规模信息检索的哈希学习方法研究

CASIA OpenIR > 毕业生 > 博士学位论文

	面向大规模信息检索的哈希学习方法研究
	丁昆
	2017-05-24
学位类型	工学博士
中文摘要	随着互联网的发展，人类已经进入大数据时代。如何有效地存储和检索这些数据是发挥大数据高级价值中至关重要的一环。传统方法在对数据进行有效地压缩和检索，处理高维数据，利用更现代的机器学习方法改进性能等方面还存在很大不足。哈希学习由于其灵活性和有效性，逐渐成为大数据信息检索的重要方法，并在机器学习领域占得一席之地。虽然目前对于不同的应用场景、不同的问题已经提出了很多哈希方法，但仍有一些亟待解决的重要难题。例如，提高现有哈希方法进行大规模训练的能力，处理多种不同模态的数据，展开更深入的理论分析，等等。本文以大规模数据的信息检索为应用背景，开展哈希学习的研究，并取得了若干研究成果。本文的主要贡献如下： 1. 提出了一种基于近邻表示分解的KNN哈希方法。该方法通过最大化KNN分类器的分类精度学习紧凑的二值码。为降低训练过程中的存储和时间复杂度，提出一种分解的近邻表示方法，该方法将方形的近邻矩阵表示成两个瘦长的参数矩阵的乘积。这一表示不但具有明确的物理意义而且通过优化原始目标的下界目标显著降低了学习复杂度。实验证实了所提方法可以取得不错的哈希性能，同时显著减少训练时间。 2. 提出了一种排序保持跨模态哈希方法。该方法首次将排序保持方法论引入到跨模态哈希中。提出了一种基于回归的排序保持损失函数，并具有大间隔性质。为求解导出的优化问题，引入一个辅助二值矩阵，并使用交替优化技术求解。关于辅助二值矩阵的二值二次规划问题满足子模属性，可以使用图割快速求解。在三个公开数据集上的实验表明所提方法显著提高了跨模态检索的排序性能。 3. 提出了一种局部敏感的两步哈希方法。该方法使用基于局部敏感哈希的随机方法而不是基于优化的方法生成二值码。理论上证明了在一定条件下所提方法是一种局部敏感的哈希策略，具有语义保持属性，当使用哈希表进行查找时具有次线性时间复杂度。同时证明了非对称的编码方式优于对称的编码方式，这有助于解释为什么离散哈希方法在实际中表现得非常有效。在实验层面，验证了与现的两步方法相比所提方法不仅训练速度更快，而且可以取得与基于优化的两步哈希方法可比的检索性能。
英文摘要	With the rapid development of Internet, we human beings have entered the era of big data. How to effectively store and retrieve so many data is one of the most vital steps in exploring the high-level value of big data. Traditional methods have limited capability in effective compression and retrieval, dealing with high-dimensional data and employing more modern machine learning methods to improve performance. Learning-to-hash, due to its high flexibility and effectiveness, is becoming an important method for large-scale information retrieval and has a critical place in machine learning. Although lots of hashing methods have been developed for various application scenarios and for solving different problems, there are still some important issues remain unsolved, such as, improving the ability of handling large-scale training data, processing data of multiple modalities, carrying out more in-depth theoretical analyses, and so on. Taking large-scale information retrieval as the application background, this thesis studies learning-based hashing technique and obtains some new methods and results. The main contributions and novelties can be summarized as follows. 1. A KNN based hashing method is proposed, which learns compact binary codes by maximizing KNN classification precision. To reduce the storage and time complexity during training, a factorized neighborhood system is proposed, which approximates the square similarity matrix with the product of two parametric lanky similarity matrices. This representation not only has specific meaning but allows more efficient training by deriving a lower bound objective. Experiments demonstrate that the proposed method can obtain good hashing performance with remarkably reduced training time. 2. A rank-order preserving hashing method is proposed for efficient cross-modal retrieval, wherein the rank-order preserving methodology is first introduced into the area of cross-modal hashing. A novel regression-based rank-order preserving loss function is proposed, which has provable large margin property. To solve the involved optimization problem, an auxiliary binary matrix is introduced into the objective, with which the problem can be solved favourably in the alternating optimization framework. Specifically, the subproblem w.r.t. the new binary variable satisfies the submodularity, and thus it can be solved by graph-cuts efficiently. Experiments on three datasets demonstrate the proposed method significantly improves the existing ranking performance in cross-modal retrieval. 3. A locality-sensitive two-step hashing method is proposed. It uses locality-sensitive hashing based random methods instead of optimization based methods to generate binary codes. Theoretically, under certain conditions, the proposed method is in essence a locality-sensitive hashing scheme, it has semantics-preserving property and sub-linear query complexity for hash lookup. Meanwhile, it is also provable that asymmetric encoding strategy is better than symmetric strategy, which helps to explain why the discrete hashing technique works quite well in practice. Experimentally, we demonstrate that, compared to existing two-step methods, the proposed method can not only acquire faster training speed, but also obtain comparable retrieval performance.
关键词	哈系学习二值编码矩阵分解监督学习深度学习跨模态检索
学科领域	机器学习+数据挖掘
文献类型	学位论文
条目标识符	http://ir.ia.ac.cn/handle/173211/14746
专题	毕业生_博士学位论文
作者单位	中国科学院自动化研究所
推荐引用方式 GB/T 7714	丁昆. 面向大规模信息检索的哈希学习方法研究[D]. 北京. 中国科学院研究生院,2017.

条目包含的文件
文件名称/大小	文献类型	版本类型	开放类型	使用许可
丁昆博士论文-面向大规模信息检索的哈希学（14412KB）	学位论文		限制开放	CC BY-NC-SA