Research on New Deep Learning Models and Their Applications

	Research on New Deep Learning Models and Their Applications
	Huang Yan (黄岩)1,2
	2017-05
Degree type	Doctor of Engineering
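Abstract (translated from the Chinese)	In recent years, the rise of big data and the spread of high-performance computing have alleviated the tendency to overfit and the high computational complexity of traditional deep neural networks. This has unlocked the strong data representation capability of deep neural networks, which have gradually developed into a field of their own, namely deep learning. The rise of deep learning has directly driven rapid progress in artificial intelligence, computer vision, pattern recognition, natural language processing, speech processing, robotics and other fields, and has attracted wide attention from government, industry, academia and research institutions. This thesis presents a series of new studies on deep learning: to address shortcomings in the structural design of existing models, it proposes several new models that can be applied more effectively to multiple tasks in computer vision, pattern recognition and related fields. The specific contributions are summarized as follows: 1. By converting multi-label prediction into a multi-task learning problem, we propose a multi-task conditional Boltzmann machine for unconstrained multimodal learning, which handles missing modality data, multi-source data fusion and label co-occurrence modeling in multimodal data within a single unified framework. 2. By introducing supervision into the traditional high-order Boltzmann machine, we propose a deep conditional high-order Boltzmann machine together with a corresponding discriminative learning algorithm. The model can accurately measure similarity relations between data and is therefore suited to relation learning tasks that demand high discriminability, including face verification and action relation classification. 3. By replacing all fully connected operations with convolutions, we propose a fully convolutional recurrent neural network that reduces the number of model parameters from the millions to tens of thousands while preserving the spatial structure of visual content. In video super-resolution, the model achieves high accuracy compared with other methods of its kind while being two orders of magnitude faster at test time. 4. We propose a matching method based on a selective multimodal recurrent network that selectively attends to the semantic instances contained in paired images and text and dynamically fuses multiple local similarities to obtain a final global similarity. The model explores context-based modeling of visual attention and achieves good results on cross-modal retrieval.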
English abstract	With the development of big data and high-performance computing, the major drawbacks of conventional deep neural networks, namely overfitting and high computational complexity, have been largely overcome. Since 2006, the powerful representation learning capability of deep neural networks has been unlocked, and the field has gradually become a research area in its own right, namely deep learning. Deep learning has driven rapid progress in many related areas, including artificial intelligence, computer vision, pattern recognition, natural language processing, speech processing and robotics, and has drawn much attention from government, industry, academia and research institutions. In this thesis, we focus on developing novel deep learning models and applying them to various pattern recognition applications. The details are as follows: 1. By formulating the multi-label learning problem as a multi-task problem, we propose a deep multi-task conditional Boltzmann machine for unconstrained multimodal learning. In this framework, we can jointly deal with missing modality generation, multimodal fusion and label co-occurrence modeling. 2. By introducing class label information into conventional high-order Boltzmann machines and studying the corresponding discriminative learning algorithms, we propose new deep conditional high-order Boltzmann machines for accurately measuring the similarity relation between pairwise data. 3. By replacing all full connections with weight-sharing convolutions, we propose a fully convolutional version of the recurrent neural network, which can reduce the number of learnable parameters from millions to several thousand. In the application of video super-resolution, the model achieves better performance and much faster speed than existing methods. 4. We propose a selective multimodal long short-term memory network for image and sentence matching. The model can selectively attend to pairwise salient semantic instances, dynamically measure their local similarities, and aggregate them to obtain the global similarity. The model achieves good performance on two publicly available datasets.
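Keywords	Deep learning
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14819
Collection	Graduates_Doctoral Dissertations
Author affiliation	1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Huang Yan. Research on New Deep Learning Models and Their Applications[D]. Beijing: University of Chinese Academy of Sciences, 2017.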

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (12794KB)	Thesis		Restricted access	CC BY-NC-SA