Research on New Deep Learning Models and Their Applications (深度学习新模型及其应用研究)
黄岩1,2
Subtype: Doctor of Engineering
Thesis Advisor: 王亮
2017-05
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Beijing
Keyword: Deep learning
Abstract
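In recent years, the rise of big data and the spread of high-performance computing have alleviated drawbacks of conventional deep neural networks such as their tendency to overfit and their high computational complexity. As a result, the powerful data representation capability of deep neural networks has been released, and the field has gradually developed into an independent area, namely deep learning. The rise of deep learning has directly driven rapid progress in many fields, including artificial intelligence, computer vision, pattern recognition, natural language processing, speech processing and robotics, and has attracted wide attention from government, industry, academia and research institutions. This thesis carries out a series of innovative studies on deep learning: to address shortcomings in the structural design of existing models, it proposes several new models that can be applied more effectively to multiple tasks in computer vision, pattern recognition and related fields. The specific work of this thesis is summarized as follows:
1. By converting the multi-label prediction problem into a multi-task learning problem, we propose a multi-task conditional Boltzmann machine for unconstrained multimodal learning. Within a single unified framework, it can simultaneously handle the problems that arise in multimodal data: missing data for some modalities, fusion of multi-source data, and modeling of class co-occurrence relations;
2. By introducing supervised information into the conventional high-order Boltzmann machine, we propose a deep conditional high-order Boltzmann machine and a corresponding discriminative learning algorithm. The model can accurately measure similarity relations between data, which makes it suitable for relation learning tasks that demand high discriminability, including face verification and action relation classification;
3. By replacing all fully connected operations with convolutions, we propose a fully convolutional recurrent neural network that reduces the number of model parameters from the order of millions to tens of thousands while preserving the spatial structure of the visual content. In video super-resolution, the model achieves high accuracy relative to comparable methods while improving test speed by two orders of magnitude;
4. We propose a matching method based on a selective multimodal recurrent network that selectively attends to the semantic instances contained in paired images and text, and dynamically fuses multiple local similarities to obtain a global similarity. The model explores context-based modeling of the visual attention mechanism and achieves very good results on cross-modal retrieval.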
Other Abstract
With the development of big data and high-performance computing, the major drawbacks of conventional deep neural networks, namely overfitting and high computational complexity, have been largely overcome. Since 2006, the powerful representation learning capability of deep neural networks has been uncovered, and this line of work has gradually become a new research area, namely deep learning. Deep learning has promoted rapid development in many related areas, including artificial intelligence, computer vision, pattern recognition, natural language processing, speech processing and robotics, and has drawn much attention from government, industry, universities and research institutions. In this thesis, we focus on developing novel deep learning models and applying them to various pattern recognition applications. The details are as follows:
1. By formulating the multi-label learning problem as a multi-task problem, we propose a deep multi-task conditional Boltzmann machine for unconstrained multimodal learning. In this framework, we can jointly deal with missing modality generation, multimodal fusion and label co-occurrence modeling.
2. By introducing class label information into conventional high-order Boltzmann machines, and studying the corresponding discriminative learning algorithms, we propose new deep conditional high-order Boltzmann machines for accurately measuring the similarity relation between pairwise data.
3. By replacing all the full connections with weight-sharing convolutions, we propose a fully convolutional version of the recurrent neural network, which can reduce the number of learnable parameters from millions to several thousand. In the application of video super-resolution, the model achieves better performance and much faster speed than existing methods.
4. We propose a selective multimodal long short-term memory network for image and sentence matching. The model can selectively attend to pairwise salient semantic instances, dynamically measure their local similarities, and aggregate all of them to obtain the global similarity. The model achieves good performance on two publicly available datasets.
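The parameter saving described in item 3 follows from weight sharing: a fully connected recurrent connection needs one weight per pair of hidden units, while a recurrent convolution reuses one small filter bank at every spatial position. The following is a minimal Python sketch of that arithmetic only. It counts hidden-to-hidden weights, ignores biases, and uses hypothetical layer sizes chosen for illustration; none of the numbers or function names come from the thesis.

```python
def fc_recurrent_params(height, width, channels):
    """Hidden-to-hidden weights of a fully connected recurrent layer whose
    hidden state is a flattened height x width x channels feature map."""
    units = height * width * channels
    return units * units  # one weight per (input unit, output unit) pair


def conv_recurrent_params(kernel, channels):
    """Hidden-to-hidden weights of a recurrent convolution whose
    kernel x kernel filters are shared across all spatial positions."""
    return kernel * kernel * channels * channels


# Hypothetical sizes (assumptions for illustration, not from the thesis).
fc = fc_recurrent_params(height=8, width=8, channels=16)
conv = conv_recurrent_params(kernel=3, channels=16)

print(f"fully connected recurrence: {fc:,} weights")
print(f"convolutional recurrence:   {conv:,} weights")
```

Under these assumed sizes the fully connected count is on the order of a million weights and the convolutional count is a few thousand. The convolutional count also does not depend on the spatial size of the hidden state, so it stays fixed as the input frames grow.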
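Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/14819
Collection: Graduates_Doctoral Theses
Affiliation: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences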
Recommended Citation
GB/T 7714
黄岩. 深度学习新模型及其应用研究[D]. 北京. 中国科学院大学,2017.
Files in This Item:
File Name/Size | DocType | Version | Access | License
Thesis.pdf (12794KB) | Thesis | | Not yet open access | CC BY-NC-SA