Broad Neural Architecture Search
丁子祥
2021-11
Pages: 140
Subtype: Doctoral
Abstract

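In recent years, deep learning has achieved great success in many fields, and the design of deep models has played a crucial role in that success. Many excellent hand-crafted models have therefore been proposed, achieving very high performance. However, designing models by hand is both time-consuming and error-prone. Neural architecture search (NAS) emerged to design network architectures automatically. The earliest NAS methods, however, incurred an enormous computational cost. In response, researchers have proposed many efficient NAS methods by improving the search space, the search strategy, and performance estimation. At present, all NAS methods use search spaces with a deep topology to discover high-performance architectures. Deep search spaces, however, suffer from two problems during search: 1) long single-step training time: NAS needs more time to train the search space on the proxy dataset; 2) low memory efficiency: on a given computing device, NAS cannot process more training data at once. Making the topology of the search space shallower effectively solves both problems, but degrades performance because the models used in the search and evaluation phases then differ too much.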
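The broad learning system uses a shallow, broad topology to achieve performance similar to or even higher than that of deep networks, and so addresses the problems above well. Inspired by the broad learning system, this thesis proposes three efficient broad neural architecture search methods that improve search efficiency while preserving the performance of the resulting models. First, three broad search spaces are designed to solve the two problems of deep search spaces, and they are optimized with a policy-gradient-based reinforcement learning algorithm. Second, a continuous relaxation strategy maps broad neural architecture search onto a continuous search space, further improving search efficiency. Third, the broad search space is redesigned, and an early stopping strategy improves search efficiency further. Finally, the three broad neural architecture search algorithms are validated on the CIFAR-10 and ImageNet datasets. The contributions of this thesis are summarized as follows: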
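1 Broad neural architecture search based on broad convolutional neural networks
To address the two problems of deep search spaces, a broad search space, the broad convolutional neural network, is proposed. With a shallower topology, this search space achieves classification performance similar to or even higher than that of deep search spaces. Multi-scale information fusion and knowledge embedding effectively improve the performance of the broad convolutional neural network. In addition, building on the broad search space and combining it with policy-gradient-based reinforcement learning, a broad neural architecture search method, BNAS-v1, is proposed. Experimental results show that BNAS-v1 ranks first in search efficiency among reinforcement-learning-based NAS methods, and the resulting models achieve high classification performance.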
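2 Differentiable broad neural architecture search
To address the unfair training problem that arises during search, a differentiable broad neural architecture search method, BNAS-v2, is proposed. A continuous relaxation strategy updates all candidate child networks simultaneously, which resolves the unfair training caused by single-path sample-and-update optimization and further improves the efficiency of broad neural architecture search. In addition, to mitigate the performance collapse caused by continuous relaxation, a confident learning rate is proposed and a partial channel connection strategy is introduced. Experimental results show that, compared with BNAS-v1, BNAS-v2 achieves higher classification accuracy while being 4 times more efficient.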
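3 Stacked broad neural architecture search
To address the loss of scale-information diversity in the broad convolutional neural network and the time-consuming design of knowledge embeddings, stacked BNAS is proposed. On the one hand, a stacked broad convolutional neural network that preserves all scale information is designed on the basis of the broad convolutional neural network. On the other hand, a differentiable knowledge embedding search algorithm is proposed to remove the time cost of hand-designing knowledge embeddings. Experimental results show that the stacked broad convolutional neural network achieves higher classification performance than the original broad convolutional neural network, and that the knowledge embedding search algorithm effectively eliminates redundant information in hand-crafted knowledge embeddings, reducing the parameter count while maintaining model accuracy.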

Other Abstract

Recently, deep learning has achieved great success in many fields, and deep model design plays an important role in that success. Consequently, a great number of excellent hand-crafted models have been proposed, delivering very high performance. However, designing deep models by hand is extremely time-consuming and error-prone. Neural Architecture Search (NAS) was proposed to design architectures automatically. However, vanilla NAS requires a huge computational cost. In response, researchers have proposed many efficient NAS approaches by improving the search space, search strategy, and performance estimation. At present, all NAS approaches employ a deep-topology search space to discover high-performance architectures. Nevertheless, the search procedure in a deep-topology search space has two issues: 1) time-consuming single-step training: NAS needs more time to train the search space on the proxy dataset; 2) inefficient memory use: NAS cannot process more training data simultaneously on a given computing device. A shallow-topology search space effectively solves both issues, but it leads to a performance drop due to the large model gap between the search and evaluation phases.

The broad learning system uses a shallow, broad topology to deliver performance similar to or even better than that of deep networks, and so can solve the above issues well. Inspired by the broad learning system, this thesis proposes three efficient Broad Neural Architecture Search (BNAS) approaches that improve search efficiency while avoiding a performance drop in the learned architecture. First, three broad search spaces are designed to solve the above two issues of deep search spaces, and a policy-gradient-based reinforcement learning algorithm is employed for architecture optimization. Next, continuous relaxation is used to transform the search space from discrete to continuous, further improving the efficiency of BNAS. Then, the broad search space is redesigned, and an early stopping strategy further improves the search efficiency of BNAS. Finally, CIFAR-10 and ImageNet are used to verify the performance of the BNAS approaches. The contributions of this thesis are summarized as follows:

1 Broad convolutional neural network based broad neural architecture search
A broad search space, dubbed the Broad Convolutional Neural Network (BCNN), is proposed to solve the above two issues of deep search spaces. Compared with a deep search space, BCNN obtains similar or better classification performance with a shallow topology. Furthermore, BNAS-v1 is proposed by combining the broad search space with policy-gradient-based reinforcement learning. Experimental results show that BNAS-v1 ranks first in efficiency among reinforcement-learning-based NAS approaches, and the learned architecture delivers satisfactory classification performance.
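As a rough illustration of policy-gradient-based architecture optimization in general (not the BNAS-v1 implementation, which the abstract does not detail), the sketch below runs REINFORCE over a tiny categorical search space. The operation names, the number of decisions, the moving-average baseline, and the toy reward that stands in for a child network's validation accuracy are all assumptions made for this example.

```python
import numpy as np

OPS = ["conv3x3", "conv5x5", "max_pool", "skip"]  # hypothetical candidate operations
NUM_EDGES = 3  # hypothetical number of decisions per architecture


def softmax(logits):
    """Row-wise softmax, shifted for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def toy_reward(arch):
    """Stand-in for validation accuracy (assumption): fraction of edges using op 0."""
    return float(np.mean(arch == 0))


def reinforce_step(logits, arch, reward, baseline, lr):
    """One REINFORCE update of the controller logits for a sampled architecture."""
    probs = softmax(logits)
    # For a softmax policy, d log pi(arch) / d logits = one_hot(arch) - probs.
    grad_log_prob = -probs
    grad_log_prob[np.arange(len(arch)), arch] += 1.0
    return logits + lr * (reward - baseline) * grad_log_prob


def reinforce_search(steps=300, lr=0.5, seed=0):
    """Sample architectures, score them, and update the controller."""
    rng = np.random.default_rng(seed)
    logits = np.zeros((NUM_EDGES, len(OPS)))  # controller parameters
    baseline = 0.0  # moving-average reward baseline to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        arch = np.array([rng.choice(len(OPS), p=p) for p in probs])
        reward = toy_reward(arch)
        logits = reinforce_step(logits, arch, reward, baseline, lr)
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


if __name__ == "__main__":
    print(np.round(reinforce_search(), 3))
```

In a real search, each reward requires training and evaluating a sampled child network, which is why the cost of a single training step in the search space dominates overall efficiency.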
2 Differentiable broad neural architecture search
Differentiable BNAS, named BNAS-v2, is proposed to solve the unfair training issue in the search procedure. BNAS-v2 uses continuous relaxation to update every candidate child network simultaneously, which solves the unfair training issue caused by single-path sample-and-update optimization and improves efficiency further. Furthermore, a confident learning rate and partial channel connections are employed to mitigate performance collapse, a side effect of continuous relaxation. Experimental results show that BNAS-v2 is 4× more efficient than BNAS-v1 and delivers better classification accuracy.
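To make the idea of continuous relaxation concrete, here is a minimal sketch that assumes the usual differentiable-NAS formulation: each edge outputs a softmax-weighted sum of all candidate operations, so every candidate contributes to every update. The candidate operations, the array shapes, and the simplified partial-channel variant (which routes only the first 1/k of the channels through the mixed operation) are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np


def softmax(x):
    """Softmax over a 1-D vector, shifted for numerical stability."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


# Hypothetical candidate operations acting on a (channels, length) feature map.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "zero": lambda x: np.zeros_like(x),
    "double": lambda x: 2.0 * x,
}


def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidate outputs."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def partial_channel_mixed_op(x, alpha, k=2):
    """Send only the first 1/k of the channels through the mixed op (simplified)."""
    split = x.shape[0] // k
    mixed = mixed_op(x[:split], alpha)
    # The remaining channels bypass the candidate operations unchanged.
    return np.concatenate([mixed, x[split:]], axis=0)


def derive_op(alpha):
    """Discretize after search: keep the candidate with the largest weight."""
    return list(CANDIDATE_OPS)[int(np.argmax(alpha))]


if __name__ == "__main__":
    x = np.ones((4, 3))
    alpha = np.log([0.2, 0.3, 0.5])  # architecture parameters for one edge
    print(mixed_op(x, alpha))
    print(partial_channel_mixed_op(x, alpha))
    print(derive_op(alpha))
```

Because the output is differentiable in `alpha`, the architecture parameters can be trained by gradient descent alongside the network weights; processing fewer channels per edge reduces the memory needed for each mixed operation.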
3 Stacked broad neural architecture search
Stacked BNAS is proposed to solve two issues of BCNN: 1) loss of scale-information diversity; 2) time-consuming knowledge embedding design. On the one hand, stacked BCNN is proposed on the basis of vanilla BCNN to preserve all scale information. On the other hand, a differentiable Knowledge Embedding Search (KES) algorithm is proposed to address the time-consuming design of knowledge embeddings. Experimental results show that stacked BCNN delivers better classification performance than vanilla BCNN, and that KES effectively reduces the redundant information in hand-crafted knowledge embeddings, so that the parameter count of stacked BCNN can be reduced without performance loss.

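Keyword: Neural Architecture Search; Broad Convolutional Neural Network; Broad Neural Architecture Search
Language: Chinese
Sub direction classification: Reinforcement and Evolutionary Learning
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/46582
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Deep Reinforcement Learning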
Recommended Citation
GB/T 7714
丁子祥. 宽度神经架构搜索[D]. 中国科学院自动化研究所智能化大厦三层. 中国科学院大学人工智能学院,2021.
Files in This Item:
File Name/Size | DocType | Version | Access | License
宽度神经架构搜索-签字版.pdf (5152KB) | Thesis | | Open Access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.