English Abstract

Recently, deep learning has achieved great success in many fields, and deep model design plays an important role in that success. Consequently, a great number of excellent hand-crafted models have been proposed and deliver very high performance. However, designing deep models by hand is extremely time-consuming. Neural Architecture Search (NAS) was proposed to design architectures automatically, but vanilla NAS requires a huge computational cost. To address this, researchers have proposed many efficient NAS approaches that improve the search space, the search strategy, and performance estimation. At present, all NAS approaches employ a deep-topology search space to discover high-performance architectures.
Nevertheless, the search procedure over a deep-topology search space suffers from two issues: 1) time-consuming single-step training: NAS needs considerable time to train the search space on a proxy dataset; 2) memory inefficiency: NAS cannot process more training data simultaneously on a given computing device. A shallow-topology search space can effectively solve the above two issues, but it can also lead to a performance drop due to the large model gap between the search and evaluation phases.
The Broad Learning System employs a shallow, broad topology to deliver performance similar to or even better than that of deep networks, so it is well suited to solving the above issues. Inspired by the Broad Learning System, this thesis proposes three efficient Broad Neural Architecture Search (BNAS) approaches that improve search efficiency while avoiding a performance drop in the learned architecture. First, three broad search spaces are designed to solve the above two issues of the deep search space, and a policy-gradient-based reinforcement learning algorithm is employed for architecture optimization. Next, a continuous relaxation strategy is used to transform the search space from discrete to continuous, so that the efficiency of BNAS can be improved further. Then, the broad search space is redesigned, and the search efficiency of BNAS is further improved via an early stopping strategy. Finally, CIFAR-10 and ImageNet are used to verify the performance of the BNAS approaches. The contributions of this thesis are summarized as follows:
1 Broad convolutional neural network based broad neural architecture search
A broad search space dubbed Broad Convolutional Neural Network (BCNN) is proposed to solve the above two issues of the deep search space. Compared with a deep search space, BCNN can obtain similar or better classification performance with a shallow topology. Furthermore, BNAS-v1 is proposed by combining the broad search space with policy-gradient-based reinforcement learning. Experimental results show that BNAS-v1 is the most efficient among reinforcement-learning-based NAS approaches, and the learned architecture delivers satisfactory classification performance.
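To make the optimization concrete, the following is a minimal sketch of policy-gradient (REINFORCE) architecture search in the spirit of BNAS-v1. The search-space dimensions NUM_NODES and NUM_OPS, the controller form, and the reward placeholder are illustrative assumptions, not the thesis implementation; in practice the reward would be the validation accuracy of the sampled broad network.

    import torch
    import torch.nn as nn

    NUM_NODES, NUM_OPS = 5, 7  # hypothetical search-space dimensions

    class Controller(nn.Module):
        """Policy over architectures: one categorical op choice per node."""
        def __init__(self):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(NUM_NODES, NUM_OPS))

        def sample(self):
            dist = torch.distributions.Categorical(logits=self.logits)
            ops = dist.sample()  # sampled architecture: one op per node
            return ops, dist.log_prob(ops).sum()

    def reward_of(ops):
        # Placeholder: train the sampled broad network briefly and
        # return its validation accuracy as the reward.
        return torch.rand(()).item()

    controller = Controller()
    opt = torch.optim.Adam(controller.parameters(), lr=3e-4)
    baseline = 0.0  # moving-average baseline reduces gradient variance

    for step in range(100):
        ops, log_prob = controller.sample()
        reward = reward_of(ops)
        baseline = 0.9 * baseline + 0.1 * reward
        loss = -(reward - baseline) * log_prob  # REINFORCE update
        opt.zero_grad()
        loss.backward()
        opt.step()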
2 Differentiable broad neural architecture search
A differentiable BNAS named BNAS-v2 is proposed to solve the unfair training issue in the search procedure. BNAS-v2 employs a continuous relaxation strategy to update every candidate child network at each step, which solves the unfair training issue caused by the single-path sample-and-update optimization manner and yields a larger efficiency improvement. Furthermore, both a confident learning rate and partial connection are employed to mitigate performance collapse, a consequent issue of continuous relaxation. Experimental results show that BNAS-v2 is 4× more efficient than BNAS-v1 and delivers better classification accuracy.
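The following is a minimal sketch of the two ingredients named above: continuous relaxation (a softmax mixture over candidate operations, so every candidate receives gradients at every step) combined with a partial connection that routes only a fraction of the channels through the candidate operations. The candidate operation set and the 1/4 channel split are illustrative assumptions, not the exact BNAS-v2 configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MixedOp(nn.Module):
        """Softmax-weighted sum of candidate ops on 1/k of the channels."""
        def __init__(self, channels, k=4):
            super().__init__()
            self.k = k
            c = channels // k  # only this slice enters the candidate ops
            self.ops = nn.ModuleList([
                nn.Conv2d(c, c, 3, padding=1, bias=False),  # candidate op 1
                nn.Conv2d(c, c, 5, padding=2, bias=False),  # candidate op 2
                nn.AvgPool2d(3, stride=1, padding=1),       # candidate op 3
                nn.Identity(),                              # candidate op 4
            ])
            # Architecture parameters: the softmax over alpha relaxes the
            # discrete op choice, so all candidates are trained fairly.
            self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

        def forward(self, x):
            c = x.size(1) // self.k
            active, bypass = x[:, :c], x[:, c:]
            weights = F.softmax(self.alpha, dim=0)
            mixed = sum(w * op(active) for w, op in zip(weights, self.ops))
            return torch.cat([mixed, bypass], dim=1)  # bypassed channels

    x = torch.randn(2, 16, 32, 32)
    print(MixedOp(16)(x).shape)  # torch.Size([2, 16, 32, 32])

Because the bypassed channels skip the candidate operations entirely, memory use per step drops roughly by the factor k, which is what makes the partial connection attractive on memory-constrained devices.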
3 Stacked broad neural architecture search
Stacked BNAS is proposed to solve two issues of BCNN: 1) loss of scale-information diversity; 2) time-consuming knowledge embedding design. On the one hand, a stacked BCNN is proposed based on the vanilla BCNN to preserve information at all scales. On the other hand, a differentiable Knowledge Embedding Search (KES) algorithm is proposed to eliminate the time-consuming manual design of knowledge embeddings. Experimental results show that the stacked BCNN delivers better classification performance than the vanilla BCNN, and that KES effectively reduces the redundant information in hand-crafted knowledge embeddings, so that the parameter count of the stacked BCNN can be reduced without performance loss.
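The abstract does not detail KES, but one plausible reading is a differentiable gate over the dimensions of a hand-crafted knowledge embedding, trained jointly with the task loss plus a sparsity penalty so that redundant dimensions are pruned. The sigmoid gating and the L1 penalty below are assumptions made for illustration only, not the thesis's KES algorithm.

    import torch
    import torch.nn as nn

    class GatedEmbedding(nn.Module):
        """Learnable soft keep/drop decision for each embedding dimension."""
        def __init__(self, dim=64):
            super().__init__()
            self.gate_logits = nn.Parameter(torch.zeros(dim))

        def forward(self, emb):
            gates = torch.sigmoid(self.gate_logits)  # in (0, 1) per dim
            return emb * gates, gates

    ge = GatedEmbedding(dim=64)
    emb = torch.randn(8, 64)             # dummy hand-crafted embedding
    out, gates = ge(emb)
    task_loss = out.pow(2).mean()        # stand-in for the real task loss
    sparsity = 1e-2 * gates.abs().sum()  # pushes redundant gates toward 0
    (task_loss + sparsity).backward()

After training, dimensions whose gates fall below a threshold can be removed outright, which matches the abstract's claim that KES reduces the parameter count of the stacked BCNN without performance loss.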