神经网络结构快速搜索方法研究 (Research on Fast Neural Architecture Search Methods)
陈逸飞
Subtype: Master's thesis
Thesis Advisor: 黄凯奇
Date: 2022-05-25
Degree Grantor: 中国科学院自动化研究所 (Institute of Automation, Chinese Academy of Sciences)
Place of Conferral: 中国科学院自动化研究所
Degree Discipline: Computer Technology (计算机技术)
Keywords: deep learning; deep neural network; neural architecture search
Abstract

In recent years, deep neural networks have achieved excellent performance on many artificial intelligence tasks, which is largely attributable to well-designed network architectures. However, manually designing a neural network architecture requires not only specialized domain knowledge but also a great deal of trial and error and iteration. Relying on manual design alone therefore cannot satisfy the ever-growing and increasingly diverse application demands of the future. Neural Architecture Search aims to have a computer automatically design high-performing neural network architectures once a human specifies the network's inputs and outputs, with no or only limited intervention from human experts. However, current Neural Architecture Search algorithms still suffer from a series of problems such as high computational cost, low search efficiency, and poor generalization. This dissertation first reviews classical neural network architecture design, then builds on it to present a research framework for Neural Architecture Search and, within that framework, reviews representative earlier search algorithms. It then conducts in-depth research around the difficulties of reducing computational cost, improving search efficiency, and reducing human intervention. The main research contents and contributions of this dissertation are as follows:

(1) The search space determines the upper bound of Neural Architecture Search. Given its importance, this dissertation proposes a fine-grained search space for searching lightweight convolutional neural network architectures; compared with search spaces of the same type, it covers a larger search range and more possibilities. Furthermore, in differentiable Neural Architecture Search the search space is usually represented as a neural network with a special structure, the super-net. Since the super-net's structure not only affects the performance of the search algorithm but also largely determines its computational cost, this dissertation designs a super-net architecture for representing the above fine-grained search space whose training cost is comparable to that of an ordinary lightweight convolutional neural network, so as to reduce the computational cost of the search process as much as possible.

(2) Differentiable Neural Architecture Search can be modeled as a single-objective optimization problem with multiple constraints. This dissertation proposes a differentiable Neural Architecture Search algorithm based on the stochastic alternating direction method of multipliers for searching under multiple combinatorial constraints, and gives a lower bound on the algorithm's convergence speed. The algorithm can quickly find high-performing architectures that strictly satisfy the constraints, and it avoids introducing hand-crafted heuristic regularization terms during the search, thereby reducing human intervention in the search process.

(3) This dissertation proposes a One-Shot Neural Architecture Search algorithm based on layer-wise supervised learning. The algorithm trains the trainable parameters of the super-net with local loss functions and stacks the local modules of the super-net during training so that gradient signals are back-propagated recursively. This reduces representation shift in the super-net and thereby alleviates the ranking-disorder problem of current One-Shot Neural Architecture Search algorithms. In addition, the algorithm converges quickly and reduces the GPU memory overhead of super-net training by 47.4%.

Other Abstract

Recently, deep neural networks have achieved excellent performance in many artificial intelligence tasks, which is largely due to well-designed network architectures. However, manually designing neural network architectures requires not only specialized domain knowledge but also a burdensome trial-and-error process. It is therefore difficult to meet the growing and increasingly diverse demands of the future through manual design alone. Neural Architecture Search aims to automatically design neural network architectures with excellent performance, with little or no human expert intervention, once the inputs and outputs of the neural network are given. However, current search algorithms suffer from a series of issues such as high computational cost, low search efficiency, and poor generalization. This dissertation first reviews classical neural network architecture design, then presents a research framework for Neural Architecture Search and reviews representative prior search algorithms. Afterward, it carries out in-depth research on critical issues of Neural Architecture Search: how to reduce computational overhead, how to improve search efficiency, and how to reduce human intervention. The main research contents and contributions of this dissertation are as follows:

(1) Since the search space determines the upper limit of architecture search algorithms, this dissertation proposes a fine-grained search space for searching lightweight convolutional neural network architectures. Compared with search spaces of the same type, it offers a larger search range and more possibilities. Moreover, in Differentiable Architecture Search a dense super-net is constructed to cover all candidate architectures, and the super-net's architecture affects not only the performance of the search algorithm but also, to a large extent, its computational cost. To reduce the computational cost of the search process as much as possible, this dissertation designs a super-net architecture that represents the above fine-grained search space and whose training cost is comparable to that of a regular lightweight network.
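To make the super-net idea concrete, below is a minimal sketch of how a differentiable super-net edge mixes candidate operations using softmax-weighted architecture parameters, in the style of DARTS-like relaxations. The candidate set, class names, and channel sizes are illustrative assumptions, not the dissertation's actual fine-grained search space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative candidate operations for one edge of a super-net.
# A fine-grained search space would be much richer; this is only a sketch.
def make_candidates(channels):
    return nn.ModuleList([
        nn.Conv2d(channels, channels, 3, padding=1, bias=False),  # 3x3 conv
        nn.Conv2d(channels, channels, 5, padding=2, bias=False),  # 5x5 conv
        nn.Identity(),                                            # skip connection
    ])

class MixedOp(nn.Module):
    """One super-net edge: a softmax-weighted sum of all candidate ops."""
    def __init__(self, channels):
        super().__init__()
        self.ops = make_candidates(channels)
        # One architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

x = torch.randn(2, 16, 32, 32)
edge = MixedOp(channels=16)
print(edge(x).shape)  # torch.Size([2, 16, 32, 32])
```

Because every candidate operation of every edge is instantiated at once, the super-net's structure directly determines the training cost of the search, which is why the dissertation's super-net design focuses on keeping that cost near a single lightweight network.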

(2) Differentiable Architecture Search can be formulated as a single-objective optimization problem with multiple constraints. This dissertation proposes a differentiable search algorithm based on the stochastic alternating direction method of multipliers to search neural network architectures under multiple combinatorial constraints, and gives a lower bound on the convergence speed of the proposed search algorithm. The algorithm can quickly find architectures that strictly satisfy the constraints while achieving excellent performance. Moreover, it avoids introducing hand-crafted heuristic regularization terms into the search process, thereby greatly reducing human expert intervention.
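For context, a generic version of such a constrained formulation, together with the standard ADMM-style splitting used to handle it, can be written as follows. The exact objective, constraint set, and update rules in the dissertation may differ; this is a sketch of the standard technique, not the proposed algorithm itself.

```latex
\begin{align}
&\min_{\alpha} \ \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
\quad \text{s.t.} \quad g_i(\alpha) \le C_i, \ i = 1, \dots, m, \\
&\text{where } w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{\mathrm{train}}(w, \alpha).
\end{align}
% Introduce an auxiliary variable z restricted to the feasible set
% S = { alpha : g_i(alpha) <= C_i for all i } and a scaled dual variable u.
\begin{equation}
\mathcal{L}_{\rho}(\alpha, z, u) = f(\alpha) + \mathbb{I}_{\mathcal{S}}(z)
+ \frac{\rho}{2}\,\lVert \alpha - z + u \rVert_2^2
- \frac{\rho}{2}\,\lVert u \rVert_2^2
\end{equation}
% ADMM alternates: a (stochastic) gradient step on alpha, a projection of
% alpha + u onto S for z, and the dual update u <- u + alpha - z.
```

Handling the constraints through the projection step onto the feasible set is what removes the need for hand-tuned penalty or regularization terms in the objective.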

(3) This dissertation proposes a One-Shot architecture search algorithm based on layer-wise supervised learning, which trains the trainable parameters of the super-net with local loss functions and stacks the local super-net modules during training so that gradient signals are back-propagated recursively. The proposed algorithm reduces the representation shift in the super-net, thus significantly alleviating the ranking-disorder issue of current One-Shot Neural Architecture Search. In addition, the search algorithm achieves fast convergence and reduces the GPU memory overhead of super-net training by 47.4%.
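Below is a minimal sketch of local-loss (layer-wise) training for stacked blocks, assuming one auxiliary classifier per block and detached activations between blocks. LocalBlock and all hyperparameters here are hypothetical illustrations of the general technique, not the dissertation's exact scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalBlock(nn.Module):
    """A stage trained with its own local loss via an auxiliary head."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Auxiliary classifier providing the local supervision signal.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, num_classes)
        )

    def forward(self, x, target):
        feat = self.body(x)
        local_loss = F.cross_entropy(self.head(feat), target)
        return feat, local_loss

blocks = nn.ModuleList([LocalBlock(16, 10) for _ in range(3)])
opt = torch.optim.SGD(blocks.parameters(), lr=0.1)

x, y = torch.randn(8, 16, 32, 32), torch.randint(0, 10, (8,))
opt.zero_grad()
for block in blocks:
    x, loss = block(x, y)
    loss.backward()   # gradients stay local to this block and its head
    x = x.detach()    # stop gradients from flowing into earlier blocks
opt.step()
```

Because each block's computation graph is freed immediately after its local backward pass, only one block's activations need to be held in memory at a time, which illustrates where memory savings of the kind reported above can come from.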

Pages: 94
Language: Chinese (中文)
Document Type: 学位论文 (dissertation)
Identifier: http://ir.ia.ac.cn/handle/173211/48660
Collection: 毕业生_硕士学位论文 (Graduates: Master's Theses)
Recommended Citation
GB/T 7714
陈逸飞. 神经网络结构快速搜索方法研究[D]. 中国科学院自动化研究所, 2022.
Files in This Item:
File Name/Size: 学位论文.pdf (6338 KB) | DocType: 学位论文 (dissertation) | Access License: 限制开放 (restricted access), CC BY-NC-SA