Research on Neural Architecture Search Methods and Their Application in Meteorological Forecasting
张新邦
2022-05-24
Pages: 136
Degree type: Doctoral
Chinese Abstract

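Neural architecture search (NAS) aims to automatically generate network architectures for specific application scenarios and is a core research problem in automated machine learning. It breaks with the design practice of manually stacking building blocks and greatly extends the adaptability of deep learning methods in open environments. However, existing search methods still suffer from the following problems: (1) low search efficiency and high search cost; (2) a search space restricted by fixed-length encoding; and (3) a discrepancy in network architecture representation between the search and validation phases. These problems pose major challenges to the application of NAS methods. From the application perspective, most existing NAS methods target computer vision tasks, where the data come from a single source, have a fixed structure, and carry explicit semantics, which does not match the complex data characteristics of real-world scenarios. Taking meteorological forecasting as an example, meteorological big data exhibit long-term, large-scale spatio-temporal correlations, and the coupling among meteorological modalities such as pressure, temperature, humidity, and wind speed is complex, so manually designed network architectures cannot achieve satisfactory results. Therefore, for the specific application scenario of meteorological forecasting, the corresponding NAS methods need to be rebuilt and optimized.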

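To this end, this thesis studies NAS algorithms along two dimensions, methods and applications. Specifically, on the method side, new search strategies are proposed to address the shortcomings of existing methods and improve algorithm performance; on the application side, NAS methods are applied to meteorological forecasting in light of the characteristics of meteorological big data, the characteristics of the forecasting task, and the many challenging problems it faces. The main research content and contributions of this thesis are summarized as follows: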

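1. A neural architecture search method based on sparse representation is proposed. Its core idea is to merge network pruning with the architecture search task. Traditional search methods treat architecture optimization as a black-box optimization problem and must train and evaluate a large number of sub-network architectures during the search. This process is inefficient and cannot support architecture search on large datasets and complex visual computing tasks. In addition, most traditional search methods rely on fixed-length architecture encoding, which restricts the search space. To address this, starting from a supernet model that contains all network connections, this thesis first introduces architecture parameters that scale the network connections, which enlarges the search range; second, it introduces a gradient-based sparsity-constrained optimization method that efficiently prunes and optimizes the architecture parameters in the supernet; finally, the connections whose architecture parameters are nonzero make up the final network architecture. The proposed method breaks the barrier of fixed-length architecture encoding and improves search efficiency. Experiments on large image datasets and complex computer vision tasks verify its effectiveness.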

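2. A meteorological multi-modal feature fusion network based on NAS is proposed. Its core idea is to use the NAS framework to model the multi-modal coupling relationships in meteorological forecasting. Because the coupling among multi-modal meteorological elements such as pressure, temperature, humidity, and wind speed is complex, building a feature learning network that fuses modalities across physical quantities is difficult. To address this, within an attention-based encoder-decoder framework, a structural design strategy is proposed that decomposes the meteorological multi-modal feature fusion network into several modality feature extraction branches and one feature fusion branch. Considering the complexity of this framework and the performance and efficiency requirements of meteorological forecasting, the sparse-representation-based NAS method proposed in this thesis is applied within the framework to learn, in a purely data-driven manner, the feature interactions and network structures of the modality feature extraction branches and the feature fusion branch. On top of the multi-modal feature fusion network, an encoder-decoder network further extracts temporal meteorological features and predicts future meteorological states. Experimental results show that the proposed method achieves good performance gains on meteorological forecasting tasks.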

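3. A meteorological spatio-temporal feature extraction network based on NAS is proposed. Its core idea is to build explicit spatio-temporal dependencies for multi-source meteorological data under the NAS framework. Spatio-temporal dependencies in meteorological data are extremely complex. In the spatial dimension, the meteorological state of a given region is affected by conditions in neighboring regions. In the temporal dimension, meteorological data contain both short-term features from adjacent time steps and long-term periodic features. To address this, explicit temporal dependencies and a strategy that searches temporal and spatial structures simultaneously are built under the NAS framework, enabling meteorological forecasting based on the fusion of multi-source meteorological element data. Within this adaptive architecture learning framework, to make full use of the spatio-temporal correlations in meteorological data, first, an architecture probability constraint is introduced into the proposed sparse-representation-based search method, which effectively reduces the impact of the architecture representation discrepancy between search and validation; second, NAS is used to adaptively learn the structure of the local feature extraction module; finally, within the proposed multi-modal fusion network framework, the way temporal correlations in the meteorological data are combined is learned. The proposed network improves the extraction of long-term features and the modeling of multi-scale spatial information. Experimental results verify the effectiveness of the proposed method.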
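
The pruning idea behind the first contribution above (scale every candidate connection, impose sparsity, keep the connections whose scale stays nonzero) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the four sinusoidal "operations" are hypothetical stand-ins for real network operations, and the L1 penalty with soft-thresholding (proximal gradient descent) is a common sparse optimizer used as a stand-in, not necessarily the optimizer derived in the thesis.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def search_sparse_edges(features, targets, sparsity=0.05, step=1.0, iterations=200):
    """Learn one scaling factor per candidate connection by proximal gradient descent.

    features[i][k] is the output of candidate connection i on sample k.
    The loss is mean squared error / 2 plus sparsity * sum(|scale|).
    """
    n_ops, n = len(features), len(targets)
    scales = [0.5] * n_ops  # start with every connection active (the "supernet")
    for _ in range(iterations):
        residual = [
            sum(scales[i] * features[i][k] for i in range(n_ops)) - targets[k]
            for k in range(n)
        ]
        grads = [
            sum(residual[k] * features[i][k] for k in range(n)) / n
            for i in range(n_ops)
        ]
        # Gradient step on the smooth loss, then shrinkage from the L1 penalty.
        scales = [
            soft_threshold(s - step * g, step * sparsity)
            for s, g in zip(scales, grads)
        ]
    return scales

# Hypothetical candidate operations on a single edge (toy stand-ins).
NAMES = ["sin_x", "cos_x", "sin_2x", "cos_2x"]
OPS = [math.sin, math.cos, lambda x: math.sin(2 * x), lambda x: math.cos(2 * x)]

xs = [2 * math.pi * k / 16 for k in range(16)]
features = [[op(x) for x in xs] for op in OPS]
# Toy target that only needs two of the four candidate connections.
targets = [1.0 * math.sin(x) - 0.8 * math.sin(2 * x) for x in xs]

scales = search_sparse_edges(features, targets)
kept = [name for name, s in zip(NAMES, scales) if s != 0.0]
print(kept)
```

In this toy setup the two cosine connections are driven to exactly zero and dropped, so the final structure is read off from the nonzero scaling factors without training and evaluating each sub-network separately.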

English Abstract

Neural Architecture Search (NAS) aims to automatically generate suitable networks for specific tasks and is a core research problem in automated machine learning. NAS breaks with the current design paradigm of manually stacking blocks and improves the adaptability of deep learning methods in the open world. However, most existing NAS methods still face the following problems: (1) low efficiency and high computational cost; (2) a search space limited by fixed-length encoding; and (3) a gap between the architectures used during search and during validation. These problems pose great challenges to the application of NAS methods. On the application side, most existing NAS methods focus on computer vision tasks with a single data source, a fixed data structure, and explicit semantics, which is inconsistent with the complex data characteristics of real applications. Taking meteorological forecasting as an example, meteorological data exhibit long-term, large-scale spatio-temporal correlations and complex coupling between modalities such as pressure, temperature, humidity, and wind speed, so manually designed network architectures cannot achieve satisfactory results. Therefore, the architecture search method should be rebuilt and optimized for the meteorological forecasting task.

To address these problems, this thesis studies NAS methods and their applications in meteorological forecasting. Methodologically, novel search strategies are proposed to address the aforementioned shortcomings and improve the performance of NAS algorithms. On the application side, we study the use of NAS methods in meteorological forecasting, taking into account the characteristics of meteorological big data and the related challenging problems. The main contributions of this thesis are summarized as follows:

1. A novel neural architecture search method via direct sparse optimization is proposed. The motivation is to address the NAS problem from the perspective of model pruning. Traditional approaches treat the search as a black-box optimization problem, which means that numerous architectures must be trained and validated during the search process. As a result, the search cost on large datasets and complex computer vision tasks can be unaffordable. Furthermore, constrained by their architecture encoding, most existing methods are limited to a fixed-length search space. To solve these problems, we start from a completely connected block and introduce scaling factors to scale the information flow between operations, which expands the search space. Next, sparse regularization is imposed to prune useless connections in the architecture, and an efficient optimization method is derived to solve the resulting problem. Finally, the network structure is constructed from the connections whose scaling factors are nonzero. The proposed method overcomes the barrier of fixed-length encoding and is highly efficient. Experiments on large-scale datasets and complex computer vision tasks demonstrate the effectiveness of the proposed method.

2. A meteorological multi-modal feature fusion network based on NAS is proposed for the meteorological forecasting task. The motivation is to apply the NAS method to model the correlation between different meteorological modalities. The coupling among modalities such as pressure, temperature, humidity, and wind speed can be extremely complex in long-term meteorological forecasting, and how to integrate multi-modal, cross-physical-quantity information remains an open issue. To solve this problem, within an attention-based encoder-decoder framework, we propose a multi-modal fusion framework that consists of several modality feature learning branches and a feature fusion branch. Considering the complexity of the whole structure and the requirement for efficiency, we apply the proposed NAS method to learn, in a purely data-driven manner, both the scheme for fusing multi-modal information and the network structure. Built on the multi-modal feature fusion network, the encoder-decoder framework captures the temporal context and makes predictions. Experiments demonstrate the superiority of the proposed method.

3. A meteorological spatio-temporal feature extraction network based on NAS is proposed for the meteorological forecasting task. The motivation is to explicitly model the dependencies along both the spatial and temporal dimensions under the NAS framework. The spatio-temporal dependencies of meteorological data can be extremely complex. Spatially, meteorological data at a particular location are clearly correlated with those of neighboring locations. Temporally, meteorological data exhibit both short-term relationships between adjacent time steps and long-term periodic patterns. In this work, we propose to search spatial and temporal structures simultaneously under the NAS framework for the multi-modal meteorological forecasting task. To exploit the spatio-temporal correlation of meteorological data, an architecture distribution constraint is first introduced to bridge the gap between the architectures used during search and during validation. The proposed NAS method is then adopted to learn the structure of the local feature extraction module and the temporal fusion module. Equipped with the proposed multi-modal feature fusion network, the proposed spatio-temporal feature extraction network is capable of extracting long-term temporal features and modeling multi-scale spatial correlation. Our experiments demonstrate the effectiveness of the proposed method.
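
The two kinds of dependency named in the third contribution (spatial neighbors, and temporal context that is both short-term and periodic) can be made concrete with a small indexing sketch. It only shows which inputs such a network might draw from for one grid cell and one time step; the thesis learns these relations through architecture search rather than fixing them by hand. The function names, the 24-step period (hourly data with a daily cycle), and the window sizes are all hypothetical choices made for illustration.

```python
def temporal_context_indices(t, recent_steps=3, period=24, num_periods=2):
    """Return (short_term, periodic) lists of earlier time indices for time step t.

    short_term: the `recent_steps` steps immediately before t.
    periodic:   the same phase in the previous `num_periods` cycles.
    Indices that would fall before the start of the series are dropped.
    """
    short_term = [t - k for k in range(recent_steps, 0, -1) if t - k >= 0]
    periodic = [
        t - p * period for p in range(num_periods, 0, -1) if t - p * period >= 0
    ]
    return short_term, periodic

def spatial_neighbors(row, col, height, width, radius=1):
    """Return grid cells within `radius` of (row, col), clipped to the grid.

    The cell itself is excluded; cells are listed in row-major order.
    """
    cells = []
    for r in range(max(0, row - radius), min(height, row + radius + 1)):
        for c in range(max(0, col - radius), min(width, col + radius + 1)):
            if (r, c) != (row, col):
                cells.append((r, c))
    return cells

# Example: context for time step 50 at the corner cell of a 4 x 4 grid.
short_term, periodic = temporal_context_indices(50)
neighbors = spatial_neighbors(0, 0, height=4, width=4)
print(short_term, periodic, neighbors)
```

Early time steps and border cells simply receive fewer context entries, which keeps the sketch free of padding decisions that the source does not describe.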

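Keywords: neural architecture search; automated machine learning; deep learning; meteorological forecasting; spatio-temporal data mining
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/48953
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Advanced Spatio-Temporal Data Analysis and Learning
Graduates_Doctoral theses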
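Recommended citation
GB/T 7714
张新邦. Research on Neural Architecture Search Methods and Their Application in Meteorological Forecasting [D]. Institute of Automation, Chinese Academy of Sciences, 2022.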
Files in this item
File name/size: 张新邦-网络结构搜索方法及其在气象预测中 (8827KB)
Document type: Thesis; Access type: Open access; License: CC BY-NC-SA