Theoretical Research on Self-Supervised Learning Based on High-Dimensional Neural Network Dynamics

	Theoretical Research on Self-Supervised Learning Based on High-Dimensional Neural Network Dynamics
	孟令寰
	2024-05-26
Pages	104
Degree type	Master's
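Abstract (translated from Chinese)	Self-supervised learning is an important approach to learning features from massive unlabeled data and has achieved broad success in computer vision, natural language processing, and other fields. However, it still lacks a unified theoretical framework, and theoretical explanations and analyses of generalization guarantees, fairness, and robustness remain insufficient. This thesis uses the methods of neural network dynamics to address these challenges. Starting from a characterization of the feature space, it attempts to explain self-supervised learning methods and confirms experimentally the effectiveness of key techniques, including hyperparameter tuning, projection networks, exponential moving averages, and gradient stopping. The main work is a high-dimensional analysis of the training dynamics of a single-layer nonlinear contrastive learning model. The thesis finds that the empirical distribution of the model weights converges to a deterministic measure governed by a McKean-Vlasov nonlinear partial differential equation (PDE). Under L2 regularization, this PDE reduces to a closed set of low-dimensional ordinary differential equations (ODEs) that reflect how model performance evolves during training. The thesis further analyzes the locations and stability of the fixed points of these ODEs and finds several notable phenomena: for example, the second moment of the hidden variables affects feature learnability, and higher-order moments affect the probability of feature selection by controlling the basins of attraction. Finally, the thesis proposes two experimental methods based on the theoretical analysis, a self-supervised method based on cross-correlation constraints and correlated Gaussian noise augmentation, and demonstrates their effectiveness experimentally. In summary, this research is significant for understanding and applying self-supervised learning, and offers new ideas and methods for resolving its open theoretical problems.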
English abstract	Self-supervised learning, an important method for learning features from massive unlabeled data, has achieved widespread success in computer vision and natural language processing. However, self-supervised learning currently lacks a unified theoretical framework, and theoretical explanations and analyses of generalization guarantees, fairness, and robustness are still insufficient. This thesis, based on the dynamics of neural network training, aims to address these challenges. We start by attempting to explain self-supervised learning methods from the perspective of characterizing the feature space, and validate through experiments the effectiveness of key techniques, including hyperparameter tuning, projection networks, exponential moving averages, and gradient stopping. Additionally, we conduct a high-dimensional analysis of the training dynamics of single-layer nonlinear contrastive learning models and find that the empirical distribution of model weights converges to a deterministic measure governed by a McKean-Vlasov nonlinear PDE. Under L2 regularization, this PDE simplifies to a closed set of low-dimensional ODEs that reflect the evolution of model performance during training. Furthermore, we analyze the locations and stability of the fixed points of the ODEs, observing that the second moment of the hidden variables affects feature learnability, and that higher-order moments influence the probability of feature selection by controlling the basins of attraction. Finally, we propose two experimental methods based on the theoretical analysis, a self-supervised method based on cross-correlation constraints and correlated Gaussian noise augmentation, and demonstrate their effectiveness through experiments. In summary, this research is significant for understanding and applying self-supervised learning, and provides new insights and methods for addressing its theoretical challenges.
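Keywords	self-supervised learning; neural network dynamics; high-dimensional analysis; nonlinear contrastive learning model
Language	Chinese
Seven major directions: sub-direction classification	Machine learning
State Key Laboratory planned direction classification	Fundamental and frontier theory of artificial intelligence
Associated dataset to be deposited	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/56734
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	孟令寰. Theoretical Research on Self-Supervised Learning Based on High-Dimensional Neural Network Dynamics[D], 2024.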

Files in this item
File name/size	Document type	Version type	Access type	License
学位论文_孟令寰.pdf (7061KB)	Thesis		Restricted access	CC BY-NC-SA