Theoretical Study on Parameter Identifiability of Knowledge and Data-driven Machine Learning Models

	知识与数据驱动机器学习模型的参数可辨识性理论研究
Alternative title	Theoretical Study on Parameter Identifiability of Knowledge and Data-driven Machine Learning Models
	冉智勇
	2014-05-27
Degree type	Doctor of Engineering
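Chinese abstract	The study of parameter identifiability is an important means of increasing model transparency and comprehensibility, and it is a necessary prerequisite for parameter estimation. When model parameters have a clear physical meaning, parameter identifiability is moreover an essential requirement of system modeling. Parameter identifiability plays an important role in statistical learning theory, model structure learning, model selection, parameter estimation, learning algorithms, and the dynamic analysis of learning processes. Taking machine learning, system identification, and neural computing as its application background, this thesis systematically studies the identifiability of parametric models. According to the nature of the modeling, the thesis divides parametric models into two frameworks: (1) The time-invariant framework, within which the thesis derives identifiability theorems for multiple-input multiple-output (MIMO) nonlinear mappings and for parametric statistical models. (2) The time-variant framework, within which the thesis derives identifiability theorems for dynamic models and stochastic process models. The main contributions are summarized as follows:
(a) For nonlinear mapping models in the time-invariant framework, we regard the model as a static, noise-free, deterministic mapping from input space to output space and derive a parameter identifiability theorem for the MIMO case. The theorem contains earlier identifiability results for single-input single-output (SISO) and multiple-input single-output (MISO) models as special cases, and thus theoretically generalizes the identifiability criteria for those models. The thesis also gives a dual interpretation of this result that is algebraically sound and geometrically intuitive. Compared with earlier methods, the proposed method not only decides whether the model parameters are identifiable but also yields the observationally equivalent parameter vectors explicitly.
(b) For parametric statistical models in the time-invariant framework, we regard the parameterized family of statistical distributions as a statistical manifold with geometric structure and use the Kullback-Leibler divergence from information theory to convert the identifiability problems of unconstrained and parameter-constrained models into an unconstrained and a constrained optimization problem, respectively. This is the first systematic study of identifiability from the perspective of optimization theory, and it yields the corresponding identifiability criteria. These results offer a novel perspective on parameter identifiability and establish connections among identifiability theory, information theory, and optimization theory, and are therefore of considerable theoretical significance. Compared with the earlier approach of solving systems of equations, the main advantage of the proposed method is that identifiability can be decided by computing the rank of a single numerical matrix, without finding the roots of a system of nonlinear equations, thereby reducing the computational complexity from NP-complete to cubic in the parameter dimension.
(c) For parametric models in the time-variant framework, the thesis adopts the identifying function approach and, based on the rank theorem from Riemannian geometry, provides a unified treatment of a large class of time-variant models, including dynamic ordinary differential equation models and stochastic process models. The identifying function approach is widely applicable: it handles not only time-variant models but also time-invariant ones, and thereby reveals, at a theoretical level, what the identifiability problems of different parametric models have in common. The approach has two advantages. First, computing the symbolic rank of the derivative matrix of the identifying function shows whether the model is parameter redundant and gives the intrinsic parameter dimension of the model. Second, the same derivative matrix yields a method for finding identifiable, independent functions of the parameters. From these results we further obtain the relationship among non-identifiability, parameter redundancy, and parameter dependence.
In the theoretical analysis, the thesis uses many examples from the literature to explain the meaning of the relevant definitions and theorems. On the applied side, it studies the parameter identifiability of many real models (such as generalized constraint neural networks, parameter learning machines, dynamic differential equations, HIV models, biological dynamics models, moving average models, and partially linear support vector machine models) to illustrate the correctness and practicality of the proposed methods.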
English abstract	The study of parameter identifiability is an important means of enhancing model transparency and comprehensibility, and it is a prerequisite for parameter estimation. Identifiability is an essential requirement of system modeling when the parameters to be estimated have a physically interpretable meaning. The importance and utility of identifiability analysis are recognized in statistical learning theory, model structure learning, model selection, parameter estimation, learning algorithms, and the dynamic analysis of learning processes. This thesis presents a systematic study of identifiability for parametric models against the background of machine learning, system identification, and neural computing. According to the nature of the model, we categorize parametric models into two frameworks: (1) The time-invariant framework. Within this framework, identifiability theorems for nonlinear Multiple-input Multiple-output (MIMO) mappings and parametric statistical models are derived. (2) The time-variant framework. Within this framework, identifiability theorems for dynamic models and stochastic process models are derived. The main contributions of this thesis are as follows: (a) For nonlinear mappings within the time-invariant framework, we view the models as static, noise-free, deterministic mappings from input space to output space and derive an identifiability theorem for MIMO models. The resulting theorem includes the previous identifiability criteria for Single-input Single-output (SISO) and Multiple-input Single-output (MISO) models as special cases, thus theoretically generalizing those criteria. Further, this thesis presents a dual interpretation of the result that is algebraically sound and geometrically intuitive. Compared with previous results, the advantage of the proposed method is that it not only checks model identifiability but also explicitly gives the observationally equivalent parameter vectors.
(b) For parametric statistical models within the time-invariant framework, we view the parameterized family of statistical distributions as a statistical manifold with geometric structure, and we use the Kullback-Leibler divergence from information theory to transform the identifiability problems of unconstrained and parameter-constrained models into unconstrained and constrained optimization problems, respectively. This is the first work that systematically studies the identifiability problem from the optimization theory perspec...
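The rank-based test described in contribution (b) can be illustrated with a small numerical sketch. This is not the thesis's own criterion, which is not reproduced in this record. It uses a classical related fact instead: the Hessian of the Kullback-Leibler divergence at the true parameter is the Fisher information matrix, and a full-rank Fisher information matrix at a regular point indicates local identifiability. The two toy Gaussian regression models, the function names, the sample inputs, and the rank tolerance are all assumptions made for illustration.

```python
import numpy as np

def fisher_information(mean_jacobian, theta, xs, sigma=1.0):
    """Fisher information of y ~ N(f(x; theta), sigma^2) over inputs xs.

    For a Gaussian model with fixed variance this equals J^T J / sigma^2,
    where row i of J is the gradient of f(x_i; theta) with respect to theta.
    """
    J = np.array([mean_jacobian(x, theta) for x in xs], dtype=float)
    return J.T @ J / sigma**2

# Hypothetical model 1: f(x) = a * b * x. Only the product a*b affects the output.
def product_jacobian(x, theta):
    a, b = theta
    return [b * x, a * x]

# Hypothetical model 2: f(x) = a * x + b. Slope and intercept act separately.
def affine_jacobian(x, theta):
    return [x, 1.0]

xs = np.linspace(-1.0, 1.0, 5)   # assumed input design
theta0 = np.array([2.0, 3.0])    # assumed parameter value (a, b)

# An explicit tolerance keeps the rank decision robust to round-off.
rank_product = np.linalg.matrix_rank(
    fisher_information(product_jacobian, theta0, xs), tol=1e-8)
rank_affine = np.linalg.matrix_rank(
    fisher_information(affine_jacobian, theta0, xs), tol=1e-8)

print("product model rank:", rank_product, "of", theta0.size)
print("affine model rank:", rank_affine, "of", theta0.size)
```

In this sketch the product model gives a rank-deficient matrix, reflecting that (a, b) and (2a, b/2) produce the same outputs, while the affine model gives full rank. The rank computation relies on a singular value decomposition, whose cost grows cubically with the matrix dimension.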
Keywords	Identifiability; Parameter Redundancy; Information Theory; Kullback-Leibler Divergence; Optimization Theory; Identifying Function
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6616
Collection	Graduates_Doctoral Theses
Recommended citation (GB/T 7714)	冉智勇. 知识与数据驱动机器学习模型的参数可辨识性理论研究[D]. 中国科学院自动化研究所. 中国科学院大学,2014.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20101801462805 (2492KB)			Not yet open	CC BY-NC-SA