Discriminative Learning Based on Noninformative Priors

	Discriminative Learning Based on Noninformative Priors
Alternative title	Noninformative Knowledge-Based Discriminative Learning
	杨双红
	2008-05-31
Degree type	Master of Engineering
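Chinese abstract	This thesis presents a preliminary study of discriminative learning based on noninformative priors. Compared with related research, our work has three distinguishing features: (1) unlike purely data-driven machine learning that aims at general-purpose solvers, this thesis makes full use of prior knowledge to improve algorithm performance; (2) unlike application-oriented, special-purpose machine learning research that incorporates domain knowledge, this thesis considers only noninformative priors that are independent of any particular application background, so all of the algorithms are sufficiently general; (3) unlike traditional generative methods based on noninformative prior probabilities, the prior knowledge in this thesis is expressed in non-probabilistic form, and all of the algorithms are discriminative. The thesis covers four important problems in machine learning research: classification, regression, model selection, and feature selection. The specific contributions are: 1. A new classification method, S-learning, is proposed. Compared with other classification methods, it has favorable properties both theoretically and empirically, for example good generalization performance, robustness to outliers, low computational cost, good analytical properties, and solutions that are naturally sparse in the kernel space. 2. A novel regression method, reformulated parametric learning, is proposed. The existence of equivalent decomposition models is proved theoretically, a feasible operator for deriving equivalent decomposition models is given, and a fully linear equivalent decomposition is derived for one particular nonlinear model. 3. A model selection method based on structural identifiability is proposed, emphasizing the importance of qualitative experiment design in machine learning research. Two decision theorems for model structural identifiability are given; both are based on symbolic computation and easy to program, and they provide a reference criterion for model design and selection. 4. A discriminative optimal framework for feature selection in classification problems is proposed. Within this framework, a feature selection method based on minimizing the nonparametric Bayes error is proposed, and, taking feature weighting as the search strategy as an example, a well-performing feature weighting algorithm is given. Extensive comparative experiments compare the proposed algorithms with other related algorithms, and the experimental results confirm the effectiveness of the proposed algorithms. Keywords: Machine Learning, Supervised Learning, Classification, Regression, Model Selection, Qualitative Experiment Design, Dimensionality Reduction, Feature Selection, Loss Function, Kernel Methods, S-Learning, Reformulated Parametric Learning, Equivalent Decomposition Model, Structural Identifiability, Parameter Redundancy, Parameter Dependence, Nonparametric Bayes Error Minimization, Feature Weighting, Relief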
English abstract	This thesis conducts a preliminary study of noninformative knowledge-based discriminative learning. Compared with related work, our study is distinguished in the following respects: (1) Unlike purely data-driven learning approaches that target general-purpose solutions, we integrate knowledge into the learning model to improve performance; (2) Unlike ad hoc approaches that use domain knowledge and are thus problem-dependent, our approaches take advantage of noninformative knowledge only, and hence the solutions obtained are domain-independent; (3) In contrast to probabilistic generative learning based on noninformative priors, all the approaches we propose are discriminative. The main content covers four important research topics, i.e., classification, regression, model selection and feature selection, and in particular includes the following: 1. A novel approach to classification, namely the S-learning approach, which has both empirical and theoretical advantages over its counterparts, such as generalization accuracy, computational efficiency, robustness to outliers, theoretical amenability, and natural sparseness in the kernel space; 2. A new regression scheme, reformulated parametric learning, which learns parameters equivalently based on a simpler model in the equivalence set of the original model; 3. Structural identifiability-based model selection: a functional framework for examining parameter dependence and two criteria for detecting parameter redundancy; 4. A theoretically optimal criterion for feature selection that directly reflects the Bayes error, and an algorithmic framework that selects a subset of features by minimizing the nonparametric Bayes error on the training data set. We evaluate the proposed algorithms on various benchmark data sets and problems in comparison with other state-of-the-art methods. The experimental results confirm the effectiveness of the proposed approaches.
Keywords: Machine Learning, Supervised Learning, Classification, Regression, Model Selection, Qualitative Experiment Design, Dimensionality Reduction, Feature Selection, Loss Function, Kernel Methods, S-Learning, Reformulated Parametric Learning, Equivalent Decomposition Model, Structural Identifiability, Parameter Redundancy, Parameter Dependence, Nonparametric Bayes Error Minimization, Feature Weighting, Relief
Keywords	Machine Learning; Artificial Intelligence; Classification; Regression; Model Selection; Feature Selection; Noninformative Prior Knowledge; Discriminative Learning
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7448
Collection	Graduates_Master's Theses
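Recommended citation (GB/T 7714)	杨双红. Discriminative Learning Based on Noninformative Priors (基于无信息先验的判别式学习) [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2008.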

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20052801462804 (2145KB)			Not currently open	CC BY-NC-SA