Automatic Classification and Automatic Measurement Techniques for Celestial Spectra

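	Automatic Classification and Automatic Measurement Techniques for Celestial Spectra
Alternative title	Automated spectra classification and measurements of celestial objects
	许馨
	2005-06-01
Degree type	Doctor of Engineering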
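Abstract (translated from Chinese)	The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is one of the major national scientific engineering projects of the Ninth Five-Year Plan. After its expected completion at the end of 2005, it will collect 10,000 to 20,000 spectra per observing night, and the total number of spectra is expected to reach 10^7. The LAMOST project therefore urgently needs a system for the automatic processing and analysis of celestial spectra. Against this background, this thesis explores methods for the automatic classification of celestial spectra and the automatic measurement of redshift, in order to meet the needs of the LAMOST project. The main work comprises the following four parts: (1) Spectral classification based on local PCA (LPCA) and kernel PCA (KPCA). For the task of classifying the spectra of stars, galaxies and quasars, two classification algorithms based on LPCA and KPCA are proposed. For LPCA, experiments show that the features it extracts carry better classification information about stars and quasars than features extracted by the original PCA, which raises the correct classification rate for stars and quasars. The method has a low computational cost and is suitable for large-scale spectral data processing. For KPCA, experiments show that the extracted features are most discriminative when the Gaussian kernel width is 2, and that the average correct classification rate of KPCA is slightly higher than that of PCA. Experiments also show that both methods reach their highest correct classification rate when 20 principal components are used. These conclusions are a useful reference for the future processing and analysis of real LAMOST observations. (2) Spectral classification based on the kernel trick. (a) A spectral classification algorithm based on kernel Generalized Discriminant Analysis (GDA) is given. Experiments compare LDA, GDA, PCA and KPCA on the classification of star, galaxy and quasar spectra. The GDA-based algorithm achieves the highest correct classification rate for these three types of spectra, followed by LDA. Although KPCA is also a kernel-based method, it performs poorly when the number of principal components is small, in some cases worse than LDA. PCA-based classification performs worst. (b) A covering algorithm based on the kernel trick, the kernel covering algorithm, is proposed. It combines the kernel trick with the covering algorithm and extracts support vectors in feature space. Experiments show that the kernel covering algorithm and the covering algorithm achieve roughly the same correct classification rate, but the kernel covering algorithm yields far fewer support vectors. (3) A fast method for automatically measuring the redshift of normal galaxies (NG) is proposed. First, two classes of galaxy samples are simulated from an NG template according to whether the redshift lies in interval I (0.0 to 0.3) or interval II (0.3 to 0.5), and a PCA transform of these samples gives their feature vectors. Next, a probabilistic neural network is used to design a Bayes classifier for the feature vectors of the two classes. Finally, for an observed NG spectrum, the trained Bayes classifier determines the approximate redshift range, and template matching within that range gives the measured redshift. Compared with conventional template matching, this method saves 50% of the template-matching computation and also greatly improves the accuracy of the measured redshift. The method is of considerable value for the automatic processing of the massive data volumes produced by large spectroscopic surveys.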
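The two-stage redshift measurement described in part (3) of this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the toy galaxy template, the wavelength grid, the noise level, the Parzen-window width `sigma`, the redshift grid step, and all function names are hypothetical choices made for the example. The probabilistic neural network is written as a Gaussian Parzen-window classifier with equal class priors, and template matching is a plain least-squares search.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_WAVE = np.linspace(4000.0, 8000.0, 800)   # assumed observed wavelength grid (Angstrom)
RANGES = [(0.0, 0.3), (0.3, 0.5)]             # redshift interval I and interval II

def template(rest_wave):
    """Hypothetical normal-galaxy template: a 4000 A break plus two absorption lines."""
    flux = 0.5 + 0.5 / (1.0 + np.exp(-(rest_wave - 4000.0) / 50.0))
    for centre, depth in [(4304.0, 0.30), (5175.0, 0.35)]:
        flux -= depth * np.exp(-0.5 * ((rest_wave - centre) / 15.0) ** 2)
    return flux

def simulate(z, noise=0.02):
    """Redshift the template to z on the observed grid and add Gaussian noise."""
    return template(OBS_WAVE / (1.0 + z)) + noise * rng.standard_normal(OBS_WAVE.size)

def fit(n_per_class=150, n_components=20):
    """Simulate both redshift classes and compute PCA features of the samples."""
    spectra, labels = [], []
    for label, (lo, hi) in enumerate(RANGES):
        for z in np.linspace(lo, hi, n_per_class):
            spectra.append(simulate(z))
            labels.append(label)
    X, y = np.array(spectra), np.array(labels)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:n_components]                  # leading principal axes
    return mean, basis, (X - mean) @ basis.T, y

def pnn_log_score(feature, class_features, sigma):
    """Log of the Parzen-window density estimate for one class (log-sum-exp form)."""
    a = -np.sum((class_features - feature) ** 2, axis=1) / (2.0 * sigma ** 2)
    return a.max() + np.log(np.mean(np.exp(a - a.max())))

def measure_redshift(spectrum, model, sigma=0.5, step=0.001):
    """Stage 1: pick the redshift interval. Stage 2: template matching inside it."""
    mean, basis, feats, y = model
    f = (spectrum - mean) @ basis.T
    scores = [pnn_log_score(f, feats[y == c], sigma) for c in range(len(RANGES))]
    label = int(np.argmax(scores))             # equal priors, so the largest density wins
    lo, hi = RANGES[label]
    grid = np.arange(lo, hi + step / 2, step)
    chi2 = [np.sum((spectrum - template(OBS_WAVE / (1.0 + z))) ** 2) for z in grid]
    return label, float(grid[int(np.argmin(chi2))])

model = fit()
label, z_hat = measure_redshift(simulate(0.12), model)
print(label, z_hat)
```

In this sketch the second stage searches only the grid points of the selected interval rather than the full 0.0 to 0.5 range, which is where the reduction in template-matching work comes from.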
Abstract (English)	The LAMOST (Large Sky Area Multi-Object Fiber Spectroscopic Telescope) project is one of a few key scientific projects of the national Ninth Five-Year Plan. After its expected completion at the end of 2005, about 10,000~20,000 spectra will be collected per observation night. Because of this data volume, a dataset of up to 10^7 spectra, the LAMOST project urgently needs a fully automatic spectral processing and analysis system. To this end, this work focuses on finding and designing suitable techniques for spectral classification and redshift measurement. The main contributions are as follows: (1) Spectra classification techniques based on local PCA (LPCA) and kernel PCA (KPCA). Two classification algorithms based on LPCA and KPCA are proposed for the classification of stars, galaxies and quasars. Experiments show that LPCA extracts more useful classification information about stars and quasars than the original PCA does, and as a result the corresponding correct classification rate is higher. It is particularly suitable for large-scale spectra processing thanks to its high computational efficiency. Experiments show that KPCA reaches its best performance when the width of the Gaussian kernel equals 2. KPCA performs slightly better than PCA for classification, and both reach their best result when the number of principal components is fixed at 20. These experimental conclusions are a useful reference for processing and analyzing future LAMOST observational data. (2) Spectra classification based on the kernel trick. (a) A kernel-based Generalized Discriminant Analysis (GDA) technique is proposed for the classification of stars, galaxies, and quasars. LDA, GDA, PCA and KPCA are experimentally compared on these 3 kinds of spectra. Among these 4 techniques, GDA obtains the best result, followed by LDA, and PCA is the worst. Although KPCA is also a kernel-based technique, its performance is not satisfactory if the selected number of principal components is small, and in some cases it is worse than LDA, a non-kernel-based technique. (b) A kernel-based covering algorithm, called the kernel covering algorithm, is proposed. This algorithm combines the kernel trick with the covering algorithm and is used to extract the support vectors in feature space. The experiments show that although the kernel covering algorithm has a classification performance comparable to that of the covering algorithm, the number of its resulting support vectors is significantly smaller than that of the covering algorithm.
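As a rough illustration of the KPCA feature extraction mentioned in part (1), the sketch below computes kernel PCA with a Gaussian kernel of width 2 and 20 principal components, the settings the abstract reports as best. Everything else is an assumption made for the example: the three-class toy "spectra", the function names, the kernel normalisation exp(-d^2 / (2 * width^2)), and the nearest-centroid rule used as the classifier, since the abstract does not say which classifier was applied to the extracted features.

```python
import numpy as np

def gaussian_kernel(a, b, width):
    """Kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * width^2))."""
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * width ** 2))

def kpca_fit(X, n_components=20, width=2.0):
    """Eigen-decompose the centred kernel matrix of the training spectra."""
    K = gaussian_kernel(X, X, width)
    col_mean, all_mean = K.mean(axis=0), K.mean()
    Kc = K - col_mean[None, :] - col_mean[:, None] + all_mean   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)                              # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, top] / np.sqrt(vals[top])                   # unit-norm axes in feature space
    return {"X": X, "width": width, "col_mean": col_mean,
            "all_mean": all_mean, "alphas": alphas}

def kpca_transform(model, Z):
    """Project new spectra onto the kernel principal axes."""
    K = gaussian_kernel(Z, model["X"], model["width"])
    Kc = K - K.mean(axis=1, keepdims=True) - model["col_mean"][None, :] + model["all_mean"]
    return Kc @ model["alphas"]

def toy_spectra(rng, n_per_class, n_pix=50, noise=0.2):
    """Hypothetical stand-ins for three spectral classes: one bump per class."""
    pix = np.arange(n_pix)
    X, y = [], []
    for label, centre in enumerate([10, 25, 40]):
        shape = 2.0 * np.exp(-0.5 * ((pix - centre) / 3.0) ** 2)
        X.append(shape + noise * rng.standard_normal((n_per_class, n_pix)))
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)

rng = np.random.default_rng(0)
X_train, y_train = toy_spectra(rng, 30)
X_test, y_test = toy_spectra(rng, 20)

model = kpca_fit(X_train)
F_train = kpca_transform(model, X_train)
F_test = kpca_transform(model, X_test)

# Assumed classifier: assign each test spectrum to the nearest class centroid.
centroids = np.array([F_train[y_train == c].mean(axis=0) for c in range(3)])
pred = np.argmin(((F_test[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
accuracy = float((pred == y_test).mean())
print(accuracy)
```

The kernel width only makes sense relative to the scale of the input vectors, so a value of 2 on real spectra presupposes whatever normalisation the thesis applied; the toy data here are scaled so that the same width separates the classes.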
Keywords	celestial object spectra; automated classification; redshift measurement; local PCA; kernel methods; probabilistic neural network; manifold learning
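Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/5871
Collection	Graduates_Doctoral theses
Recommended citation (GB/T 7714)	许馨. Automatic Classification and Automatic Measurement Techniques for Celestial Spectra [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2005.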