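Research on Stock Trading Data Analysis Based on Time Series Mining (基于时间序列挖掘的股票交易数据分析研究)
Mao Ying (毛颖)
2016-05
Degree type: Master of Engineering
Abstract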

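In recent years, with the rapid development of the market economy, a growing number of investors have entered the stock market, and China's stock market has matured. Stock price movements reflect the state of the national economy and of the industries that issue stock, and sharp rises and falls in the market directly affect economic and social stability. People therefore hope to mine large volumes of historical trading data for market regularities and use them to form a basic judgment about the market's future direction.

Many factors influence market trends and stock prices. How to mine market regularities with effective analytical methods is the central research question, and it has both theoretical significance and practical value. Researchers in China and abroad have predicted stock prices using technical analysis, machine learning, and time series analysis, and have reported good results.

After surveying existing methods and analyzing their strengths and weaknesses, this thesis sets out to classify stock rises and falls both across all stocks and for single stocks. For data covering all stocks, support vector classification is used for short-term up/down prediction; for the price data of a single stock, the time series ARIMA model is used for long-term price prediction. This narrows a macro-level analysis of rises and falls across the whole market down to a micro-level analysis of a single stock's price movement, which gives the work more practical value.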

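The main work and contributions of this thesis are as follows:

1. A study of techniques related to stock data analysis

The thesis studies techniques related to stock data analysis, chiefly common stock-market terminology, the support vector machine classification model, parameter optimization methods, and common time series analysis models.

2. A short-term method for classifying stock rises and falls

The thesis proposes a short-term up/down classification method that uses the trading data of all stocks from the first 3.5 hours after the market opens each day to predict whether each stock has risen or fallen at the close. The method is validated experimentally with a support vector machine classifier, with the model's optimal parameters determined by grid search, a genetic algorithm, and particle swarm optimization in turn. To make parameter optimization more efficient, the grid search procedure was improved. While preserving the model's generalization ability, up/down classification accuracy rose from 57.25% to above 70%.

3. A long-term method for stock price prediction and classification

The thesis proposes a long-term price prediction and classification method that samples a single stock's price data over one month at hourly intervals to form a time series, which is then used to predict the next day's prices. The method is validated experimentally with the ARIMA time series forecasting model. The procedure for judging model validity was improved, and two criteria for evaluating prediction results were defined for the experiments in that chapter. The experiments show that the predicted prices follow a trend similar to the actual data and that the relative error between the predicted and actual values at each time point is very small.

4. A comparison of short-term up/down prediction and long-term price prediction

Short-term prediction with the support vector machine classifier raises up/down classification accuracy to 70%, while long-term price prediction with the ARIMA model yields an up/down classification accuracy of 65%. The two accuracies are close. However, the time series experiments covered only 100 different stocks, and the model order had to be chosen manually during prediction, so their classification accuracy is slightly lower than that of the support vector machine classifier.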

English abstract

In recent years, with the rapid development of the market economy, a growing number of investors have entered the stock market, and the market itself has matured. Stock price changes reflect the development of the national economy and the state of the industries that issue stock. Sharp rises and falls directly affect economic and social stability, so people want to mine large volumes of historical stock transactions to discover the underlying regularities and predict the future trend of the stock market.

Many factors affect the trend of the stock market and the movement of stock prices. The key problem is how to use effective analysis methods to uncover market regularities. Researchers in China and abroad have predicted stock prices using technical analysis, machine learning, and time series analysis, and have achieved promising results.

After surveying existing methods and their strengths and weaknesses, this study aims to classify stock rises and falls both across all stocks and for single stocks: short-term up/down prediction on all stocks based on a support vector machine, and long-term price prediction on a single stock based on the time series ARIMA model. In this way, the macroscopic analysis of rises and falls across all stocks is refined into a microscopic analysis of a single stock's price movement, which gives the study more practical significance.

To summarize, this study covers the following aspects:

1. We reviewed related work in stock data analysis

We reviewed related techniques in the field of stock data analysis, including common stock-market terminology, the support vector machine, parameter optimization methods, and time series analysis models.

2. We proposed a short-term stock up/down classification approach

We proposed a short-term stock up/down classification approach that takes the transaction data of all stocks from the first 3.5 hours of each trading day as input and predicts whether each stock is up or down at the close. The method is based on a support vector machine, with the optimal parameters determined by grid search, a genetic algorithm, and particle swarm optimization. To make parameter optimization more efficient, we improved the grid search procedure. While preserving the model's generalization ability, classification accuracy increased from 57.25% to above 70%.
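The abstract does not say how the grid search was improved. One common way to reduce its cost, shown here purely as an illustrative assumption and not as the thesis's method, is a coarse-to-fine search: scan a wide, sparse grid of (C, gamma) values in log2 space, then rescan a dense grid around the best coarse point. In the sketch below, `toy_score` is a hypothetical stand-in for the cross-validated accuracy of an SVM trained with the given parameters; all function names and grid ranges are assumptions.

```python
import math
from itertools import product
from typing import Callable, List, Tuple

Score = Callable[[float, float], float]


def frange(start: float, stop: float, step: float) -> List[float]:
    """Evenly spaced values from start to stop, inclusive."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def grid_search(score: Score, log2_c: List[float],
                log2_g: List[float]) -> Tuple[float, float, float]:
    """Exhaustively score every (C, gamma) pair; return (best score, log2 C, log2 gamma)."""
    best = None
    for lc, lg in product(log2_c, log2_g):
        s = score(2.0 ** lc, 2.0 ** lg)
        if best is None or s > best[0]:
            best = (s, lc, lg)
    return best


def coarse_to_fine_search(score: Score, lo: float = -8.0, hi: float = 8.0,
                          coarse_step: float = 2.0,
                          fine_step: float = 0.25) -> Tuple[float, float, float]:
    """Coarse scan over [lo, hi], then a dense scan around the coarse optimum."""
    coarse = frange(lo, hi, coarse_step)
    _, lc, lg = grid_search(score, coarse, coarse)
    fine_c = frange(lc - coarse_step, lc + coarse_step, fine_step)
    fine_g = frange(lg - coarse_step, lg + coarse_step, fine_step)
    s, lc, lg = grid_search(score, fine_c, fine_g)
    return 2.0 ** lc, 2.0 ** lg, s


def toy_score(c: float, gamma: float) -> float:
    """Hypothetical accuracy surface peaking at log2 C = 3.25, log2 gamma = -4.5."""
    return 0.7 - 0.01 * ((math.log2(c) - 3.25) ** 2 + (math.log2(gamma) + 4.5) ** 2)


best_c, best_gamma, best_score = coarse_to_fine_search(toy_score)
```

With these default ranges the two-stage search evaluates 81 coarse points plus 289 fine points, whereas a single dense grid at the fine step over the full range would need 65 x 65 = 4225 evaluations. The trade-off is that the fine stage only explores the neighborhood of the coarse optimum, so a narrow peak between coarse grid points can be missed.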

3. We further proposed a long-term stock price prediction and classification method

We also proposed a long-term stock price prediction and classification method that samples one month of price data for a single stock at hourly intervals to form a time series, which is then used to predict the stock's prices for the next day. The prediction and classification models are based on the ARIMA time series model. We improved the procedure for judging model validity and defined two evaluation criteria for the prediction results. The experimental results show that the predicted prices follow a trend similar to the actual data and that the relative error between predicted and actual values at each time point is very small.
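To make the forecasting step concrete, here is a minimal sketch of one of the simplest ARIMA variants, ARIMA(1,1,0) without a drift term, fitted by conditional least squares on the first differences of an hourly price series. The fixed order and the fitting procedure are assumptions chosen for brevity; the abstract does not state which orders were selected or which estimation routine was used.

```python
from typing import List, Sequence


def fit_ar1_on_differences(prices: Sequence[float]) -> float:
    """Estimate phi in d_t = phi * d_(t-1) + e_t, where d_t = y_t - y_(t-1)."""
    if len(prices) < 3:
        raise ValueError("need at least three prices to fit ARIMA(1,1,0)")
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    # Least-squares slope through the origin of d_t regressed on d_(t-1).
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den != 0.0 else 0.0


def forecast(prices: Sequence[float], steps: int) -> List[float]:
    """Forecast future prices by iterating the fitted AR(1) on the differences."""
    phi = fit_ar1_on_differences(prices)
    last_price = prices[-1]
    last_diff = prices[-1] - prices[-2]
    out = []
    for _ in range(steps):
        last_diff = phi * last_diff   # predicted next difference
        last_price += last_diff       # integrate back to the price level
        out.append(last_price)
    return out


# Hypothetical hourly samples for one stock (illustrative numbers only).
hourly_prices = [10.0, 11.0, 11.5, 11.75, 11.875]
next_two_hours = forecast(hourly_prices, 2)
```

In this toy series each hourly change is half the previous one, so the fitted coefficient is 0.5 and the forecast continues the same decay. Real price series would normally call for order selection, for example by inspecting autocorrelation plots or comparing information criteria, and for residual diagnostics before the model is trusted.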

4. We compared these two methods for stock price classification and prediction

Finally, we compared the two methods. The short-term support vector machine model achieves a classification accuracy of 70%, while the long-term time series model achieves 65%. The two accuracies are close. However, the time series experiments covered only 100 stocks and required manual order selection during prediction, so the classification accuracy of the time series model is slightly lower than that of the support vector machine model.
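For the two accuracies to be comparable, price forecasts have to be turned into up/down labels. The sketch below shows one plausible way to do that, together with the per-point relative error mentioned for the price predictions. The labeling rule (up means the price is strictly above a reference price, with flat counted as down) and the function names are assumptions; the abstract does not define the thesis's two evaluation criteria.

```python
from typing import List, Sequence


def relative_errors(actual: Sequence[float],
                    predicted: Sequence[float]) -> List[float]:
    """Per-point relative error |predicted - actual| / |actual|."""
    return [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]


def direction(reference_price: float, later_price: float) -> int:
    """1 if the later price is above the reference, else 0 (flat counts as down)."""
    return 1 if later_price > reference_price else 0


def direction_accuracy(reference: Sequence[float],
                       actual: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """Share of stocks whose predicted direction matches the actual direction."""
    hits = sum(direction(r, a) == direction(r, p)
               for r, a, p in zip(reference, actual, predicted))
    return hits / len(reference)


# Hypothetical values for four stocks: last known price, actual close, forecast close.
reference = [10.0, 20.0, 30.0, 40.0]
actual = [10.5, 19.0, 31.0, 39.0]
predicted = [10.2, 19.5, 29.5, 38.0]

accuracy = direction_accuracy(reference, actual, predicted)
errors = relative_errors(actual, predicted)
```

Here three of the four predicted directions match the actual ones. A forecast can have a small relative error and still land on the wrong side of the reference price, as the third stock shows, which is one reason price accuracy and direction accuracy are worth reporting separately.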

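Keywords: up/down classification; price prediction; classification criteria
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/11786
Collection: Graduates / Master's theses
Author affiliation: Institute of Automation, Chinese Academy of Sciences
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Mao Ying (毛颖). Research on Stock Trading Data Analysis Based on Time Series Mining (基于时间序列挖掘的股票交易数据分析研究) [D]. Beijing: University of Chinese Academy of Sciences, 2016.
Files in this item
File name/size: 基于时间序列挖掘的股票交易数据分析研究 (1858KB); document type: thesis; access: restricted; license: CC BY-NC-SA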