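Research on Quantitative Stock Selection Strategies Based on Data Mining
Published: 2018-05-07 08:47
Topics: quantitative investment + stock selection strategy; Source: Tianjin University of Commerce, master's thesis, 2017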
[Abstract]: In recent years, as the stock market has developed, quantitative investment techniques have attracted growing attention from investors, and China's quantitative investment system has gradually matured. As stock market rules have improved, the number of listed stocks and the volume of related data have kept increasing. These data are large and complex, yet they contain a great deal of useful information, and conventional methods are clearly no longer adequate for extracting it. Data mining techniques developed in recent years can help extract the needed information from these massive stock data sets through analysis and modeling. This paper discusses a quantitative stock selection model based on data mining. First, we screen the more than 3,000 A-share stocks listed on the Shanghai and Shenzhen markets over 2013-2015 against two conditions: (1) return on equity is stable and no lower than 10% for three consecutive years, with ST and similar stocks excluded; (2) the growth rate of main business revenue is broadly consistent with the growth rate of net profit, and both are above 10%. After screening, 51 stocks with relatively strong fundamentals remain. Second, we select 17 important financial indicators that reflect a company's profitability, solvency, growth, and other capabilities as the basis for the analysis. Because these factors overlap and are correlated, and because a model with too many explanatory variables tends to blur the distinction between primary and secondary factors, we apply principal component analysis to the indicators. Principal component analysis yields five mutually uncorrelated composite indicators that retain most of the information in the original data, which achieves the goal of dimensionality reduction.
Among the many data mining algorithms, cluster analysis is particularly easy to understand and has been shown to be effective for stock selection, so this paper uses K-means clustering to study the stock selection strategy. We compare different choices of K and use R to select the optimal value, which turns the stock selection problem into a cluster selection problem. For our data, clustering works best when K = 5, and on that basis we select 7 stocks as the final result. The historical candlestick (K-line) charts of the selected stocks, retrieved from the Wind platform, show that nearly all of them outperform the broader market overall and appear to be trending upward. These results suggest that the work in this paper can serve as a reference for investors analyzing and selecting stocks.
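The pipeline described in the abstract (threshold screening, principal component analysis, then K-means with a comparison across K) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the thesis's own code, which was written in R. Several parts are assumptions: the record fields (`code`, `is_st`, `roe`, `revenue_growth`, `profit_growth`) are hypothetical names, the indicator matrix is random placeholder data rather than real financial data, the screen checks only the 10% thresholds because the abstract does not quantify "stable" or "broadly consistent", and the silhouette score is used as the criterion for comparing K because the abstract does not say which criterion the author used.

```python
import numpy as np

def screen(stocks, min_roe=10.0, min_growth=10.0):
    """Keep non-ST stocks whose yearly ROE and growth rates all reach the thresholds."""
    kept = []
    for s in stocks:
        if s["is_st"] or min(s["roe"]) < min_roe:
            continue
        if min(s["revenue_growth"]) < min_growth or min(s["profit_growth"]) < min_growth:
            continue
        kept.append(s["code"])
    return kept

def pca(X, n_components):
    """PCA on standardized indicators; returns component scores and variance share kept."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest ones
    return Z @ eigvecs[:, order], float(eigvals[order].sum() / eigvals.sum())

def assign(X, centers):
    """Index of the nearest center for every row of X."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # An empty cluster keeps its previous center.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

def silhouette(X, labels):
    """Mean silhouette coefficient; higher means tighter, better separated clusters."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    values = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:          # singleton cluster scores 0 by convention
            values.append(0.0)
            continue
        a = dist[i, same].sum() / (same.sum() - 1)
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        values.append((b - a) / max(a, b))
    return float(np.mean(values))

# Placeholder data: 51 screened stocks x 17 financial indicators (random, not real).
indicators = np.random.default_rng(42).normal(size=(51, 17))
scores, explained = pca(indicators, n_components=5)
sil = {k: silhouette(scores, kmeans(scores, k)[0]) for k in range(2, 9)}
best_k = max(sil, key=sil.get)
print(f"variance retained: {explained:.2%}, best K by silhouette: {best_k}")
```

On real data the indicator matrix would come from the screened stocks' financial statements, and the chosen K need not match the thesis's K = 5. The sketch stops at clustering: the abstract does not describe how the final 7 stocks were picked from the clusters, so that step is left out. Library implementations such as `sklearn.decomposition.PCA` and `sklearn.cluster.KMeans` would be a reasonable substitute for the hand-written functions.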
[Degree-granting institution]: Tianjin University of Commerce
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13; F832.51
Article No.: 1856257
Article link: https://www.wllwen.com/jingjilunwen/huobiyinxinglunwen/1856257.html