Design of a Multi-Factor Quantitative Stock Selection Scheme Based on the XGBoost Algorithm
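Keywords: Design of a Multi-Factor Quantitative Stock Selection Scheme Based on the XGBoost Algorithm; Source: Shanghai Normal University, 2017 master's thesis; Document type: degree thesis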
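【Abstract】: In recent years, quantitative investment has drawn growing attention from institutional investors and hedge funds for its discipline, systematic approach, timeliness, and diversification. At the same time, both the size of China's securities market and the number of securities accounts are growing rapidly. Judging from the efficiency of China's securities market and the development experience of foreign markets, the prospects for quantitative investment are promising. Even so, domestic quantitative products still suffer from small overall scale, a narrow range of strategies, and divergent strategy performance. Researching new quantitative investment approaches and new modeling ideas is therefore important for enriching quantitative products, expanding the market, and advancing quantitative investment. Among the many quantitative strategies, multi-factor stock selection has attracted many researchers because of its stability and broad coverage. A multi-factor stock selection scheme has two main requirements: the set of factors must be sufficiently comprehensive, and the classification model must generalize well. This thesis makes improvements on both fronts. First, it collects factor data relatively comprehensively for the first time: 307 factors in total, adding size, valuation, macroeconomic, bond, and housing-market factors to the financial, dividend, and momentum factors used by most researchers. Second, it applies the relatively new XGBoost boosting algorithm for the first time. The main advantages of this algorithm are as follows. XGBoost supports linear classifiers, in which case it corresponds to logistic regression or linear regression with L1 and L2 regularization terms. XGBoost also adds a regularization term to the cost function, which yields simpler models and helps prevent overfitting. Finally, XGBoost borrows column sampling from random forests, which reduces both overfitting and computation, and the XGBoost tool supports parallel computation and is therefore fast. The thesis compares the strengths, weaknesses, and modeling results of SVM, random forest, and XGBoost, and finds XGBoost the best in both performance and stability. Third, the thesis changes the conventional factor screening method and modeling workflow by screening factors during training, which makes the screening more scientific and reasonable. Based on these ideas, the thesis designs a machine-learning-based multi-factor quantitative stock selection scheme that earns excess returns over the CSI 300 index. Over 23 holding periods, the selected stock portfolios earned a total return of 287%, with an annualized compound return of 127%, a Sharpe ratio of 0.91, and an information ratio of 2.41. They beat the CSI 300 index in 82% of quarters and earned positive returns in 59% of quarters, and the final net value reached 3.87, far above the benchmark CSI 300 return.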
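The regularization term mentioned in the abstract can be written out explicitly. The form below follows the XGBoost documentation and its parameter names, not the thesis itself, so the symbols are assumptions: $l$ is the training loss, $f_k$ is the $k$-th of $K$ trees, $T$ is the number of leaves in a tree, and $w_j$ is the weight of leaf $j$.

```latex
\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
```

In the library, $\gamma$, $\lambda$, and $\alpha$ correspond to the `gamma`, `lambda`, and `alpha` parameters (the L2 and L1 penalties are the last two). Column sampling is controlled separately, for example through `colsample_bytree`.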
[Abstract]: In recent years, quantitative investment, with its discipline, systematic nature, timeliness, and diversification, has attracted increasing attention from institutional investors and hedge funds. The scale of China's securities market and the number of securities accounts are both growing rapidly, and in view of the efficiency of China's securities market and the experience of foreign markets, the outlook for quantitative investment is promising. However, domestic quantitative investment products still have shortcomings, such as small overall scale, a limited range of strategies, and divergent strategy performance. Studying new quantitative investment methods and new modeling ideas is therefore of great significance for enriching quantitative products, enlarging the market, and promoting the development of quantitative investment. The multi-factor stock selection strategy has drawn the attention of many researchers because of its stability and wide coverage. Such a strategy must address two issues: the selected factors should be comprehensive, and the classification model should generalize well. This paper makes improvements in both directions. First, it collects factor data relatively comprehensively for the first time: a total of 307 factors are used, and in addition to the financial, dividend, momentum, and other factors used by most researchers, size, valuation, macroeconomic, bond, and housing-market factors are added. Second, it uses the relatively novel XGBoost boosting algorithm for the first time. XGBoost supports a linear classifier, which corresponds to logistic regression or linear regression with L1 and L2 regularization terms. XGBoost also adds a regularization term to the cost function, which makes the learned model simpler and helps prevent overfitting. Finally, XGBoost borrows column sampling from random forests, which reduces both overfitting and computation, and the XGBoost tool supports parallel computation, so it is fast. The advantages, disadvantages, and modeling results of SVM, random forest, and XGBoost are compared, and XGBoost is found to be the best in both effectiveness and stability. Third, this paper changes the previous factor screening method and modeling process by screening factors during training, which makes the screening more scientific and reasonable. Based on these ideas, the paper designs a machine-learning-based multi-factor quantitative stock selection scheme that earns excess returns over the CSI 300 index. Over 23 holding periods, the selected stock portfolios achieve a total return of 287%, an annualized compound return of 127%, a Sharpe ratio of 0.91, and an information ratio of 2.41. The portfolio beats the CSI 300 index in 82% of quarters and earns a positive return in 59% of quarters, with a final net value of 3.87, well ahead of the benchmark CSI 300 return.
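The performance figures in the abstract can be related to one another with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the holding periods are assumed to be quarterly (the abstract reports win rates per quarter), and the Sharpe and information ratios use a common annualization convention that the abstract does not confirm. The toy return series are invented purely to exercise the functions.

```python
import math
import statistics


def annualized_growth_factor(final_net_value, n_periods, periods_per_year=4):
    """Gross annual growth factor implied by compounding over n_periods."""
    years = n_periods / periods_per_year
    return final_net_value ** (1.0 / years)


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=4):
    """Annualized Sharpe ratio from per-period returns (assumed convention)."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def information_ratio(returns, benchmark, periods_per_year=4):
    """Annualized mean active return divided by tracking error (assumed convention)."""
    active = [r - b for r, b in zip(returns, benchmark)]
    return statistics.mean(active) / statistics.stdev(active) * math.sqrt(periods_per_year)


def win_rate(returns, benchmark):
    """Share of periods in which the portfolio return exceeded the benchmark."""
    wins = sum(1 for r, b in zip(returns, benchmark) if r > b)
    return wins / len(returns)


# Figures reported in the abstract: 287% total return over 23 holding periods.
total_return = 2.87
final_net_value = 1.0 + total_return  # 3.87, matching the reported final net value
factor = annualized_growth_factor(final_net_value, n_periods=23)

# Invented quarterly returns, NOT data from the thesis.
toy_portfolio = [0.10, -0.02, 0.07, 0.04]
toy_benchmark = [0.03, -0.05, 0.08, 0.01]
toy_sharpe = sharpe_ratio(toy_portfolio)
toy_ir = information_ratio(toy_portfolio, toy_benchmark)
toy_win = win_rate(toy_portfolio, toy_benchmark)

print(f"final net value: {final_net_value:.2f}")
print(f"implied annual growth factor: {factor:.3f}")
print(f"toy Sharpe: {toy_sharpe:.2f}, toy IR: {toy_ir:.2f}, toy win rate: {toy_win:.0%}")
```

Under the quarterly assumption, a net value of 3.87 after 23 periods implies a gross growth factor of roughly 1.27 per year. The reported "127%" may refer to this gross factor rather than to a net annual return, although the abstract does not say so.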
【Degree-granting institution】: Shanghai Normal University
【Degree level】: Master's
【Year degree awarded】: 2017
【Classification number】: F832.51; F224