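Application of Embedding Vectors in Financial Time Series
Published: 2018-03-10 17:31
Topic: embedding vectors; Focus: financial time series; Source: Jilin University, 2017 master's thesis; Type: degree thesis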
[Abstract]: Finance is the core of the modern economy and an important lever for regulating the macroeconomy, and it plays a very important role in human society. Finance is two-sided: along with its benefits it brings negative effects such as inflation, financial bubbles, and financial crises. To keep these negative effects to a minimum, analysis of finance is essential, and one way to carry it out is to build time series models. A financial time series is a sequence of finance-related data arranged in chronological order, and analyzing it directly reveals the patterns of financial activity. Stock data ordered in time is one kind of financial time series, and forecasting its trend indirectly reflects the direction of financial market activity. Data at different times in a financial time series are necessarily related, but the factors that influence financial markets are numerous and complex, so the relationship is hard to capture with an explicit mathematical model, and this strongly affects the analysis and prediction of financial time series. In natural language processing, existing work uses embedding vector representations to capture the syntactic and semantic coherence of a text's context. Whether the same embedding idea can express the dependencies within a financial time series, whose elements are likewise related to what comes before and after them, and whether it can then be used to forecast the series, is the main question this thesis studies.
The main work of the thesis falls into two parts. First, by introducing the embedding vector idea, it proposes two representations of financial time series, the financial day vector and the financial week vector, which express the "context" correlation of these vectors within the series. Second, it uses the proposed financial embedding vector model to forecast financial time series. Traditional embedding vectors are word embeddings and sentence embeddings, and in natural language processing they are used mainly to analyze text. A text contains a finite number of distinct words, whereas financial time series are generally continuous, so applying the embedding idea to financial time series first requires a discretization step that maps the financial data onto a finite set. Corresponding to the word embedding, the "financial day vector" maps one discretized stock market record, which carries the information of a single trading day, to a real-valued vector. As with word embeddings, the resulting financial day vectors reflect how the days cohere within the financial time series. Similarly, corresponding to the sentence embedding, the "financial week vector" maps the stock market data of the five consecutive trading days from Monday to Friday to a real-valued vector that carries one week of stock market information.
To test the feasibility of applying embedding vectors to financial time series, the proposed method is applied to a Standard & Poor's 500 index data set. Five stock market parameters are used as the raw data: opening price, highest price, lowest price, closing price, and trading volume. The data set is first discretized and then trained with the proposed method to obtain the financial embedding vectors, that is, the financial day vectors and financial week vectors. A radial basis function (RBF) neural network is then used as the prediction model, taking the financial embedding vectors as input to forecast the daily closing price and the weekly closing price respectively. The experimental results show that the financial day vectors and financial week vectors can predict the closing price and give better results than using the raw stock market data. The embedding vector idea thus offers a new approach to financial time series analysis.
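The discretization step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's method: the abstract does not say how the data are discretized, so the sketch assumes each of the five parameters is turned into a day-over-day relative change and bucketed by quantiles, and that a "week" is simply five consecutive trading days. The function names `discretize_days` and `weekly_sentences` and the parameter `n_bins` are hypothetical.

```python
import numpy as np

def discretize_days(ohlcv, n_bins=4):
    """Map each trading day (open, high, low, close, volume) to a discrete token.

    Assumption: every column is converted to its relative change against the
    previous day and bucketed into n_bins quantile bins. The thesis abstract
    does not specify its discretization scheme.
    """
    ohlcv = np.asarray(ohlcv, dtype=float)
    # Relative change vs. the previous day; the first day has no predecessor.
    change = ohlcv[1:] / ohlcv[:-1] - 1.0
    codes = np.empty(change.shape, dtype=int)
    for j in range(change.shape[1]):
        # Interior quantile edges give roughly equal-frequency buckets.
        edges = np.quantile(change[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        codes[:, j] = np.searchsorted(edges, change[:, j], side="right")
    # One token per day, e.g. "2-3-1-2-0"; at most n_bins**5 distinct tokens.
    return ["-".join(map(str, row)) for row in codes]

def weekly_sentences(tokens, days_per_week=5):
    """Group day tokens into 5-day 'sentences' (assumes no holiday gaps)."""
    n_weeks = len(tokens) // days_per_week
    return [tokens[i * days_per_week:(i + 1) * days_per_week]
            for i in range(n_weeks)]
```

Under these assumptions each day token plays the role of a word and each five-day group the role of a sentence, so the resulting lists could be passed to any word or sentence embedding trainer to obtain day and week vectors.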
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F831.51; O211.61
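[References]
Related journal papers (top 5)
1 Liu Xin; Wang Bo; Mao Ersong; A multi-document summarization method based on the PV-DM model [J]; Computer Applications and Software; 2016, No. 10
2 Jin Tao; Yue Min; Mu Jinchao; Song Weiguo; He Yanshan; Chen Yi; Research on multivariate stock market time series prediction based on SVM [J]; Computer Applications and Software; 2010, No. 06
3 Sun Yanfeng; Liang Yanchun; Zhang Wenli; Lü Yinghua; An optimal partition algorithm for RBF neural networks and its application to stock market prediction [J]; Pattern Recognition and Artificial Intelligence; 2005, No. 03
4 Zhao Qinglin; Guo Yanbing; Mei Qiang; Qi Zhanqing; A survey of methods for determining the centers of RBF neural networks [J]; Guangdong Automation & Information Engineering; 2002, No. 02
5 Zhu Mingxing; Zhang Delong; Research on algorithms for selecting basis function centers in RBF networks [J]; Journal of Anhui University (Natural Science Edition); 2000, No. 01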
Article ID: 1594344
Article link: https://www.wllwen.com/jingjilunwen/huobiyinxinglunwen/1594344.html