
Financial Data Analysis Based on Parallel Statistical Computing

Published: 2018-02-11 03:49

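  Keywords: statistical computing; regression; nonparametric inference; stochastic processes. Source: Shandong University, doctoral dissertation, 2012. Document type: degree thesis.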


[Abstract]: Modern computer systems have become powerful enough that many statistical computations finish in an instant. In some important cases, however, computing time is still measured in days, particularly for statistical inference on massive large-sample data or on large, complex sampling data. The usual response is either to fall back on a faster but less accurate method or to skip a potentially important computation altogether. The development of parallel statistical computing is therefore very important. In this thesis we study opportunities for speeding up statistical methods for wage data, bankruptcy data, and pension fund data. We find that parallel statistical computing delivers good speed performance on large-scale statistical inference problems. The thesis consists of five chapters, whose main contents are as follows.

Chapter 1. Parallel statistical computing is a very interesting problem: many computations in statistics are highly parallel, so research at the intersection of parallel computing and statistical computing is very important. This chapter focuses on regression, nonparametric inference, and stochastic processes. In particular, the methods we review include parallel multisplitting, parallel statistical solutions of linear least-squares regression, parallel statistical algorithms for nonlinear regression, the theoretical structure of the parallel bootstrap in nonparametric inference, parallel statistical solutions for Markov chains, and parallel Markov chain Monte Carlo. Importantly, we also survey non-graphics applications of parallel GPU processing. We conclude that further research on parallel statistical algorithms is needed, and we describe some important open problems.

Chapter 2. When fitting multivariate linear models, subset selection and running time are important issues. To address them, we introduce a new parallel estimator. We first give conditions under which this method is equivalent to generalized least squares estimation, and we consider the rank of the projection and the eigenvalues. Then, when a stable solution exists, we give its error. In addition, the proposed method is applied to bankruptcy data, yielding the estimated equation for one data set, and we report the execution times of two data simulations.

Chapter 3. We discuss the convergence theory of the multiplicative and damped additive Schwarz methods for solving large-sample equations. For large-sample generalized linear models and generalized additive models, we propose using the Schwarz method to solve the quasi-likelihood and penalized quasi-likelihood problems. The Schwarz method is applied to a sequence of submodels, each of which corresponds to a subset of the elements of the parameter in a two-step estimation; combined, the submodels yield the solution of the full model. The technique can also be used for model comparison, where the fitted values of a submodel serve as initial values for a larger model.

Chapter 4. The parallel bootstrap is a very useful statistical method with outstanding time performance. However, no theoretical study of the method has yet appeared. In this chapter we introduce a working correlation matrix for the method, called the parallel bootstrap matrix. We consider some properties of this resampling scheme and the associated optimal subsample length for the smooth function model. We present a study of the time performance of parallel bootstrap estimation and, for financial time series data, give some performance results on the choice of subsample length.

Chapter 5. We study methods for computing the quasi-stationary distribution of a Markov chain. The matrix here is quasi-stochastic, that is, each row sum is less than or equal to 1. We develop Schwarz methods for solving for this distribution. In particular, we obtain the semi-convergence of the additive, multiplicative, and two-level Schwarz methods. To illustrate the proposed methods, we give two examples of quasi-stationary distributions of Markov chains.
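As a small illustration of the object studied in Chapter 5, the sketch below computes the quasi-stationary distribution of a matrix whose row sums are at most 1. It uses plain power iteration on the left eigenvector, not the Schwarz methods developed in the thesis, and it assumes the matrix is irreducible and aperiodic so that the iteration converges. The function name `quasi_stationary` and the example matrix are hypothetical and do not come from the thesis.

```python
def quasi_stationary(Q, tol=1e-12, max_iter=10_000):
    """Power iteration for the quasi-stationary distribution of Q.

    Q is a square list of lists with nonnegative entries and row sums <= 1
    (the transition probabilities among the non-absorbed states).
    Returns (pi, lam) with pi summing to 1 and pi Q approximately lam * pi.
    Assumption: Q is irreducible and aperiodic.
    """
    n = len(Q)
    v = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of v <- v Q (row vector times matrix).
        w = [sum(v[i] * Q[i][j] for i in range(n)) for j in range(n)]
        total = sum(w)  # probability of not being absorbed in this step
        if total == 0.0:
            raise ValueError("all mass is absorbed in one step")
        w = [x / total for x in w]  # renormalize to a distribution
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w, total
        v = w
    raise RuntimeError("power iteration did not converge")


# Hypothetical 2-state example: each row sums to 0.8, so 0.2 leaks out per step.
pi, lam = quasi_stationary([[0.5, 0.3], [0.2, 0.6]])
print(pi, lam)  # approximately [0.4, 0.6] and 0.8
```

Renormalizing after every step keeps the iterate a probability vector, so the normalizing constant directly estimates the per-step survival probability, which is the leading eigenvalue of the matrix.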
[Degree-granting institution]: Shandong University
[Degree level]: Doctoral
[Year degree granted]: 2012
[Classification number]: F224; F830




Article ID: 1502140


Article link: https://www.wllwen.com/guanlilunwen/huobilw/1502140.html

