Two Methods for Enhancing Biplot Visualization and Their Application to Compositional Data
Published: 2019-01-23 08:29
[Abstract]: The biplot is a widely used visualization method. When a data set contains many variables, however, plotting it directly produces a biplot in which many variable markers overlap, so the correlations among variables cannot be read clearly, the display loses much of its visual value, and the resulting analysis is imprecise. Statistical methods that handle general multivariate data effectively are therefore needed. To address this problem, the thesis proposes two methods for enhancing biplot visualization. The first is a biplot method based on cluster analysis: the original data are clustered to obtain new data sets, and the biplot analysis is then carried out on those new data sets. The second is a new biplot method based on both principal component analysis and cluster analysis. Both methods retain most of the information in the data while making the biplot easier to read. The two methods are evaluated empirically and compared with the biplot of the original data, which verifies their effectiveness. Finally, both methods are extended to compositional data.
The thesis consists of five chapters. Chapter 1 is the introduction: it presents the research background, the problem and its practical significance, the work done and its contributions, and the structure of the thesis. Chapter 2 introduces the biplot: it describes the general biplot model, reviews the underlying theory, and briefly presents three types of biplot. Chapter 3 presents the two methods for enhancing biplot visualization on data sets with many variables. The first, based on cluster analysis, classifies the original data set to obtain several new data sets and then applies the biplot to each of them, analyzing the relationship between the original variables and the mean variable within each class; a worked example compares the result with the biplot of the original data and verifies the method. The second, based on cluster analysis and principal component analysis, classifies the original data set to obtain a new data set, applies the biplot to it, and is likewise verified on a worked example. Chapter 4 introduces the basic theory of compositional data and the steps for constructing a biplot of compositional data, then applies the two methods of Chapter 3 to compositional data in worked examples. Chapter 5 is the conclusion. It summarizes the two methods and finds that, for data sets with many variables, applying the traditional biplot directly has drawbacks, namely reduced readability, which the two proposed methods resolve.
The aim of the thesis is to find a biplot method that analyzes multivariate data sets well without discarding data, thereby enhancing visualization.
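The abstract does not give the computations behind the first method, so the following is only a minimal sketch of one possible reading: group highly correlated variables, replace each group by its mean variable, and compute biplot markers from a singular value decomposition. The function names, the correlation threshold, the greedy grouping rule (a simple stand-in for the cluster analysis), and the synthetic data are all assumptions made for illustration, not the thesis's procedure.

```python
import numpy as np


def group_variables(X, threshold=0.8):
    """Greedy stand-in for cluster analysis (hypothetical rule): a variable
    joins the first group whose seed variable it correlates with at or
    above `threshold`; otherwise it starts a new group."""
    R = np.corrcoef(X, rowvar=False)
    labels = np.empty(X.shape[1], dtype=int)
    seeds = []
    for j in range(X.shape[1]):
        for k, seed in enumerate(seeds):
            if R[j, seed] >= threshold:
                labels[j] = k
                break
        else:
            seeds.append(j)
            labels[j] = len(seeds) - 1
    return labels


def cluster_mean_variables(X, labels):
    """Standardize each column, then average the columns within each group."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return np.column_stack(
        [Z[:, labels == k].mean(axis=1) for k in np.unique(labels)]
    )


def biplot_coordinates(X, alpha=1.0):
    """Rank-2 biplot markers from the SVD of the column-centered matrix.

    alpha = 1 gives the form biplot, alpha = 0 the covariance biplot.
    G holds one marker per observation, H one marker per variable.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :2] * s[:2] ** alpha
    H = Vt[:2].T * s[:2] ** (1.0 - alpha)
    explained = (s[:2] ** 2).sum() / (s ** 2).sum()
    return G, H, explained


# Synthetic data (an assumption): six variables driven by two latent factors.
rng = np.random.default_rng(0)
n = 50
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack(
    [f1 + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.1 * rng.normal(size=n) for _ in range(3)]
)

labels = group_variables(X)            # group index of each original variable
M = cluster_mean_variables(X, labels)  # reduced data set of mean variables
G, H, explained = biplot_coordinates(M)
```

Plotting the rows of `G` as points and the rows of `H` as arrows gives the biplot of the reduced data set, with one arrow per group instead of one per original variable. For compositional data, a common preliminary step is a centered log-ratio transform of the rows before centering, which is again an assumption about how the extension might be carried out.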
[Degree-granting institution]: Shanxi University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: O212
Article ID: 2413634
Article link: https://www.wllwen.com/kejilunwen/yysx/2413634.html