An Interactive Data Mining and Visualization System Based on Parallel Computing
[Abstract]: With the development of information technology, the amount of data is growing explosively, and traditional CPU-based data mining can no longer process such volumes efficiently. In addition, the human brain recognizes color and geometry more easily than dry numbers. With data visualization, mining results can be presented more naturally and intuitively in the user interface, which better meets users' needs. At present, however, the traditional visualization tools used in data mining can only draw 2D or 3D graphics and lack interactivity. This paper presents an interactive data mining and visualization system based on parallel computing.
In this paper, classical data stream mining algorithms are optimized with GPU (Graphics Processing Unit) programming. Traditional CPU-based data mining processes data serially and cannot take advantage of multiple computing resources running at the same time. When the amount of data is large, the number of iterations grows, the memory requirement increases, and processing becomes slow and inefficient. GPU programming processes data in parallel: many threads run independently and simultaneously, so throughput is high, which makes it better suited to large amounts of data. To address both data independence and data dependence in big data, this paper uses GPU programming to optimize the K-Means clustering algorithm and the Connected Component Labeling (CCL) algorithm, and completes the mining and analysis of big data.
This paper also proposes an interactive method of data visualization. To visualize the data, we use the DirectX software development kit (SDK) to transform the original data set or the data mining results into graphics elements such as vertices, lines, surfaces, and colors. A multi-dimensional model is built with the graphics functions provided by the SDK, and the final visualization is rendered. We also create a graphical user interface (GUI) in which users can change the clustering parameters according to their requirements and obtain visualization results that meet their needs.
Based on the above algorithms, experiments are carried out on energy consumption data generated by air conditioning operation. The traditional algorithm is optimized with GPU programming, which realizes the cluster analysis of the data. The experimental data show that the system runs much faster and more efficiently when processing large amounts of data. In addition, the DirectX SDK is used to represent the abstract data mining results as concrete four-dimensional solid graphics and images, and users can change the viewpoint of the visualization and the K value of the clustering through keyboard operations to obtain results that meet their actual needs.
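The contrast between data independence and data dependence can be illustrated with a small serial sketch. It is not the thesis's GPU implementation: it is plain Python, and all function names (`assign_points`, `update_centroids`, `kmeans`, `label_components`) are hypothetical. In the K-Means assignment step each point's label depends only on that point and the current centroids, so one GPU thread per point could compute it independently. In CCL a cell's label depends on its neighbours, so the sketch assumes a repeated minimum-label propagation over 4-connected neighbours, which is one common parallel-friendly formulation and may differ from the variant used in the thesis.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def assign_points(points, centroids):
    """Assignment step: every point is handled independently of the others,
    so on a GPU one thread per point could compute its label."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def update_centroids(points, labels, centroids):
    """Update step: each centroid becomes the mean of its assigned points.
    An empty cluster keeps its previous centroid."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Serial reference loop; stops when the assignment no longer changes."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign_points(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


def label_components(grid):
    """Connected component labeling by repeated minimum-label propagation.
    Each pass reads only the labels of the previous pass, which mirrors
    relaunching a GPU kernel until no label changes."""
    rows, cols = len(grid), len(grid[0])
    # Foreground cells start with a unique label; background cells get None.
    labels = [[r * cols + c if grid[r][c] else None for c in range(cols)]
              for r in range(rows)]
    changed = True
    while changed:
        changed = False
        nxt = [row[:] for row in labels]
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] is None:
                    continue
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is not None
                            and labels[nr][nc] < nxt[r][c]):
                        nxt[r][c] = labels[nr][nc]
                        changed = True
        labels = nxt
    return labels
```

Calling `kmeans` again with a different number of initial centroids corresponds to the interactive change of the K value described above; in a GPU version, only the bodies of `assign_points` and the inner loop of `label_components` would move into kernels.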
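[Degree-granting institution]: North China University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13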
Article ID: 2228283
Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/2228283.html