Research on Spark-Based Book Recommendation Technology on a Campus Resource Cloud
[Abstract]: As informatization advances in colleges and universities, building campus cloud platforms has become a focus of attention. A campus resource cloud platform can meet and safeguard the school's needs across the board, and it provides an efficient, reliable computing and storage platform for analyzing campus big data. This research relies on the campus resource cloud platform and therefore benefits from strong information infrastructure support. Meanwhile, the widespread use of business management information systems causes data to accumulate continuously. In particular, the library management system has accumulated a large volume of historical book circulation data, and that volume keeps growing over time. A great deal of valuable information is hidden in these data. To make full use of the library circulation data and improve the information experience of teachers and students, this thesis analyzes the data in depth so that teachers and students can receive personalized book recommendations. The thesis first designs the computing resources, storage resources, and platform functions of the campus resource cloud platform, then uses the cloud platform as the test and runtime environment for book recommendation, building a Spark cluster on it with HDFS as the storage system and Spark as the computing platform. On this basis it studies book recommendation techniques. To address missing data and data format problems, the raw data are preprocessed and a user-book rating matrix is constructed. To address data sparsity, the thesis adopts a collaborative filtering algorithm based on ALS matrix factorization, and then integrates the K-Means clustering algorithm into the ALS matrix factorization algorithm to address the user cold-start problem.
To address the attribute-weight and initial-value problems of the K-Means algorithm, weighted Euclidean distance and the max-min algorithm are used to optimize it. Finally, the algorithm is implemented on Spark, and experiments are designed to verify personalized book recommendation for different users. The experiments determine the optimal parameters of the ALS matrix factorization algorithm. The results show that the proposed hybrid recommendation algorithm can address data sparsity and cold start, that the K-Means optimization improves the clustering results, and that integrating the clustering algorithm improves prediction accuracy and computing speed. Finally, the parallel speedup measured on the Spark platform verifies the advantage of the Spark cluster.
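The abstract names weighted Euclidean distance and max-min selection of initial centers but gives no details, so the following is only a minimal sketch of how those two ideas commonly fit together, not the thesis's implementation. The function names, the sample points, the attribute weights, and the choice to seed with the first point are all assumptions made for illustration.

```python
import math


def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def max_min_centers(points, k, weights):
    """Pick k initial centers with the max-min (farthest-first) rule.

    Each new center is the point whose distance to its nearest
    already-chosen center is largest.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [points[0]]  # assumption: seed with the first point
    while len(centers) < k:
        next_center = max(
            points,
            key=lambda p: min(weighted_euclidean(p, c, weights) for c in centers),
        )
        centers.append(next_center)
    return centers


def assign(points, centers, weights):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)),
            key=lambda j: weighted_euclidean(p, centers[j], weights))
        for p in points
    ]


# Hypothetical two-attribute user vectors; the second attribute is
# given half the weight of the first (an assumed weighting).
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.0, 9.0)]
weights = (1.0, 0.5)

centers = max_min_centers(points, 3, weights)
labels = assign(points, centers, weights)
print(centers)
print(labels)
```

The centers chosen this way would then seed the usual K-Means iterations. Spreading the seeds apart is the motivation for the max-min rule, and the weights let more informative attributes dominate the distance.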
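[Degree-granting institution]: Xi'an University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.3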
Article ID: 2302976
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2302976.html