Research on a Multi-Source User Interest Data Fusion Method Based on Tree Networks

Published: 2018-08-19 09:24

[Abstract]: With the growth of e-commerce, online purchasing has become a mainstream way to shop, yet consumers facing massive amounts of information must spend considerable time finding products that interest them. Personalized recommendation systems emerged in response. They are regarded as an effective marketing approach that matches consumers' product needs, they address the problem of shopping choice on e-commerce websites, and they are currently one of the hot topics in online information services. A personalized service system analyzes users' behavioral information to identify differences in individual consumers' interests and habits, and so provides "one-to-one" precision marketing. Building a personalized recommendation system requires a user interest model; user modeling is at the core of personalized recommendation, and the quality of the model directly affects the quality of the recommendation system. Capturing user interest data from multiple sources and fusing it is therefore an important way to improve the quality of user interest modeling. This thesis addresses the insufficient accuracy of traditional collaborative filtering recommendation in a B2C website environment by proposing and implementing a multi-source user interest data fusion method based on a user tree network, with the aim of improving the recommendation quality of the original method.

The main research content is as follows. First, taking the modeling process as its perspective, the thesis compares and analyzes existing research on building user interest models in personalized recommendation systems along four dimensions: user information collection, information representation, technical processing, and update method. Information collection is divided into information sources and information storage, which supply the input for modeling. Information representation is divided into semantic and quantitative representation, which characterize specific user interest preferences. Data processing is divided into two techniques, feature-word weighting and clustering, which process user information to generate the user interest model. Data updating is divided into three classes of methods, namely the time-window method, forgetting algorithms, and hybrid models, which capture user interest drift in the model.

Second, starting from the user's shopping process, the thesis identifies four factors that best reflect consumers' interest preferences: clicking on a product, adding a product to favorites, placing a product in the shopping cart, and placing an order. It then quantifies the calculation of each factor and sets corresponding rules to obtain static user interest weights. To account for changes in user interest, it designs an interest value that varies over time, compensating for the shortcomings of static recommendation. For each individual user, interest is further divided into long-term and short-term interest, each with its own exponential decay method. This processing achieves an effective fusion of multi-source user interest data and improves recommendation accuracy.

Finally, the experiments use a real user dataset provided by Tmall (天猫商城), part of Alibaba Group. Data fusion is applied to train an interest model for each user and to compute each user's long-term and short-term interests together with their respective interest periods. Three groups of experiments were completed: the first examines the values of the indicator attribute factors; the second compares the prediction accuracy of the periodic decay model against an exponential decay model that does not distinguish interest periods; the third compares the classical collaborative filtering algorithm with the period-aware decay filtering algorithm proposed in the thesis. The results show that recommendation based on multi-source user interest data fusion outperforms classical collaborative filtering.
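
The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea it describes: four behavior signals (click, favorite, cart, order) are combined with static weights, and each contribution is then decayed exponentially, with separate decay rates for short-term and long-term interest. The weight values, the half-lives, the half-life form of the decay, and the names `BEHAVIOR_WEIGHTS`, `decayed_weight`, and `fuse_interest` are all assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import defaultdict

# Hypothetical static weights for the four behaviors named in the abstract.
# The thesis determines its own values experimentally; these are placeholders.
BEHAVIOR_WEIGHTS = {"click": 1.0, "favorite": 2.0, "cart": 3.0, "order": 4.0}


def decayed_weight(behavior, age_days, half_life_days):
    """Static behavior weight scaled by exponential decay with the given half-life."""
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return BEHAVIOR_WEIGHTS[behavior] * decay


def fuse_interest(events, now_day, short_half_life=7.0, long_half_life=90.0):
    """Fuse (item, behavior, day) events into per-item short- and long-term scores.

    The half-life defaults are assumed values, not figures from the thesis.
    """
    scores = defaultdict(lambda: {"short": 0.0, "long": 0.0})
    for item, behavior, day in events:
        age = now_day - day
        if age < 0:
            continue  # ignore events dated after the reference day
        scores[item]["short"] += decayed_weight(behavior, age, short_half_life)
        scores[item]["long"] += decayed_weight(behavior, age, long_half_life)
    return dict(scores)


# Toy event log: (item id, behavior, day index). Purely illustrative data.
events = [
    ("item_a", "click", 100),
    ("item_a", "order", 100),
    ("item_b", "cart", 10),
]
profile = fuse_interest(events, now_day=100)
```

In this toy run, the two same-day events on `item_a` contribute their full static weights, while the 90-day-old cart event on `item_b` has almost vanished from the short-term score but still retains half its weight in the long-term score. That asymmetry is the behavior one would expect from giving long-term and short-term interest different decay rates.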
[Degree-granting institution]: Sichuan Normal University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: F724.6

[References]