Correlation Analysis of Multi-Source Personal Data for Service Recommendation

Published: 2018-05-07 08:37

Topics: multi-source personal data + personal data correlation; Source: master's thesis, Harbin Institute of Technology (哈尔滨工业大学), 2017

[Abstract]: People use a large number of services to meet their needs in daily life, work, and study. Using these services generates a large amount of personal data that describes an individual's preferences, habits, and interests in different areas. Although this data is scattered across services, the datasets are linked by a latent, user-centered correlation, and that correlation can help service recommendation produce more accurate results. Existing recommendation algorithms rarely make use of a user's multi-source data. To address this, the thesis fuses the data a user has scattered across services and uses information such as the correlation between the service datasets to provide more accurate recommendations. It studies and addresses the following problems:

(1) Correlation measurement. Data from the same active users is collected across multiple services, and topics are extracted from the multi-source personal data with the LDA method. A topic-similarity-based method for measuring the correlation of multi-source personal data is proposed, and it is then used to mine the different correlation patterns that multi-source data exhibits. The results show that even when a user's personal data in different services is correlated, the pattern of that correlation is not necessarily the same.

(2) Analysis of how correlation and its patterns evolve. The thesis analyzes, over time, the regularities that govern the evolution of correlation between personal data in multiple services. Personal data is not static: it may change with time and with the user's experiences, interests, and other factors, so the correlation between personal data in multiple services may not be fixed and may follow different evolution rules. Because users differ widely in their behavior and habits, their correlation patterns may also differ. By measuring the difference between correlation patterns, the thesis analyzes how a user's correlation pattern evolves over time.

(3) Recommendation strategy formulation. Different data fusion strategies are formulated according to the number of services and the correlation between the service datasets. Using information such as the correlation between a user's services and the user's level of activity in each service, the thesis computes how recommendation quality differs across data fusion strategies, which yields the effect of correlation on recommendation accuracy. It identifies the optimal recommendation strategy for a given set of data and, from the correlation between a user's service datasets and the corresponding optimal strategy, finds the most suitable recommendation strategy for users with particular characteristics.

In summary, the thesis collects multi-source personal data from users common to several services and proposes a topic-similarity-based method for measuring the correlation of that data. It mines several typical personal data correlation patterns and analyzes the regularities by which these patterns evolve over time. It formulates six service data fusion strategies, each of which fuses different personal data into the service recommendation algorithm. These strategies help in building more accurate recommendation strategies for users and improve recommendation performance.
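The abstract names a topic-similarity-based correlation measure but does not give its formula. The following is a minimal Python sketch of one way such a measure might look, not the thesis's actual method. It assumes that each service's data for a user has already been reduced to a topic distribution (for example by LDA fitted over a shared topic space) and uses cosine similarity as a stand-in similarity function. The function names and the sample numbers are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        # An empty window carries no topic signal; treat it as uncorrelated.
        return 0.0
    return dot / (norm_p * norm_q)


def correlation_over_time(windows_a, windows_b):
    """Similarity of two services' topic vectors for each time window.

    Each argument is a list of topic vectors, one per time window,
    in the same chronological order.
    """
    return [cosine_similarity(p, q) for p, q in zip(windows_a, windows_b)]


# Hypothetical topic distributions for one user in two services,
# three time windows each, three topics per window (assumed values).
service_a = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]
service_b = [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]]

series = correlation_over_time(service_a, service_b)
print([round(s, 3) for s in series])
```

The resulting list is one similarity value per time window, so its shape over time is a simple stand-in for the "correlation pattern" whose evolution the abstract discusses. Other similarity functions over topic distributions, such as Jensen-Shannon divergence, could be substituted without changing the structure.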
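[Degree-granting institution]: Harbin Institute of Technology (哈尔滨工业大学)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3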
