Research on User Incentive Mechanisms for Collaborative Tagging Quality

Published: 2018-01-31 10:31

Keywords: collaborative tagging; tag quality; incentive mechanism　Source: Shandong University, master's thesis, 2014　Document type: degree thesis

[Abstract]: Collaborative tagging systems are a representative application of crowdsourcing to the management of Web resources and a platform for collective intelligence. They let many users tag Web resources freely, and the resulting tag data plays an important role in searching, mining, and recommending massive Web resources. Because tagging is unconstrained, however, tag data in practice suffers from irrelevant tags, spelling errors, synonyms, and polysemy. These problems lower the tagging quality of the system's resources and are a major factor limiting the use of tags. Improving resource tagging quality has therefore become an active research topic in collaborative tagging.
Existing work on low tagging quality falls mainly into three groups: tag recommendation, semantics-based tagging, and sufficient tagging of resources. Tag recommendation tends to constrain users' thinking and hinders the gathering of collective intelligence; semantics-based tagging is relatively complex to implement and adds to the user's tagging burden to some extent. Sufficient tagging is the simple, natural approach of having users add enough posts to a resource: once a resource has received enough posts, its tagging state tends to stabilize, and the tag information in the stable state accurately describes the resource. In practice, however, tagging is unbalanced: a few resources are over-tagged while most are under-tagged. Incentive mechanisms can reduce this imbalance by motivating users to tag under-tagged resources, but existing mechanisms do not measure the tagging quality of different users and so cannot distinguish users with different tagging behavior. To address this lack of a quality measure, this thesis proposes PQIM (Post-Quality based dynamic Incentive Mechanism), a dynamic incentive mechanism based on the quality of users' posts, together with an implementation scheme. The specific contributions are as follows:
A method for measuring the quality of users' posts. The thesis introduces the concept of a resource's relatively stable tag set, gives criteria for it in terms of both the frequency and the variety of the resource's tags, and defines a segmentation point. As the number of posts a resource has received approaches the segmentation point, both its high-frequency tags and its top-ranked tags by frequency tend to stabilize. After the segmentation point, each new post is compared against the relatively stable tag set the resource has already accumulated, in terms of the coverage and frequency of the tags it contains, the size of the post itself, and its effect on the stability of the tagged resource. On this basis the thesis designs post-quality measures based on dense distribution and precise dense distribution for the early stage of tagging, and measures based on tag coverage and post stability for the later stage.
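The abstract does not give the formulas behind these measures, so the following is only a minimal sketch of the tag-coverage idea under stated assumptions. The names `stable_tag_set` and `coverage_quality`, the `top_k` and `min_share` parameters, and the "top tags above a frequency share" rule are all hypothetical stand-ins, not the thesis's definitions.

```python
from collections import Counter


def stable_tag_set(tag_counts: Counter, top_k: int = 5, min_share: float = 0.05) -> set:
    """Assumed stand-in for the 'relatively stable tag set': the top_k most
    frequent tags whose share of all tag uses is at least min_share."""
    total = sum(tag_counts.values())
    if total == 0:
        return set()
    return {tag for tag, count in tag_counts.most_common(top_k)
            if count / total >= min_share}


def coverage_quality(post_tags, tag_counts: Counter,
                     top_k: int = 5, min_share: float = 0.05) -> float:
    """Fraction of the distinct tags in a new post that fall inside the
    stable set, in [0.0, 1.0]. An empty post scores 0.0."""
    post = set(post_tags)
    if not post:
        return 0.0
    stable = stable_tag_set(tag_counts, top_k, min_share)
    return len(post & stable) / len(post)


# Tags a resource has accumulated so far (illustrative data only).
counts = Counter({"python": 10, "programming": 6, "tutorial": 3, "misc": 1})
print(coverage_quality(["python", "misc"], counts, top_k=3))
```

In this toy example the stable set holds the three most frequent tags, so a post containing one stable tag and one rare tag covers half of its tags.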
A dynamic incentive mechanism based on the quality of users' posts. Incentive rules are set from two factors, the time at which a user tags and the quality of the post: the earlier a user tags a resource and the higher the quality of the post, the larger the reward. In the implementation, the tagging time is tied to the resource's tagging state and combined with the proposed quality measures to design a dynamic, quality-based incentive function. The reward a user receives is negatively correlated with the resource's tagging state and positively correlated with the quality of the user's post. Finally, the effectiveness of PQIM is analyzed from a game-theoretic perspective.
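The abstract states only the direction of the two dependencies, not the incentive function itself. One simple function with those two properties might look like the sketch below; the linear form, the `saturation_posts` threshold, and the names `tagging_state` and `reward` are assumptions made for illustration.

```python
def tagging_state(posts_received: int, saturation_posts: int) -> float:
    """Assumed proxy for a resource's tagging state: 0.0 for an untagged
    resource, rising linearly to 1.0 once saturation_posts is reached."""
    return min(posts_received / saturation_posts, 1.0)


def reward(quality: float, posts_received: int,
           saturation_posts: int = 100, max_reward: float = 1.0) -> float:
    """Reward that grows with post quality and shrinks as the resource
    accumulates posts (a hypothetical linear form, not the thesis's)."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie in [0, 1]")
    state = tagging_state(posts_received, saturation_posts)
    return max_reward * quality * (1.0 - state)


# Same quality, later arrival: the reward drops.
print(reward(0.8, posts_received=0), reward(0.8, posts_received=50))
```

Under this form a user gains nothing by tagging an already saturated resource, and a low-quality post earns less than a high-quality one at any point in time.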
A PQIM system framework and implementation algorithms, validated on a real data set. On one hand, the thesis analyzes the relationship between system tagging quality and PQIM. A highest-reward-first (HA) strategy is used to simulate PQIM's incentive allocation and is compared with the leading strategy of existing mechanisms. The results show that the proposed method not only achieves better system tagging quality under a fixed budget but also shortens the time the system needs to reach a target tagging quality. On the other hand, regarding user benefit, the thesis analyzes historical data: it selects the active users in the data and examines how their benefit is distributed across different tagging periods under PQIM. It also compares the benefit of users with different post quality. The results show that under the proposed mechanism users tend to tag resources earlier and with higher quality.
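The abstract names the highest-reward-first strategy but does not describe the simulation. The sketch below shows one plausible greedy reading under a fixed budget: at each step the next post goes to whichever resource currently offers the largest reward. The function name, the linear reward form, the constant `quality`, and the stopping rule are all assumptions.

```python
def simulate_highest_reward_first(post_counts, budget,
                                  quality=1.0, saturation_posts=10):
    """Greedy allocation: repeatedly pay for one post on the resource with
    the highest current offer until the top offer no longer fits the budget.
    The offer for a resource is quality * (1 - posts / saturation_posts),
    a hypothetical form chosen only so that under-tagged resources pay more."""
    counts = dict(post_counts)  # copy so the caller's data is untouched
    spent = 0.0
    while counts:
        offers = {res: quality * (1.0 - min(n / saturation_posts, 1.0))
                  for res, n in counts.items()}
        best = max(offers, key=offers.get)
        if offers[best] <= 0.0 or spent + offers[best] > budget:
            break
        spent += offers[best]
        counts[best] += 1
    return counts, spent


# One nearly saturated resource and one untagged resource.
final_counts, total_spent = simulate_highest_reward_first({"a": 9, "b": 0}, budget=2.0)
print(final_counts, total_spent)
```

In this toy run the budget flows to the untagged resource `b` rather than the nearly saturated `a`, which is the rebalancing effect the mechanism aims for.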

[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification numbers]: TP393.07;TP391.3

[References]