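Research on Multi-Interest-Based Academic Paper Recommendation
Published: 2018-05-27 12:01
Topics: paper recommendation + clustering; Source: Inner Mongolia University, 2017 master's thesis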
[Abstract]: As the number of academic papers grows explosively, helping researchers quickly find the literature they care about in a huge paper repository has become a pressing problem. Academic paper recommendation is an effective way to address it. Existing research on academic paper recommendation focuses mainly on four families of methods: content-based filtering, citation networks, co-authorship networks, and paper evaluation metrics. Content-based filtering builds a user model from the user's historical actions, comments, interest tags, and similar information, and recommends papers from that model; however, collecting this information takes a great deal of time. Citation-network-based recommendation uses the citation relationships between papers to recommend papers to users, but the uncertainty inherent in citation relationships often degrades the quality of the results. Co-authorship-network-based recommendation exploits the complex network that scholars form through co-authorship. Evaluation-metric-based recommendation filters and recommends papers using metrics such as the citations and co-citations of a paper or author, the journal quality factor, and the H-index.
Building on existing research, this thesis proposes a multi-interest-based academic paper recommendation algorithm. Its main contributions are as follows:
(1) Identifying a scholar's multiple research interests. Because scholars usually have several research interests, a clustering algorithm partitions each scholar's set of published papers into multiple interest sets, each of which represents one of the scholar's research interests.
(2) Proposing two multi-interest scholar models, one based on VSM (the vector space model) and one based on frequent patterns. The VSM-based model takes a weighted fusion of the models of all published papers in an interest set and uses the fused feature vector as the corresponding interest model. The frequent-pattern-based model first preprocesses the interest set with LDA (latent Dirichlet allocation), then uses the FP-Growth algorithm to mine a frequent pattern set from the result, and finally simplifies that frequent pattern set and builds the corresponding interest model.
(3) Introducing the concept of research interest importance, giving a formula for it based on the number of papers in an interest set, and incorporating it into both multi-interest scholar models.
We conducted three groups of comparative experiments on real data. The results show that, compared with existing algorithms, the multi-interest-based academic paper recommendation algorithm improves recommendation accuracy.
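The VSM-based pipeline described in the abstract (cluster a scholar's papers into interest sets, fuse each set into one vector, weight each interest by its share of papers) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the thesis's actual method: the abstract does not name the clustering algorithm, the fusion weights, or the importance formula, so the sketch substitutes a greedy cosine-threshold clustering, an equal-weight mean, and importance = (papers in the interest set) / (all papers by the scholar). Plain term frequencies over whitespace tokens stand in for real VSM features, and all function names are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector for one paper (assumed stand-in for VSM features)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fuse(vectors):
    """Equal-weight fusion (mean) of paper vectors; the equal weights are an assumption."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {t: w / len(vectors) for t, w in total.items()}


def cluster_papers(vectors, threshold=0.3):
    """Greedy single-pass clustering (assumed stand-in for the thesis's algorithm).

    Each paper joins the most similar existing interest set if the cosine
    similarity to that set's fused vector reaches `threshold`; otherwise it
    starts a new interest set.
    """
    clusters = []
    for vec in vectors:
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, fuse(members))
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([vec])
        else:
            best.append(vec)
    return clusters


def build_scholar_model(papers, threshold=0.3):
    """Return a list of (interest_vector, importance) pairs for one scholar.

    Importance is assumed to be the fraction of the scholar's papers that
    fall into the interest set.
    """
    vectors = [tf_vector(p) for p in papers]
    clusters = cluster_papers(vectors, threshold)
    return [(fuse(c), len(c) / len(vectors)) for c in clusters]


def score(candidate, model):
    """Score a candidate paper as its best importance-weighted similarity (assumed rule)."""
    vec = tf_vector(candidate)
    return max(weight * cosine(vec, interest) for interest, weight in model)


# Toy data, invented for illustration only.
papers = [
    "graph clustering community detection",
    "community detection in social graph",
    "neural machine translation attention",
    "spectral graph clustering",
]
model = build_scholar_model(papers)
ranked = sorted(
    ["graph community detection", "machine translation", "quantum chemistry"],
    key=lambda c: score(c, model),
    reverse=True,
)
```

On this toy input the three graph papers fall into one interest set and the translation paper into another, so the graph-related candidate ranks first. The frequent-pattern variant (LDA followed by FP-Growth) would replace `fuse` with a pattern-mining step and is not covered by this sketch.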
[Degree-granting institution]: Inner Mongolia University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3
Article ID: 1941957
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1941957.html