Hot Topic Discovery on Weibo Based on a Heat Matrix
Published: 2018-05-08 04:32
Topics: heat matrix + topic model; Source: Computer Engineering (《计算机工程》), 2017, Issue 02
[Abstract]: Existing models for discovering hot topics on Weibo are sensitive to the volume of posts and are slow to detect topics. To address this, a topic model based on a heat matrix is proposed. The heat matrix yields the heat of each latent topic and the topic-word probability distribution, and the heat shared between words is used to mine their semantic relationships, so that hot topics and hot words in the data can be identified accurately. Experiments on real Weibo data show that, compared with the Latent Dirichlet Allocation (LDA) model, the proposed model achieves higher efficiency and accuracy, and the hot topics it discovers are consistent with real-time events, indicating good hot-spot recognition.
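[Author affiliations]: State Key Laboratory of Software Engineering, Wuhan University; The 28th Research Institute of China Electronics Technology Group Corporation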
[Funding]: Key Program of the National Natural Science Foundation of China (U1135005)
[CLC number]: TP391.1
Article ID: 1859985
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1859985.html