Research on Weibo-Based Location Inference Techniques

Published: 2019-06-24 10:49
[Abstract]: Weibo has become a platform on which people quickly share and spread information; anyone can post and share information there anytime and anywhere. To enable location-based services, inferring a user's location from scattered and heterogeneous information has become both a difficult and a widely studied problem in the Weibo era. Building on existing location inference techniques from China and abroad, and taking known location knowledge as a premise, this thesis studies Weibo-based location inference, focusing on how to improve inference accuracy at different geographic granularities and how to address the sparsity of location information.

First, to infer location at the district and street levels, a language-model-based method for Weibo location inference is proposed. The method exploits the characteristics of district-level and street-level geographic information in Weibo posts and is built on an improved local word extraction algorithm. Experimental results show that the method achieves district-level location inference with unigram and bigram language models, with F-measures of 0.32 and 0.34 respectively. It also supports inference at district and street granularity, with recall of 24.9% and 16.36% respectively. The results further indicate that the precision and recall of existing Weibo location inference techniques still need improvement, and in particular that the sparsity of location information must be addressed.

Second, to address the low inference accuracy when Weibo location information is sparse, a user location inference method based on Weibo content is proposed. Geography-related local words are first extracted from users' posts, and the weight of each local word in different regions is computed; a user's location is then inferred from how well the word-segmented post content matches the local words. Experimental results show that this content-based method reaches an accuracy of 68.49% at the province level and 66.52% at the city level, outperforming existing location inference methods based on a baseline algorithm, a gazetteer, and TEDAS.

Finally, to further improve inference accuracy, a user location inference method based on both Weibo content and mutual-follow friends is proposed. It combines content-based inference with inference based on mutual-follow friends. Experimental results show that its accuracy exceeds that of methods based on Weibo content alone, mutual-follow friends alone, the baseline algorithm, the gazetteer, and TEDAS; when Weibo location information is sparse, accuracy reaches 81.39% at the province level and 78.85% at the city level.
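The content-based and combined methods summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the weight of a local word is taken to be the share of its occurrences that fall in a region, the matching degree is the sum of matched weights, and the two signals are mixed linearly with a hypothetical parameter `alpha`. The function names, thresholds, and toy data are likewise hypothetical, and posts are assumed to be word-segmented already.

```python
from collections import Counter

def local_word_weights(posts_by_region, min_count=2, min_share=0.6):
    """Assumed weighting: weight(word, region) = count in region / total count.
    Words that are rare overall or not concentrated in a region are dropped."""
    region_counts = {
        region: Counter(tok for post in posts for tok in post)
        for region, posts in posts_by_region.items()
    }
    total = Counter()
    for counts in region_counts.values():
        total.update(counts)
    weights = {region: {} for region in posts_by_region}
    for region, counts in region_counts.items():
        for word, n in counts.items():
            share = n / total[word]
            if total[word] >= min_count and share >= min_share:
                weights[region][word] = share
    return weights

def content_scores(tokens, weights):
    """Matching degree per region: sum of weights of matched local words,
    normalised so the scores add up to 1 (all zeros if nothing matches)."""
    raw = {r: sum(w.get(t, 0.0) for t in tokens) for r, w in weights.items()}
    norm = sum(raw.values())
    return {r: (v / norm if norm else 0.0) for r, v in raw.items()}

def friend_scores(friend_regions, regions):
    """Share of mutual-follow friends whose known location is each region."""
    votes = Counter(friend_regions)
    n = sum(votes[r] for r in regions)
    return {r: (votes[r] / n if n else 0.0) for r in regions}

def infer_location(tokens, friend_regions, weights, alpha=0.5):
    """Hypothetical linear mix of content and friend evidence.
    Returns None when neither signal provides any evidence."""
    c = content_scores(tokens, weights)
    f = friend_scores(friend_regions, list(weights))
    combined = {r: alpha * c[r] + (1 - alpha) * f[r] for r in weights}
    best = max(combined, key=combined.get)
    return best if combined[best] > 0 else None

# Toy, already-segmented posts grouped by the region they were posted from.
posts_by_region = {
    "Hangzhou": [["west_lake", "walk", "weather"], ["west_lake", "longjing", "tea"]],
    "Beijing": [["forbidden_city", "weather", "traffic"], ["forbidden_city", "duck"]],
}

weights = local_word_weights(posts_by_region)
print(infer_location(["today", "west_lake", "weather"],
                     ["Hangzhou", "Hangzhou", "Beijing"], weights))
```

In this sketch a user whose posts contain no local words can still be placed through friends' locations, which mirrors the motivation the abstract gives for combining the two signals under sparse location information; the actual combination rule used in the thesis may differ.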
[Degree-granting institution]: Hangzhou Dianzi University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092;TP391.1

Article ID: 2505002


Article link: https://www.wllwen.com/guanlilunwen/ydhl/2505002.html

