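Research on an Automatic Tag Annotation System
Published: 2018-06-28 10:25
Topics: automatic annotation + query expansion; Source: Beijing University of Posts and Telecommunications, 2016 master's thesis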
[Abstract]: We live in the golden age of the Internet and the era of big data. Every day, hundreds of millions of pieces of information are produced on the Internet. As users, we face a bewildering flood of information, and finding what interests us and what we need is an increasingly pressing concern. At the same time, the many websites and information producers want their content to reach users. Take news reporting as an example: in today's era of information explosion, both the producers and the readers of news want it delivered efficiently. Tags were created to address problems of this kind. Given the sheer volume of news, however, traditional manual tagging with limited manpower is increasingly unable to keep up, so automatic tagging technology is attracting growing attention. This thesis surveys the tag annotation problem and designs and implements a prototype automatic tagging system. The system implements two automatic tagging algorithms, one based on query expansion and one based on collaborative filtering, and also integrates an automatic summarization module and a text classification module. Classification is additionally used to improve the efficiency of the original tagging algorithms. The system provides a web-based interactive interface: it accepts input in several ways, performs tagging in the background, and displays the results in the browser.
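The abstract names collaborative filtering as one basis for automatic tagging but does not describe the algorithm. As a rough illustration of the general idea only, the sketch below suggests tags for a new text by similarity-weighted voting over its most similar already-tagged documents. This is a generic neighbor-voting scheme, not the thesis's method; the function names, the toy corpus, the whitespace tokenizer, and the parameters `k` and `top_n` are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_tags(text, tagged_docs, k=2, top_n=3):
    """Suggest tags by similarity-weighted voting over the k nearest tagged documents.

    Hypothetical helper: whitespace tokenization is a simplification, and
    Chinese text would need a word segmenter instead.
    """
    query = Counter(text.lower().split())
    scored = [
        (cosine(query, Counter(doc.lower().split())), tags)
        for doc, tags in tagged_docs
    ]
    # Sort on the similarity only, so tag lists are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    votes = Counter()
    for similarity, tags in scored[:k]:
        if similarity <= 0:
            continue  # ignore neighbors that share no words with the query
        for tag in tags:
            votes[tag] += similarity
    return [tag for tag, _ in votes.most_common(top_n)]


# Toy corpus of (text, tags) pairs, invented for illustration.
CORPUS = [
    ("central bank raises interest rates", ["finance", "economy"]),
    ("team wins the football final", ["sports"]),
    ("stock market falls as rates rise", ["finance", "markets"]),
]

print(suggest_tags("bank rates and the stock market", CORPUS))
# → ['finance', 'markets', 'economy']
```

Weighting each vote by similarity lets a tag shared by several close neighbors outrank a tag carried by a single one, which is the intuition usually borrowed from collaborative filtering.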
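[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1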
Article ID: 2077677
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2077677.html