
Research on Web-Based Data Mining and Search Engine Technology for Carbon Industry Information

Published: 2018-05-02 21:16

  Topic: Web + data mining; Source: University of Electronic Science and Technology of China, master's thesis, 2013


[Abstract]: Search engines have changed the way people navigate the Web, letting them quickly find the resources they are looking for. As the Web has grown, however, the number of pages has increased at a striking rate, the quality of the available information has become uneven, and finding and judging information has become considerably harder. Large general-purpose search engines such as Baidu and Google process the results they return in various ways and try to serve all kinds of users, but meeting the search needs of every group remains difficult. Searches for information about a particular industry, in particular, often return unsatisfactory results. Searching for carbon industry information therefore calls for a specialized search engine that improves the search experience of users in that industry. Moreover, because individual interests differ, users searching for carbon industry information focus on different aspects of it, so a personalized search method is needed to optimize the search engine. The thesis first studies data mining technology: the meaning and functions of data mining, Web content mining, Web structure mining, and Web usage mining. Based on the personalization requirements a search engine should meet, it combines these three Web data mining methods to improve the user's search experience. It then studies an access-interest model for carbon industry users, covering the choice of how to obtain users' access information, data preparation and the identification of visiting users, and the construction of a user access-interest model through concept extraction and concept association on the retrieved page information. Next it studies search engine technology: query service is provided to users after a preprocessing stage consisting of page collection, word segmentation, elimination of duplicate pages, and assessment of page importance. Finally, a simple carbon industry information search engine is designed, which realizes, to a certain extent, the mining of carbon industry information and personalized service.
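
Two of the preprocessing steps named in the abstract, eliminating duplicate pages and assessing page importance, can be illustrated with a minimal sketch. The abstract does not say which techniques the thesis uses, so everything here is an assumption made for illustration: duplicates are detected by hashing whitespace-normalized, lowercased text (which catches only near-exact copies), and importance is estimated with a basic power-iteration PageRank. The function names, the sample URLs, and the damping value of 0.85 are all hypothetical.

```python
import hashlib
import re


def content_fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace and lowercasing (assumed normalization)."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def drop_duplicates(pages: dict[str, str]) -> dict[str, str]:
    """Keep the first URL seen for each distinct content fingerprint."""
    seen: set[str] = set()
    unique: dict[str, str] = {}
    for url, text in pages.items():
        fingerprint = content_fingerprint(text)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique[url] = text
    return unique


def pagerank(
    links: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Power-iteration PageRank over a small link graph (source -> list of targets)."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical sample data: the second page repeats the first with different spacing and case.
pages = {
    "a.example/graphite": "Graphite electrode prices",
    "b.example/copy": "graphite   electrode PRICES",
    "c.example/anode": "Prebaked anode production",
}
unique_pages = drop_duplicates(pages)

# Hypothetical link graph between three pages.
ranks = pagerank({"a": ["c"], "c": ["a"], "b": ["a"]})
```

Exact-hash deduplication is the simplest possible choice. A production crawler would more likely use a near-duplicate method such as shingling, and would combine link-based scores with content relevance when ranking results.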

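[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3; TP311.13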




Article ID: 1835531


Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1835531.html

