Keyword-Based Querying of the Deep Web
Published: 2018-08-06 17:41
[Abstract]: The deep Web holds a vast amount of information, but because that content is hidden, existing search engines have difficulty reaching it. No satisfactory method or model for capturing deep Web content has yet been developed, which greatly limits access to more, and more valuable, information; how to fully retrieve the information in the deep Web therefore remains a difficult problem. This thesis proposes a keyword-based method for querying deep Web databases that takes the characteristics of the deep Web into account while borrowing ideas from traditional search engines. After comparing several classification algorithms, the thesis adopts naive Bayes to classify the query keywords and identify the domain they belong to; the thesis reports that the algorithm suits the deep Web query model, is simple and clear, and is highly accurate. To associate keywords with attributes, the thesis introduces the concept of an ontology: using a concept hierarchy tree generated from the WordNet dictionary, it proposes an ontology-based method for computing semantic similarity. The method does not rely on subjective human judgment, is simple and intuitive, and is not very costly. Once the domain of the query keywords and the corresponding attributes in the relational table have been determined, an SQL query statement is generated and the resulting page information is returned to the user. The experimental section verifies and tests the feasibility and accuracy of the proposed method with examples; the method not only handles multi-domain deep Web database queries but can also be integrated with existing search engines to help users query quickly and effectively.
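The three steps named in the abstract (domain classification, keyword-to-attribute matching, SQL generation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the training samples, the toy concept hierarchy `PARENT` (a hand-written stand-in for a tree derived from WordNet), the `SCHEMA` table, and all function names are hypothetical. The similarity measure is Wu-Palmer, used here as a common hierarchy-based measure; the abstract does not state which formula the thesis uses.

```python
import math
from collections import Counter, defaultdict

# Hypothetical concept hierarchy (child -> parent); a stand-in for a WordNet-derived tree.
PARENT = {
    "entity": None,
    "person": "entity", "writer": "person", "author": "person",
    "name": "entity", "title": "name",
    "amount": "entity", "price": "amount",
}
# Hypothetical relational schema: domain (table) -> attributes (columns).
SCHEMA = {"books": ["author", "title", "price"]}


def train_nb(samples):
    """Collect per-domain document and term counts from (tokens, domain) pairs."""
    doc_counts, term_counts, vocab = Counter(), defaultdict(Counter), set()
    for tokens, domain in samples:
        doc_counts[domain] += 1
        term_counts[domain].update(tokens)
        vocab.update(tokens)
    return doc_counts, term_counts, vocab


def classify(keywords, model):
    """Multinomial naive Bayes with add-one smoothing; returns the best domain."""
    doc_counts, term_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_domain, best_score = None, -math.inf
    for domain in sorted(doc_counts):
        total_terms = sum(term_counts[domain].values())
        score = math.log(doc_counts[domain] / total_docs)
        for kw in keywords:
            score += math.log((term_counts[domain][kw] + 1) / (total_terms + len(vocab)))
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain


def path_to_root(concept):
    """Return the concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def wup_similarity(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)), root depth = 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lcs = next(c for c in path_a if c in ancestors_b)  # lowest common subsumer
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def best_attribute(keyword, attributes):
    """Pick the attribute most similar to the keyword (both must be in PARENT)."""
    return max(attributes, key=lambda attr: wup_similarity(keyword, attr))


def keyword_query(keywords, attribute_keyword, model):
    """Classify the domain, match the attribute, and build a parameterized query."""
    domain = classify(keywords, model)
    attribute = best_attribute(attribute_keyword, SCHEMA[domain])
    # Table and column names come only from SCHEMA; the value is bound via "?".
    return f"SELECT * FROM {domain} WHERE {attribute} = ?"


model = train_nb([
    (["novel", "author", "publisher"], "books"),
    (["flight", "hotel", "airline"], "travel"),
])
print(keyword_query(["novel", "writer"], "writer", model))
# prints: SELECT * FROM books WHERE author = ?
```

In this sketch the keyword "writer" never appears in the training data or the schema; it reaches the `author` column only through the hierarchy, which is the role the abstract assigns to the ontology. A real system would replace the toy tree with WordNet lookups and would need a fallback for keywords that have no entry in the hierarchy.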
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.1;TP311.13
Article ID: 2168489
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2168489.html