A Multi-Domain Deep Web Crawler Based on Optimal Queries

Published: 2018-03-17 17:36

Topic: deep; Focus: Web; Source: Application Research of Computers (《计算机应用研究》), 2009, Issue 09; Type: journal article

[Abstract]: Deep Web information is obtained by submitting query terms through the search interfaces of web pages. General-purpose search engines crawl pages by following hyperlinks and therefore cannot index deep Web data. To address this problem, this paper presents a deep Web crawler based on optimal queries: it generates optimal queries from clustered web pages, submits them automatically, and then indexes the query results. Experiments show that the system can crawl multi-domain deep Web data automatically and efficiently.
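The pipeline the abstract describes (generate queries from a cluster of pages, submit them, index the results) can be illustrated with a minimal sketch. The abstract does not define how an "optimal" query is chosen, so the greedy coverage heuristic below is an assumption made for illustration only, not the authors' method. The names `select_queries`, `crawl`, and the `search` callable (a stand-in for submitting a term to a site's search form) are likewise hypothetical.

```python
from collections import Counter


def select_queries(cluster_docs, max_queries=3):
    """Greedily pick terms that cover the most not-yet-covered documents.

    ASSUMPTION: 'optimal' is approximated here by set-cover-style coverage
    of one cluster of pages; the paper may define it differently.
    """
    doc_terms = [set(doc.lower().split()) for doc in cluster_docs]
    uncovered = set(range(len(doc_terms)))
    queries = []
    while uncovered and len(queries) < max_queries:
        # Count, for each term, how many uncovered documents contain it.
        counts = Counter()
        for i in uncovered:
            counts.update(doc_terms[i])
        if not counts:
            break  # remaining documents are empty, nothing left to pick
        # Highest coverage wins; ties are broken alphabetically.
        best = min(counts, key=lambda term: (-counts[term], term))
        queries.append(best)
        uncovered = {i for i in uncovered if best not in doc_terms[i]}
    return queries


def crawl(cluster_docs, search, max_queries=3):
    """Submit each selected query and index the returned records.

    `search` is a hypothetical callable: query term -> iterable of records.
    The index maps each record to the set of queries that retrieved it.
    """
    index = {}
    for query in select_queries(cluster_docs, max_queries):
        for record in search(query):
            index.setdefault(record, set()).add(query)
    return index
```

In a real crawler, `search` would fill in and submit the site's search form and parse the result page; passing it in as a callable keeps the query-selection logic testable without network access.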
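[Author affiliation]: College of Computer Science and Technology, Zhejiang University
[Funding]: Zhejiang Provincial Science and Technology Program (2007C23086)
[Classification number]: TP393.092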

Article ID: 1625775

Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1625775.html
