Rapid Search and Integration of Forestry Dynamic Information
[Abstract]: Forestry is a basic industry of China's national economy and carries an important mission in ecological construction and sustainable social development. People are the main beneficiaries of the forestry industry, and they play different roles within its industrial structure. In recent years, forestry informatization has promoted the sharing of forestry information resources, which has brought convenience to the public and supported the development of the forestry industry. However, forestry informatization still needs further development. Making better use of forestry information resources to serve researchers, teachers, and forestry workers in China's forestry science community requires the ability to search and collect forestry information quickly.
Quickly finding the information a user needs within a mass of data has become a major problem for the public when searching in a specific field. Forestry information on the Internet is increasingly complex and disorganized, and general-purpose search engines can no longer meet the public's need for personalized information. With a general search engine, users spend a great deal of time and effort locating what they need, and both recall and precision for topic-specific information are relatively low. The public therefore urgently needs a forestry topic search engine with accurate classification, comprehensive data, and timely updates.
The research in this thesis comes from a key project of the Hunan Science and Technology Program (2010 nk2004) led by the author's supervisor. Guided by the theories of systems science, forestry, informatics, and statistics, the thesis studies the search and integration of forestry dynamic information and reviews related research in China and abroad. The main work covers the demand analysis and classification of forestry dynamic information, the topic crawler, and the text classifier, and is summarized as follows.
(1) Existing search engine theory and practice in China and abroad are analyzed comprehensively. The analysis shows why a forestry topic search engine is important and necessary at present, and the key technologies are studied in depth. The forestry topic search engine is divided into three layers: a data collection layer, a data storage layer, and a data representation layer. The relevant methods at each of these three layers are discussed and summarized.
(2) Based on information published on web pages and on the demand of government departments and the public for forestry dynamic information, the types of forestry dynamic information that are genuinely meaningful to these users are defined. The required information is classified into seven groups to make the categories concrete, among them: forestry means of production, market supply and demand information for forest products, flower information, forestry policies and regulations, forestry labor information, and meteorological and environmental information.
(3) Following the established classification system, relevant professional forestry websites are collected and the source websites for information collection are identified. The domain names of the websites that provide the required data are gathered, the content under each domain is collected, and each collected website is labeled, so that the sources of forestry dynamic information are both collected and classified.
(4) A new search strategy based on content analysis and link structure analysis is used. Through combined analysis and evaluation, the topic relevance of the pages pointed to by candidate URLs is judged and the candidate URLs are ranked, yielding an optimized forestry topic crawler whose downloaded pages are related to forestry topics and are fetched in decreasing order of importance.
(5) SVM-based automatic text categorization is adopted: a classifier is trained on sample data, and the forestry dynamic information collected by the topic crawler is then classified and stored, which optimizes the data collection layer of the forestry topic search engine.
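The classification step in item (5) can be illustrated with a small sketch. It is not the thesis's implementation: it assumes bag-of-words features, a linear SVM without a bias term trained by a Pegasos-style subgradient method, and a one-vs-rest scheme for multiple categories. The category labels, sample texts, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts, scaled to unit L2 length."""
    counts = Counter(text.lower().split())
    norm = sum(v * v for v in counts.values()) ** 0.5
    return {t: v / norm for t, v in counts.items()} if norm else {}


def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(t, 0.0) * v for t, v in x.items())


def train_binary_svm(samples, labels, lam=0.01, epochs=20):
    """Linear SVM (no bias) via subgradient descent on the regularised hinge loss.

    `labels` holds +1 or -1. The step size 1 / (lam * step) follows the
    Pegasos scheme; this is an assumed solver, not the one used in the thesis.
    """
    w = {}
    step = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            step += 1
            eta = 1.0 / (lam * step)
            margin = y * dot(w, x)
            scale = 1.0 - 1.0 / step  # shrinkage from the L2 regulariser
            for t in w:
                w[t] *= scale
            if margin < 1:  # sample violates the margin: move towards it
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + eta * y * v
    return w


def train_one_vs_rest(docs, epochs=20):
    """Train one binary SVM per category from (text, label) pairs."""
    samples = [featurize(text) for text, _ in docs]
    classes = sorted({label for _, label in docs})
    return {
        c: train_binary_svm(
            samples, [1 if label == c else -1 for _, label in docs], epochs=epochs
        )
        for c in classes
    }


def classify(models, text):
    """Return the category whose SVM gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda c: dot(models[c], x))


# Hypothetical training samples for three illustrative categories.
TRAINING = [
    ("forestry policy regulation issued", "policy"),
    ("new law on forestry regulation", "policy"),
    ("timber price rises", "market"),
    ("seedling supply demand price", "market"),
    ("rainfall forecast tomorrow", "weather"),
    ("temperature humidity forecast", "weather"),
]

models = train_one_vs_rest(TRAINING)
label = classify(models, "timber price")
```

In practice the training set would be far larger and Chinese text would first need word segmentation; the sketch only shows how labeled samples turn into per-category decision functions that route crawled pages into storage.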
The search and integration of forestry dynamic information described here builds on the study and optimization of existing search and integration techniques and incorporates the public's demand for forestry dynamic information. The accuracy, comprehensiveness, and success rate of public access to forestry dynamic information are significantly improved. As new methods and technologies are further applied to the rapid search and integration of forestry dynamic information, forestry information management and services will also advance.
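The search strategy in item (4), which ranks candidate URLs by combining content analysis with link structure analysis, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the thesis's algorithm: the topic vocabulary, the weighting parameter `alpha`, the in-link counts, and the example URLs are all hypothetical, and the link score is a crude in-degree share standing in for a full link analysis such as PageRank.

```python
import heapq

# Hypothetical topic vocabulary; a real system would derive it from forestry corpora.
TOPIC_TERMS = {"forest", "forestry", "timber", "seedling", "afforestation"}


def content_score(text):
    """Fraction of tokens in `text` that belong to the topic vocabulary."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in TOPIC_TERMS)
    return hits / len(tokens)


def link_score(url, inlinks):
    """Share of all known in-links that point at `url`."""
    total = sum(inlinks.values())
    return inlinks.get(url, 0) / total if total else 0.0


def rank_candidates(candidates, inlinks, alpha=0.7):
    """Order candidate URLs by a weighted sum of content and link scores.

    `candidates` maps each URL to the text surrounding its link (anchor text
    or nearby context); `inlinks` maps URLs to in-link counts seen so far.
    Returns (url, score) pairs in decreasing order of score.
    """
    heap = []
    for url, context in candidates.items():
        score = alpha * content_score(context) + (1 - alpha) * link_score(url, inlinks)
        heapq.heappush(heap, (-score, url))  # negate: heapq is a min-heap
    ordered = []
    while heap:
        neg_score, url = heapq.heappop(heap)
        ordered.append((url, -neg_score))
    return ordered


# Hypothetical candidate URLs with their link context and in-link counts.
candidates = {
    "http://example.org/timber-prices": "timber market prices forestry",
    "http://example.org/sports": "football match results today",
    "http://example.org/seedlings": "seedling supply news",
}
inlinks = {
    "http://example.org/timber-prices": 3,
    "http://example.org/sports": 5,
    "http://example.org/seedlings": 2,
}
crawl_order = [url for url, _ in rank_candidates(candidates, inlinks)]
```

A crawler built on this idea would pop the highest-scoring URL, download the page, extract new links with their context, and push them back onto the queue, so that pages judged more relevant to forestry are fetched first.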
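[Degree-granting institution]: Central South University of Forestry and Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: S712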
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2189131.html