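Research on Deep-Learning-Based Methods for Agricultural Information Classification
Topic: agricultural information classification + deep learning; Source: Northwest A&F University, 2017 master's thesis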
[Abstract]: With strong national support for agriculture and the rapid development of Internet technology, the volume of agriculture-related information keeps growing, agricultural informatization is advancing quickly, and online agricultural information has reached massive scale. Searching this information quickly and locating items accurately has become increasingly difficult. Against this background, choosing a well-optimized classification method to support fast retrieval and accurate positioning of agricultural information is essential. This thesis studies agricultural information classification methods based on decision trees, Bayesian classifiers, and deep learning. It focuses on the network structure and training process of convolutional neural networks (CNNs), implements automatic classification of agricultural information, and improves the accuracy and efficiency of text classification so as to increase the usable value of the information. The main work is as follows:
(1) Data acquisition and preprocessing. A crawler collects the documents under the relevant columns of the China Agricultural Information Network (中国农业信息网) to form the agricultural information data set. The data set is then segmented with two word segmenters, Jieba and Pynlpir, and a stop-word list is used to remove symbols, numbers, and other tokens that do not represent text features. Commonly used feature-selection evaluation functions are then applied for feature selection. On this basis, the thesis demonstrates that using a CNN to extract features of agricultural information automatically is feasible.
(2) Two vectorized representations of agricultural information. In the first, the text is segmented, stop words are removed, feature words are extracted, and the document is represented as a text vector. In the second, the text is segmented, stop words are removed, and the tokens are represented directly as word vectors. Word vectors avoid the excessive dimensionality of the traditional vector representation, and the deep learning approach can extract the feature words of agricultural information automatically.
(3) Classification experiments. Using the vector files produced by preprocessing, agricultural information is classified with decision tree, Bayesian, and CNN models, and the results are analyzed theoretically. The thesis examines why the two-class and ten-class results differ: clustering is used to check how documents are distributed across the classes of the data set, the distribution is shown in a pie chart, and the difference is attributed to the imbalance in the number of documents per class. The experiments verify that a CNN is feasible for agricultural information classification, and a comparison with the other existing classifiers is used to analyze its advantages on this task.
(4) Optimization of the CNN structure. The thesis proposes optimizations to the CNN structure used for agricultural information classification and compares the experimental results theoretically. When every node in the network uses the Sigmoid activation function, classification performance drops markedly; when every node uses the ReLU activation function, classification performance improves significantly. In the experiment on the number of convolution kernels, doubling the number of kernels in the model brought the network to a final classification precision (精确率) of 99.40%.
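The preprocessing in step (1) can be sketched roughly as follows. This is a minimal illustration, not the thesis's code: it assumes the text has already been segmented into tokens (for example by Jieba or Pynlpir), the stop-word set is a tiny hypothetical stand-in for a real list, and the chi-square statistic is only one common feature-selection evaluation function, since the abstract does not say which functions were used.

```python
import re

# Hypothetical stop-word list; the thesis's actual list is not given.
STOP_WORDS = {"的", "了", "和", "在", "是"}

# Tokens made up only of digits, punctuation, or whitespace.
_NON_WORD = re.compile(r"^[\W\d_]+$")


def clean_tokens(tokens):
    """Drop stop words and tokens that consist only of digits or symbols."""
    return [t for t in tokens if t not in STOP_WORDS and not _NON_WORD.match(t)]


def chi_square(docs, labels, term, target):
    """Chi-square score of `term` for class `target` (2x2 contingency table).

    docs is a list of token lists; labels holds one class label per document.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in target class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in target class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Made-up example documents and class names, for illustration only.
docs = [["小麦", "价格"], ["小麦", "种植"], ["猪肉", "价格"], ["猪肉", "养殖"]]
labels = ["crop", "crop", "livestock", "livestock"]
print(clean_tokens(["小麦", "的", "价格", ",", "2017", "上涨"]))
print(chi_square(docs, labels, "小麦", "crop"))
print(chi_square(docs, labels, "价格", "crop"))
```

A term that appears only in one class ("小麦") scores high, while a term spread evenly across classes ("价格") scores zero, which is why such scores are used to rank candidate feature words.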
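The class-imbalance explanation in step (3) can be checked with a simple count. The thesis uses clustering and a pie chart for this; the sketch below swaps in plain label counting, which only works when labels are already known, and all function names and the example labels are hypothetical.

```python
from collections import Counter


def class_distribution(labels):
    """Return the share of documents in each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Made-up labels: 8 documents in one class, 2 in another.
example_labels = ["market"] * 8 + ["policy"] * 2
print(class_distribution(example_labels))
print(imbalance_ratio(example_labels))
```

A ratio far above 1 means a classifier can score well overall while doing poorly on the small classes, which is one way a ten-class result can diverge from a two-class result on the same data.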
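To make step (4) concrete, the sketch below shows a convolution over a sequence of word vectors followed by max-over-time pooling, with a pluggable activation function. It is an assumption-laden toy in pure Python, not the network from the thesis: the kernel weights, the embeddings, and the function names are all made up, and no training is performed.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv_max_pool(embeddings, kernels, activation=relu):
    """Apply each kernel along the token sequence, then max-pool over time.

    embeddings: list of token vectors (each a list of floats).
    kernels: list of kernels; each kernel is a list of `width` weight vectors.
    Returns one pooled feature per kernel.
    """
    features = []
    for kernel in kernels:
        width = len(kernel)
        responses = []
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            total = sum(
                w * x
                for w_vec, x_vec in zip(kernel, window)
                for w, x in zip(w_vec, x_vec)
            )
            responses.append(activation(total))
        # A sequence shorter than the kernel yields no response.
        features.append(max(responses, default=0.0))
    return features


# Three tokens with 2-dimensional word vectors, two kernels of width 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
print(conv_max_pool(embeddings, kernels))
print(conv_max_pool(embeddings, kernels, activation=sigmoid))
print(len(conv_max_pool(embeddings, kernels + kernels)))
```

Each kernel contributes one pooled feature, so doubling the number of kernels doubles the length of the feature vector passed to the classifier. The abstract does not explain why ReLU outperformed Sigmoid; a commonly cited reason is that the Sigmoid derivative never exceeds 0.25, so gradients shrink as they pass through successive layers.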
[Degree-granting institution]: Northwest A&F University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: S126
Article ID: 1826042
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1826042.html