Soil Map Polygon Disaggregation Based on the Random Forest Algorithm
[Abstract]: Traditional soil maps are drawn as polygons from lengthy field surveys and aerial photo interpretation, so producing them is slow and inefficient, and their precision struggles to meet the needs of modern science. They face three main problems. First, the mapping scale determines the size of the smallest polygon that can be drawn: the larger the scale, the smaller the smallest polygon the soil map can express. Because of this scale limit, traditional soil maps omit small patches that lie inside larger polygons, which simplifies both the spatial and the attribute information of the map. Second, hand-drawn polygons ignore the gradual spatial variation of soil: abrupt polygon boundaries impose abrupt changes on soil distributions and properties that in fact vary continuously. Third, mapping that relies on expert experience and manual drawing is time-consuming and prone to human error. Nevertheless, traditional soil maps embody a large amount of expert knowledge; they are valuable historical information and still have important reference value for current research. This study takes the water river basin of huayuhe Town, Hong'an County, Huanggang City, Hubei Province as the study area. It combines the traditional soil map from the Second National Soil Census with some terrain data and multispectral data, uses a GIS platform and the R language environment to mine soil-environment knowledge with a random forest model, and applies the model to disaggregate the original soil map spatially, producing a soil map with more detailed spatial information.
The research proceeds in five steps:
1) Extraction of environmental factors for the study area. The initial environmental variables include soil parent material data, topographic data, and multispectral data. The topographic variables are gradient, slope, topographic wetness index, curvature along the contour, and horizontal curvature. The variables extracted from the multispectral data are the normalized difference vegetation index (NDVI), the normalized difference water index (NDWI), the first principal component, deviation, information entropy, variance, and mean value. These are used together with the dependent variable of the study.
2) Sampling point design. Sample points are laid out with a scheme weighted by polygon area that guarantees at least 10 samples in every polygon; 6686 sample points were finally determined. The factor data and the sample data are then classified according to parent material.
3) Environmental factor screening. To ensure mapping precision and efficiency, factors that contribute little to the model are eliminated. This study uses the variable importance measure provided by the importance() function in R.
4) Determination of model parameters. The two most important parameters, mtry and ntree, are determined from the out-of-bag error of the random forest model and from the stability of the model, respectively.
5) Modeling and mapping. Using the randomForest package in R, the data are modeled and four groups of models are obtained, one for each of the four parent material units. The models are applied to the environmental factor information at every grid cell in the study area, the soil type at each location is determined by voting, and the soil map of the area is obtained.
The results show that the disaggregated soil map contains significantly more polygons than the traditional soil map and that its spatial distribution is finer and shows more detail.
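The area-weighted sampling design in step 2 can be illustrated with a small sketch. The abstract does not give the exact allocation rule, so the scheme below is an assumption: every polygon first receives the minimum of 10 samples, and the remaining samples are shared out in proportion to polygon area. The function name `allocate_samples`, the polygon identifiers, and the areas are hypothetical, and the sketch is written in Python rather than in the R environment used in the study.

```python
def allocate_samples(areas, total, minimum=10):
    """Split `total` samples among polygons by area, with a per-polygon floor.

    `areas` maps a polygon id to its area. This is an assumed scheme,
    not the allocation rule used in the thesis.
    """
    if total < minimum * len(areas):
        raise ValueError("total is too small to give every polygon the minimum")

    # Give every polygon its floor first, then share the rest by area.
    remaining = total - minimum * len(areas)
    total_area = sum(areas.values())
    shares = {pid: remaining * area / total_area for pid, area in areas.items()}
    counts = {pid: minimum + int(shares[pid]) for pid in areas}

    # Hand out samples lost to rounding down, largest fractional part first.
    leftover = total - sum(counts.values())
    by_fraction = sorted(
        areas, key=lambda pid: shares[pid] - int(shares[pid]), reverse=True
    )
    for pid in by_fraction[:leftover]:
        counts[pid] += 1
    return counts


# Hypothetical polygon areas (for example in hectares), not data from the study.
areas = {"polygon_1": 120.0, "polygon_2": 45.5, "polygon_3": 3.2}
print(allocate_samples(areas, total=200))
```

Assigning the floor before the proportional share keeps very small polygons represented, which matches the stated goal that each polygon has at least 10 samples.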
In this study, the RF model performs well on the classification problem, which indicates that the soil-environment relationships obtained with the RF model are credible and that the approach can provide an efficient method for fine-scale digital soil mapping. In addition, the variable importance measure provided by the random forest algorithm ranks the variables by importance, so that factors contributing little to the model can be removed. This not only maintains classification accuracy but also greatly improves computational efficiency. The approach provides a reliable method and basis for disaggregating soil maps over large areas in the future.
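The idea behind importance-based factor screening can be sketched with permutation importance: shuffle one factor at a time and measure how much classification accuracy drops. This is only an illustration of the principle. The study used the importance() function in R on random forest models, whereas the sketch below is in Python, uses a fixed stand-in classifier on the full data set instead of out-of-bag samples from each tree, and all names (`permutation_importance`, `screen_factors`, the factor names, the toy data, the threshold) are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class equals the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy when each factor column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    scores = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only factor j replaced by shuffled values.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        scores.append(sum(drops) / n_repeats)
    return scores


def screen_factors(names, scores, threshold):
    """Keep only the factors whose importance exceeds the threshold."""
    return [name for name, score in zip(names, scores) if score > threshold]


# Toy data: the class depends on the first factor only (hypothetical values).
rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
labels = ["type_A" if row[0] < 10 else "type_B" for row in rows]


def stand_in_model(row):
    """Stand-in for a trained classifier; it ignores the second factor."""
    return "type_A" if row[0] < 10 else "type_B"


scores = permutation_importance(stand_in_model, rows, labels)
print(screen_factors(["slope", "unrelated_factor"], scores, threshold=0.0))
```

A factor that the model never uses scores exactly zero here, because shuffling it cannot change any prediction; such factors are the ones a screening step would drop.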
[Degree-granting institution]: Huazhong Agricultural University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: S159.9