Inspection and Analysis of the Road Freight Transport Market Using Web Crawling Technology
Published: 2021-05-21 23:40
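With China's rapid economic development, the freight market continues to grow, and freight volumes by road, rail, and waterway keep increasing. Freight volume information is nevertheless hard to monitor: it comes from many sources, and the scattered information has to be aggregated, filtered, and stored in a database. This thesis therefore uses web crawler technology to collect and aggregate volume data for the road and waterway freight markets, repairs missing data with models such as the quadratic curve, and forecasts future volume trends from historical data with an exponential smoothing model. The thesis first uses the Octoparse (八爪鱼) web crawler tool to obtain freight volume information from different websites, broken down by week, month, quarter, year, transport mode, region, and freight volume. Taking the region as the unit, it aggregates the volume data for each transport mode over different time periods and identifies all missing values in the data. It then takes the freight volume information for North China, Shandong, and Central China (Huazhong) as examples for data repair. For each missing value, the other data from the same year are used to fit a quadratic curve model, an exponential model, a logarithmic model, and others; the best-fitting model is selected according to each model's R-squared value, its parameters are determined, and the missing value is calculated, until all missing values have been repaired. Finally, data for the first 22 weeks of 2019 are forecast on the basis of the 2016, 2017, and 2018 data. Exponential smoothing is used to build future freight volume forecasting models for the three regions above; given the pronounced seasonality of the data, triple exponential smoothing (Holt-Winters exponential smoothing) is used to compare...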
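The repair step described in the abstract (fit several candidate curves to the known points of a year, keep the one with the highest R-squared, and evaluate it at the missing week) can be sketched as follows. This is a minimal illustration, not the thesis's SPSS procedure: the function names, the 1-based week index, the log-linear fit for the exponential model, and the choice to compute R-squared on the original scale are all assumptions, and the sample series is made up rather than taken from the thesis.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(basis, ts, ys):
    """Fit ys ~ sum_k coef[k] * basis[k](t) through the normal equations."""
    rows = [[f(t) for f in basis] for t in ts]
    k = len(basis)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(ata, aty)


def r_squared(ys, preds):
    """Coefficient of determination on the original (untransformed) scale."""
    mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_models(ts, ys):
    """Return {model name: (predict function, R-squared)} for three candidate curves."""
    one = lambda t: 1.0
    # Quadratic: y = a + b t + c t^2
    a, b, c = least_squares([one, lambda t: t, lambda t: t * t], ts, ys)
    # Exponential: y = exp(la + eb t), fitted as a line to ln(y); needs y > 0
    la, eb = least_squares([one, lambda t: t], ts, [math.log(y) for y in ys])
    # Logarithmic: y = ga + gb ln(t); needs t > 0
    ga, gb = least_squares([one, lambda t: math.log(t)], ts, ys)
    models = {
        "quadratic": lambda t: a + b * t + c * t * t,
        "exponential": lambda t: math.exp(la + eb * t),
        "logarithmic": lambda t: ga + gb * math.log(t),
    }
    return {name: (f, r_squared(ys, [f(t) for t in ts])) for name, f in models.items()}


def repair_missing(series):
    """Fill None entries using the candidate curve with the highest R-squared.

    The position in the list (1-based) is used as the time index.
    """
    known = [(i + 1, y) for i, y in enumerate(series) if y is not None]
    ts = [t for t, _ in known]
    ys = [y for _, y in known]
    fitted = fit_models(ts, ys)
    best = max(fitted, key=lambda name: fitted[name][1])
    predict = fitted[best][0]
    repaired = [predict(i + 1) if y is None else y for i, y in enumerate(series)]
    return best, repaired


if __name__ == "__main__":
    # Made-up weekly volumes with one missing week (None); not thesis data.
    weekly = [55.0, 64.0, 77.0, 94.0, None, 140.0, 169.0, 202.0]
    best_model, filled = repair_missing(weekly)
    print(best_model, filled)
```

A statistics package may report R-squared for the exponential model on the log-transformed scale instead, which can change which model is ranked first; the sketch keeps all three models on the original scale so that the values are directly comparable.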
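[Source]: Beijing Jiaotong University, Beijing (Project 211 university; directly administered by the Ministry of Education)
[Pages]: 81
[Degree level]: Master's
[Table of contents]: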
Acknowledgement
Chinese Abstract
ABSTRACT
1 INTRODUCTION
1.1 MOTIVATION
1.2 OBJECTIVES
1.3 SCOPE OF RESEARCH
1.4 RESEARCH SIGNIFICANCE
1.5 THESIS OUTLINE
2 LITERATURE REVIEW
2.1 TIME SERIES STUDIES
2.2 SIMPLE EXPONENTIAL SMOOTHING MODEL
2.2.1 Mathematical Formulation
2.3 Measuring Forecast Error
2.3.1 Choosing the Best Value for Smoothing Constant
2.4 WEB CRAWLER
2.5 REVIEW OF THE DATA SCRAPING TOOLS THAT NEITHER REQUIRE PROGRAMMING NOR CODING
2.5.1 Octoparse
2.5.2 Outwit hub
2.5.3 Visual scraper
2.5.4 Helium scraper
2.6 WHAT IS SQL
3 RESEARCH METHODOLOGY
3.1 OCTOPARSE OVERVIEW
3.1.1 Installation
3.1.2 Features
3.1.3 Setting up basic information
3.1.4 Workflow Design
3.1.5 Extraction Options
3.2 IMPORT OF DATA EXTRACTED TO DATABASE
3.2.1 Data and Database
3.2.2 Creating table
3.3 ACCESSING DATA
3.4 MODEL FORMULATION
3.4.1 Problem Definition
3.4.2 Data repair
3.4.3 The mean of adjacent points
3.5 QUADRATIC CURVE MODEL
3.6 APPLICATION OF SPSS TO CALCULATE
3.7 EXPONENTIAL SMOOTHING MODEL
3.7.1 Parameters for the model:
4 DATA INSPECTION AND ANALYSIS
4.1 DATA SOURCES
4.2 TRAFFIC FREIGHT VOLUME
4.3 SELECTION OF MODEL FOR NORTH CHINA, SHANDONG PENINSULA AND HUAZHONG
4.4 QUADRATIC CURVE MODEL APPLICATION
4.5 ANALYSIS OF THE QUADRATIC CURVE MODEL
4.6 REPLACE MISSING DATA BEFORE FORECAST
4.7 DEFINE DATE FROM DATA
4.8 PREDICTED CRITERIA FOR CUBIC EXPONENTIAL SMOOTHING
4.9 THE AVERAGE ABSOLUTE ERROR (MAE) AND ROOT MEAN SQUARE ERROR (RMSE)
4.10 COMPARISON OF THE PERFORMANCE MEASURES
4.11 SUMMARY
5 CONCLUSIONS
REFERENCES
APPENDIX
AUTHOR PROFILE AND RESEARCH ACHIEVEMENTS OBTAINED DURING THE STUDY FOR A MASTER'S / DOCTORAL DEGREE: I PUBLICATIONS
DATASET FOR THE MASTER'S THESIS
Article ID: 3200573
Article link: https://www.wllwen.com/weiguanjingjilunwen/3200573.html