Performance Evaluation of Machine Learning and Statistical Models for Predicting Road Accident Severity

Published: 2024-05-16 21:56
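Analyzing road traffic accident data is important for traffic safety because it can reveal how different types of contributing factors influence accidents. The predictive accuracy of road accident risk models needs continual improvement. Data mining methods can be applied to road traffic accident data: statistical models such as Ordered Probit (OP) and Multinomial Logistic (MNL), and machine learning models such as CART, SVM, KNN, GNB and RF, can all be used to analyze such datasets, which offers an opportunity to look for more accurate models. This thesis compares the accuracy of machine learning and statistical models, built on different modeling logics, in predicting road accident injury severity. Using road accident data collected by different district councils in Hong Kong, the models are applied to predict the injury severity corresponding to each accident severity level. The prediction accuracy of each model on the test dataset is computed and compared, and a sensitivity analysis is then carried out to infer how important each explanatory variable is in determining accident severity. The variable effects estimated by the OP and MNL statistical models are also compared. The sensitivity analysis gives the magnitude of the variables' influence on crash severity as estimated by each of the five selected machine learning models. The results show that, although the machine learning models suffer from overfitting, they achieve higher prediction accuracy than the statistical models. The fatal-accident classification accuracies of RF, GNB, KNN, SVM and CART are 82.77%, 55.53%, 82.82%, 77.93...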
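The comparison described in the abstract rests on two numbers per model: overall accuracy on the test set and accuracy for a single severity class such as fatal accidents. The following is a minimal sketch of that bookkeeping, not the thesis's own code. It assumes that "classification accuracy" for a class means the share of true cases of that class that the model predicts correctly (the thesis may define it differently). The severity labels `fatal`, `serious` and `slight`, the model names `model_a` and `model_b`, and all label data are made up for illustration and are not results from the thesis.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, {label: share of that label's cases predicted correctly})."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    totals = Counter(y_true)  # number of test cases per true severity level
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per level
    overall = sum(hits.values()) / len(y_true)
    by_class = {label: hits[label] / n for label, n in totals.items()}
    return overall, by_class


# Hypothetical test-set labels and predictions (illustrative only).
y_true = ["fatal", "fatal", "serious", "slight", "slight", "slight", "serious", "fatal"]
predictions = {
    "model_a": ["fatal", "serious", "serious", "slight", "slight", "slight", "serious", "fatal"],
    "model_b": ["slight", "slight", "serious", "slight", "slight", "slight", "slight", "fatal"],
}

results = {name: per_class_accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
for name, (overall, by_class) in results.items():
    print(f"{name}: overall={overall:.2%}, fatal={by_class['fatal']:.2%}")
```

Reporting the fatal class separately matters because severe outcomes are usually rare in accident data, so a model can score well overall while misclassifying most fatal cases.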

[Pages]: 85

[Degree level]: Master's

[Table of contents]:
Abstract (Chinese)
Abstract
Chapter 1 Introduction
    1.1 Background
    1.2 Statistics Summary of Crash Contributing Factors
    1.3 Problem Statement and Intention of This Thesis
    1.4 Research Aim and Objectives
    1.5 Outline of the Thesis
Chapter 2 Literature Survey
    2.1 Introduction
    2.2 Status of Road Safety
        2.2.1 Status of Road Safety around the World
    2.4 Literature Survey of Statistical Models for Crash Injury Severity
    2.5 Literature Survey of Machine Learning Models for Crash Injury Severity
    2.6 Summarization and Limitations
Chapter 3 Methodology for Crash Modeling
    3.1 Introduction
    3.2 The Design of Methodology
    3.3 Statistical Models
        3.3.1 Ordered Probit Regression Model
        3.3.2 Checking for Multi-Collinearity
        3.3.3 Multinomial Logistic Model (MNLM) Design
    3.4 Machine Learning Models
        3.4.1 Classification and Adaptive Regression Trees (CART)
            3.4.1.1 Data Set Portioning
            3.4.1.2 Choose Cost Function and Training Model
            3.4.1.3 Decision Tree Algorithm Advantages and Disadvantages
        3.4.2 Support Vector Machine
        3.4.3 Naive Bayes Classifier
            3.4.3.1 What is Bayes Theorem?
            3.4.3.2 Types of Naive Bayes Algorithm
            3.4.3.3 Representation Used By Naive Bayes Models
            3.4.3.4 Make Predictions with a Naive Bayes Model
            3.4.3.5 Naïve Bayes Algorithm Advantages and Disadvantages
        3.4.4 K-Nearest Neighbors - Classification
            3.4.4.1 Algorithm
            3.4.4.2 K-NN Algorithm Advantages and Disadvantages
        3.4.5 Random Forest
            3.4.5.1 How Does the Random Forests Algorithm Work?
            3.4.5.2 Feature Importance
            3.4.5.3 Random Forest Algorithm Advantages and Disadvantages
Chapter 4 Crash Data Collection and Data Description
    4.1 Introduction
        4.1.1 Hong Kong Transportation Department Accident Data
        4.1.2 Variables Considered In the Study
    4.2 Data Preparation
        4.2.1 Based on Accident
        4.2.2 Based on Vehicle
        4.2.3 Based on Casualty
    4.3 Data Pre-Processing
        4.3.1 Missing Data Treatment
        4.3.2 Data Normalization
    4.4 Estimation of Accuracy in Classification
    4.5 Models Selection by Performance Evaluation
Chapter 5 Data Analysis and Modeling Results
    5.1 Statistical Models Results
    5.2 Machine Learning Models Analysis Results
    5.3 Experiments and Results
        5.3.1 CART Experimental Results
        5.3.2 Support Vector Machine Results
        5.3.3 K-Nearest Neighbor Results
        5.3.4 Gaussian Naïve Bayes Results
        5.3.5 Random Forest Results
    5.4 Results Comparison of Machine Learning Models
    5.5 Summary
Chapter 6 Discussion of Findings
    6.1 Sensitivity Analysis
    6.2 Comparison of Variable Impact on Crash Severity from ML Models
    6.3 Summary
Chapter 7 Conclusion and Recommendations
    7.1 Conclusion
    7.2 Recommendations
References
Acknowledgements
List of Figures
List of Tables
List of Acronyms



Article ID: 3974965


Article link: https://www.wllwen.com/kejilunwen/daoluqiaoliang/3974965.html

