Machine Learning Based Software Fault Prediction
Published: 2021-12-25 05:43
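Resources for software testing are usually limited, yet testing software modules is often time-consuming and expensive. Moreover, because testing during development is frequently insufficient, traditional testing alone cannot guarantee software quality. For this reason, automated software fault prediction, which dates back to the early stages of the software industry, remains in use today. Software fault prediction is now used mainly to set and optimize testing priorities, so that limited testing resources are fully used and software quality improves as much as possible. Machine learning methods have been widely applied to this end. However, accurate fault prediction with machine learning places high demands on data quality, and real-world datasets are, unfortunately, often of poor quality. In software fault prediction, labeled instances are used to build a model that predicts the class of new, previously unseen instances. If the dataset used to train the model is contaminated, both the training phase and the resulting model are adversely affected; one expected consequence is that the resulting model's accuracy is bound to be low. An effective strategy for improving dataset quality is therefore to clean datasets that contain defective data, mainly by detecting possible problems in the dataset and eliminating them. Our review of the software fault prediction literature shows that classification plays an irreplaceable role in most settings in this field. In some special settings, auxiliary strategies are also indispensable for addressing data quality challenges. Some...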
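【Source】: Southwest Jiaotong University, Sichuan Province; Project 211 university; directly under the Ministry of Education
【Pages】: 126
【Degree level】: Doctoral
【Table of contents】:
Abstract (in Chinese)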
Abstract
List of Abbreviations
List of Symbols
1 Introduction
1.1 Background
1.2 Data Quality and Software Fault Prediction
1.3 Motivation
1.4 Dissertation Objectives
1.5 Research Significance
1.6 Dissertation Outline
2 Related Work
2.1 Software Testing
2.2 Software Testing Goal
2.3 Software Fault Prediction
2.3.1 Common Software Fault Prediction Process
2.3.2 Machine Learning Application in Software Fault Prediction
2.3.3 Software Metrics
2.4 Data Quality Challenges
2.4.1 High Dimensionality
2.4.2 Class Imbalance Problem
2.4.3 Noise Filtering
2.4.4 Instance Selection
2.4.5 Outlier Analysis
2.5 Model Validation Techniques
2.6 Performance Evaluation Metrics
2.7 Summary
3 A Combined-Learning Based Framework for Improved Software Fault Prediction
3.1 Overview
3.2 Hypothesis
3.3 Combined-Learning Based Framework
3.3.1 Software Metrics
3.3.2 Feature Selection Techniques
3.3.3 Data Balancing
3.4 Experimental Design
3.5 Analysis and Discussions
3.5.1 Classification Performance on mc1 SCM
3.5.2 Classification Performance on jm1 SCM
3.5.3 Classification Performance on camel-1.6 OOM
3.5.4 Classification Performance on prop-4 OOM
3.5.5 Classification Performance on ComML and ComLC Metrics
3.5.6 Comparison: SCM and OOM
3.6 Summary
4 A Three-Stage Based Ensemble Learning for Improved Software Fault Prediction
4.1 Overview
4.2 Three-Stage Based Ensemble Learning Framework
4.2.1 Stage One: Information Gain Based Feature Filtering
4.2.2 Stage Two: Synthetic Faulty Prone Over-sampling Based Data Sampling
4.2.3 Stage Three: Fusion of Classifiers Strategy Based Noise Filtering
4.3 Experimental Design
4.4 Analysis and Discussions
4.4.1 Performance in Stage One
4.4.2 Performance in Stage Two
4.4.3 Performance in Stage Three
4.4.4 Multiple Comparison of Three-Stages Using Different Performance Metrics
4.5 Summary
5 Software Fault Prediction Using Hybrid Data Reduction Approaches
5.1 Overview
5.2 Hybrid Data Reduction Based Framework
5.2.1 Instance Selection
5.2.2 Outlier Analysis
5.3 Experimental Design
5.4 Analysis and Discussions
5.4.1 Performance of Single Data Reduction Approach
5.4.2 Performance of Two-Hybridized Data Reduction Approaches
5.4.3 Performance of Three-Hybridized Data Reduction Approaches
5.4.4 Multiple Comparison and Statistical Test of Eleven Data Reduction Approaches
5.5 Summary
6 Conclusions and Future Works
6.1 Conclusions
6.2 Future Works
Acknowledgements
References
List of Publications
Research Fundings
Article ID: 3551891
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/3551891.html