Research on Learner Response Matrix Completion Methods Based on Ensemble Learning

Published: 2018-05-25 10:14

Topics: educational data mining + intelligent learning; Source: Central China Normal University, master's thesis, 2017

[Abstract]: Educational data mining is an interdisciplinary research topic spanning computer science, education, and psychology. By analyzing the feedback data that learners produce in an intelligent learning system, it reveals how well learners have mastered the material and which knowledge points each piece of learning content covers. Building on the national "Internet Plus" and education big data initiatives, educational data mining is expected to play an increasingly important role in educational informatization and to help online education teach each student according to his or her aptitude. In practice, however, learner feedback data are often incomplete. This thesis studies the problem of completing the learner response matrix, a problem with both theoretical significance and practical application prospects. On the one hand, the student response matrix is naturally low-rank, so studying it helps deepen the understanding of low-rank matrix recovery theory. On the other hand, response matrix completion is of direct practical value for personalized teaching. The work of this thesis falls into two parts. The first part improves classical matrix completion methods with ensemble learning, constructing two new matrix completion algorithms on the basis of Bagging and AdaBoost, namely BaggingMC and AdaBoostMC. The second part combines the strengths and weaknesses of the two and proposes the Improved AdaBoostMC algorithm, which addresses two problems: the completion error of BaggingMC remains relatively high because it relies on simple voting, and the random selection of the threshold in AdaBoostMC wastes resources. Experiments were carried out on both simulated and real data, and the accuracy and completion quality of the algorithms were assessed by analyzing the distribution of the completion error on the different data sets. The experimental results show that the completion error of BaggingMC is very close to that of three classical matrix completion algorithms, that the error of AdaBoostMC is comparatively smaller, and that Improved AdaBoostMC achieves the smallest error under the same data set and the same sampling rate. In addition, the binary Lena image shows intuitively that, within the same data set, the higher the sampling rate, the better the image recovered by matrix completion.
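The bagging-with-voting idea behind BaggingMC can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the classical completion algorithms it builds on, so the base completer here (`svd_impute`, an iterative truncated-SVD imputation) is an assumption, and the function names, the rank, the number of rounds, and the 0.5 binarization threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def svd_impute(observed, mask, rank=2, n_iter=50):
    """Assumed base completer: iterative rank-`rank` SVD imputation."""
    # Start by filling unknown entries with the mean of the known ones.
    fill = observed[mask].mean() if mask.any() else 0.5
    x = np.where(mask, observed, fill).astype(float)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(x, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep known entries fixed, refresh only the unknown ones.
        x = np.where(mask, observed, low_rank)
    return low_rank


def bagging_mc(observed, mask, n_rounds=15, rank=2, seed=0):
    """Bagging-style completion of a 0/1 response matrix by majority vote."""
    rng = np.random.default_rng(seed)
    known = np.argwhere(mask)          # (row, col) of every observed entry
    votes = np.zeros(observed.shape)   # per-entry count of "1" predictions
    for _ in range(n_rounds):
        # Bootstrap: draw observed entries with replacement.
        picks = known[rng.integers(0, len(known), size=len(known))]
        sub_mask = np.zeros_like(mask)
        sub_mask[picks[:, 0], picks[:, 1]] = True
        estimate = svd_impute(observed, sub_mask, rank=rank)
        votes += estimate >= 0.5       # binarize each round's estimate
    completed = (2 * votes > n_rounds).astype(int)  # simple majority vote
    completed[mask] = observed[mask]   # never overwrite real responses
    return completed


# Synthetic example: a student answers correctly when ability >= difficulty,
# which yields a low-rank 0/1 response matrix.
rng = np.random.default_rng(1)
ability = rng.integers(0, 2, size=40)
difficulty = rng.integers(0, 2, size=30)
truth = (ability[:, None] >= difficulty[None, :]).astype(int)
mask = rng.random(truth.shape) < 0.7   # roughly 70% of responses observed
completed = bagging_mc(np.where(mask, truth, 0), mask)
print("accuracy on missing entries:", (completed[~mask] == truth[~mask]).mean())
```

Each round sees a different bootstrap sample of the observed responses, and every round's vote counts equally. That equal weighting is the "simple voting" the abstract identifies as the reason BaggingMC's error stays relatively high. A boosting-style variant would instead weight rounds or entries by their error.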
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: G434;TP311.13

[Similar literature]