Research and Implementation of an Intelligent Marking Algorithm Supporting Natural Language

Published: 2018-05-25 15:16

Topic: intelligent essay evaluation + Adaboost/CT; source: University of Jinan, 2017 master's thesis

[Abstract]: English teaching is arguably both one of the most important and one of the most difficult areas of teaching in China. Its importance in major examinations such as the college entrance examination, the postgraduate and doctoral entrance examinations, and tests for studying abroad needs no elaboration; its difficulty lies in the serious shortage of a language environment and of qualified teachers. Take English composition as an example: for students to improve substantially, their essays need to be marked effectively by a teacher. This demand requires English teachers not only to spend a great deal of time marking essays but also to have a high level of English writing themselves. At present, English teachers in China are mostly local teachers whose native language is not English, and in terms of both proficiency and available time they cannot cope with this work. In actual writing instruction the usual situation is that students are many and teachers are few, marking is inefficient, the cycle from examination to marking to feedback is too long, students get too few opportunities to practise writing, and delayed feedback means that weaknesses are not corrected in time. In addition, the scores given by human markers are easily affected by subjective factors such as mood, and a marker may even generalise from a single aspect of an essay and give an extreme score. With the rapid development of computational intelligence, "machine marking" has entered public view in recent years. It overcomes the low efficiency of manual marking, removes the influence of the marker's emotions, ensures accurate and efficient marking as well as objective and consistent evaluation, and frees teachers from heavy labor so that they can do more meaningful work. Research on automated writing evaluation began earlier abroad, and fairly mature systems already exist that have shown good reliability in practice. These systems, however, were designed to evaluate writing by native speakers of English. In English examinations in China neither the examinees nor the markers are native speakers, and the requirements for writing ability differ from those for native speakers, so applying these systems directly would inevitably be "incompatible" with human marking. In response to the needs of Chinese students and their cultural context, this work therefore improves the intelligent essay evaluation algorithm in its selection of indicators and in its feedback.

This study takes intelligent scoring of English compositions as its setting. It collects sample papers from real examinations, studies the marking indicators that may affect the result, identifies the indicators that have a significant influence on the essay score, and extracts general regularities of the examination; on this basis it builds a hierarchical marking model. An indicator system is constructed by taking into account the scoring criteria for the CET-4 (College English Test Band 4) essay and by drawing on indicators summarised by earlier researchers. Because the basic elements of an essay are words, sentences, and discourse structure, the indicators cover these three aspects. Principal component analysis of the indicators against the essay score is used to find the indicators with a significant effect; these indicators are then analysed, an optimal value is obtained from a linear fit between the indicators and the essay score, and a hierarchical evaluation model is built. The essay score is not the final goal, however: the proposed system provides not only a score but also natural-language feedback on the analysis and individual study suggestions, so that assessment promotes learning. The system uses three tools to compute the potentially influential indicators and was tested on 312 essays on three topics written by students from different majors and classes. In the experiments, exact agreement between intelligent and human marking was 79.66%, adjacent accuracy was 94%, the maximum error rate was below 20% in every case, and the intelligent scoring system produced no outlier (singular-value type) errors. The results show that the improved Adaboost/CT algorithm is well suited to intelligent essay scoring.
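The abstract reports two agreement figures between machine and human marking: exact accuracy and adjacent accuracy. It does not define them, so the following is only a minimal sketch under a common assumption: exact accuracy is the share of essays given identical scores, and adjacent accuracy is the share whose scores differ by at most one point. The function name `agreement_rates`, the `tolerance` parameter, and the sample scores are hypothetical and are not taken from the thesis.

```python
def agreement_rates(human, machine, tolerance=1):
    """Return (exact, adjacent) agreement between two equal-length score lists.

    Assumed definitions (not confirmed by the thesis):
      exact    = fraction of essays with identical scores
      adjacent = fraction of essays whose scores differ by <= tolerance
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(human)
    pairs = list(zip(human, machine))
    exact = sum(h == m for h, m in pairs) / n
    adjacent = sum(abs(h - m) <= tolerance for h, m in pairs) / n
    return exact, adjacent


# Hypothetical scores for illustration only; not data from the thesis.
human_scores = [8, 11, 5, 14, 9, 7, 12, 10]
machine_scores = [8, 10, 5, 14, 9, 9, 12, 10]

exact, adjacent = agreement_rates(human_scores, machine_scores)
print(f"exact: {exact:.2%}, adjacent: {adjacent:.2%}")
# prints "exact: 75.00%, adjacent: 87.50%"
```

Under these assumed definitions adjacent accuracy can never be lower than exact accuracy, which is consistent with the reported 94% versus 79.66%.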
[Degree-granting institution]: University of Jinan
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: H319; TP391.1
