An Empirical Study of Rater Effects in Chinese Pragmatic Competence Testing

Published: 2018-01-08 14:01

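Keywords: An Empirical Study of Rater Effects in Chinese Pragmatic Competence Testing　Source: Guangdong University of Foreign Studies, 2014 master's thesis　Document type: Degree thesis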

[Abstract]: This study controls for rater variables to examine rater effects in a test of Chinese pragmatic competence, namely overall differences in severity, interactions with the rating criteria, central tendency, and interactions with test items, in order to explore how rating behavior differs among raters with different backgrounds. Six raters and 30 test takers took part. Of the six raters, three came from northern China and three from southern China; three were male and three were female. The results show that the internal consistency reliability of all six raters fell within the acceptable range, but that the raters differed considerably in severity, spanning 18 severity levels. Because social variables were involved, the raters showed numerous and complex rater-by-item interactions. There were also some differences between rater groups. Northern raters were more severe than southern raters, and two of the northern male raters assessed candidates' cross-cultural awareness. Toward strangers, northern raters were more open and warm than southern raters. Male raters showed a stronger sense of hierarchy, status, and power than female raters. The study concludes that, for tests of Chinese pragmatic competence, raters differ considerably, mainly with respect to social variables, so detailed and targeted rater training is very important for improving the reliability and validity of such tests. The study also demonstrates the feasibility of combining quantitative and qualitative analysis to investigate rater effects: using the results of the many-facet Rasch model as the basis for a qualitative analysis of the factors behind rater effects is of practical value for improving rater training.
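【Degree-granting institution】: Guangdong University of Foreign Studies
【Degree level】: Master's
【Year degree conferred】: 2014
【Classification number】: H136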
