The Development and Validation of an Analytic Rating Scale for the College English Test Band 4 (CET-4) Writing Test
Published: 2024-11-20 21:33
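In recent years, test users have paid increasing attention to the interpretation and meaning of test scores. How to provide more reasonable and clearer score interpretations, and thereby promote appropriate score use, has therefore become an important issue for language test developers (Chapelle, Enright & Jamieson, 2008). Against this background, the validity of writing tests in large-scale foreign language examinations has drawn growing attention from researchers, because the language ability that a writing test measures is often not clearly defined, which greatly affects how test users understand writing test scores. For this reason, the rating scales used in writing tests have become a research focus in the field of language testing. Researchers agree that a rating scale embodies the language ability that a writing test actually measures (McNamara, 1996; McNamara, 2002; Turner, 2000; Weigle, 2002). However, existing research has found that the rating scales used in the writing tests of large-scale examinations commonly have problems (Brindley, 1998; Knoch, 2009; Todd, Thienpermpool & Keyuravong, 2004; Upshur & Turner, 1995). At present, empirical studies that focus specifically on the writing rating scales used in large-scale examinations are rare. Considering the wide use of writing tests in large-scale foreign language examinations in China and abroad (such as TOEFL, IELTS, CET, and TEM)...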
[Pages]: 229
[Degree level]: Doctoral
[Table of contents]:
Acknowledgements
Abstract
Abstract (in Chinese)
List of Abbreviations
CHAPTER 1 INTRODUCTION
1.1 Context of the Study
1.2 Research Background
1.2.1 CET and CET Writing
1.2.2 The scoring of the CET writing
1.2.3 Concerns over the scoring of the CET writing
1.3 Aim and Significance of This Study
1.4 Research Questions
1.5 Definition of Key Concepts
1.5.1 Rating scale
1.5.2 Rating criteria and rating category
1.5.3 Writing proficiency
1.6 Organization of the Dissertation
CHAPTER 2 LITERATURE REVIEW: RATING SCALES AS A RESEARCH DOMAIN
2.1 The Role of Rating Scales in Writing Assessment
2.1.1 Validity of writing assessment
2.1.2 The role of rating scales
2.2 Fundamental Considerations in Designing a Rating Scale
2.2.1 Intended use of rating scales
2.2.2 Types of rating scale
2.2.2.1 Holistic scale
2.2.2.2 Analytic scale
2.2.2.3 Holistic scale or analytic scale: theoretical arguments
2.2.3 Basis of rating scales
2.2.3.1 Behaviorally based rating scale
2.2.3.2 Theoretically-based rating scale
2.2.4 Approaches to rating scale design
2.2.4.1 Intuitive approach
2.2.4.2 Empirical approach
2.2.5 Number of scale levels and formulation of level descriptors
2.2.6 Validation of rating scales
2.3 Summary
CHAPTER 3 CONCEPTUALISING AND MEASURING THE EFL WRITING CONSTRUCT WITH RATING SCALES: TOWARD A TENTATIVE DESCRIPTIVE FRAMEWORK
3.1 Theoretical Construct of the EFL Writing Assessment
3.1.1 Theories of communicative competence
3.1.1.1 Communicative competence
3.1.1.2 Communicative Language Ability
3.1.2 Theories of EFL writing competence
3.1.2.1 The model of text construction
3.1.2.2 Taxonomy of language knowledge in writing
3.1.3 Content knowledge in EFL writing assessments
3.1.4 The construct of the CET4 writing
3.2 Rating Criteria Determined through Empirical Studies
3.3 Rating Criteria Adopted in EFL Writing Assessments
3.3.1 EFL writing assessments and their scoring methods
3.3.2 A summary of the rating criteria adopted in EFL writing assessments
3.4 A Tentative Conceptualized Framework for the CET4 Writing
3.4.1 Relationships between theoretical components and rating criteria
3.4.2 Definitions of the rating criteria
3.4.2.1 Rating criteria measuring linguistic knowledge
3.4.2.2 Rating criteria measuring discourse knowledge
3.4.2.3 Rating criteria measuring sociolinguistic knowledge
3.4.2.4 Rating criterion measuring content knowledge
3.4.2.5 Other rating criteria
3.5 Summary
CHAPTER 4 RESEARCH METHODOLOGY AND RESEARCH DESIGN
4.1 Research Methodology
4.2 Overall Research Design
4.2.1 The a priori validation phase
4.2.2 The a posteriori validation phase
4.3 Research Design of the a Priori Validation Phase
4.3.1 Research design of Stage I
4.3.1.1 Procedure of data collection
4.3.1.2 Research instruments
4.3.1.3 Participants
4.3.1.4 Data collection
4.3.1.5 Data analysis
4.3.2 Research design of Stage II
4.3.2.1 Document analysis
4.3.2.2 Initial editing of the descriptors
4.3.2.3 Categorizing the descriptors
4.3.2.4 Qualitative validation of the descriptors
4.3.2.5 Calibrating the descriptors
4.4 Research Design of the a Posteriori Validation Phase
4.4.1 Procedure of data collection
4.4.2 Research instruments
4.4.2.1 Instruments used in the rating experiment
4.4.2.2 Instrument used in the interview
4.4.3 Participants
4.4.4 Data collection
4.4.5 Data analysis
4.4.5.1 Quantitative analysis
4.4.5.2 Qualitative analyses
4.5 Summary
CHAPTER 5 CONSTRUCTING THE SCALE: RESULTS AND DISCUSSION
5.1 Evaluating the Existing Rating Scale
5.1.1 Results of descriptive analysis
5.1.2 A comparison of the raters’ attitudes
5.2 Establishing the Analytic Rating Scheme
5.2.1 Results of the questionnaire survey
5.2.2 Results of raters’ interviews
5.2.3 Further discussion
5.3 Calibrating the Descriptors for CET4 W-ARS
5.3.1 Checking the data-model fit
5.3.1.1 The item-person map
5.3.1.2 The reliability estimates, separation index and fitting statistics
5.3.2 Checking functionality of the Likert scale structure
5.3.3 Checking the unidimensionality of the rating scale
5.3.4 Item statistics
5.3.5 Setting cut-off points
5.3.6 The divided levels
5.3.7 Establishing the analytic sub-scales
5.4 Summary
CHAPTER 6 VALIDATING THE SCALE: RESULTS AND DISCUSSION
6.1 Comparison of the Two Scales
6.1.1 Results of the MFRM analysis
6.1.1.1 Variable maps
6.1.1.2 A comparison of the key statistics
6.2 Category Functionality of CET4 W-ARS
6.2.1 Key statistics relating to the four sub-scales
6.2.2 Category functionality
6.2.2.1 Statistics showing category functionality
6.2.2.2 An exploratory analysis of rater-category interaction
6.2.3 Summary of quantitative findings
6.3 Raters’ Perceptions of CET4 W-ARS
6.3.1 Raters’ comments on CET4 W-ARS
6.3.2 Suggestions for further scale revision
6.4 Further Scale Revision
6.4.1 Refining level descriptors
6.4.2 Rephrasing descriptors
6.4.3 Readjusting level divisions
6.5 A Summary of Validity Evidence
6.5.1 Construct validity of CET4 W-ARS
6.5.2 Reliability of CET4 W-ARS
6.5.3 Authenticity of CET4 W-ARS
6.5.4 Impact of CET4 W-ARS
6.5.4.1 The scale impact at the micro level
6.5.4.2 The scale impact at the macro level
6.5.5 Practicality of CET4 W-ARS
6.6 Summary
CHAPTER 7 CONCLUSION
7.1 Recapitulation of the Study
7.2 Summary of the Major Findings
7.3 Implications of the Study
7.3.1 Theoretical implications
7.3.2 Methodological implications
7.3.3 Practical implications
7.4 Limitations of the Study
7.4.1 Limitation of the a priori validation phase
7.4.2 Limitations of the a posteriori phase
7.5 Suggested Areas for Future Research
7.6 Concluding Remarks
References
Appendix: Academic Papers and Research Achievements
Article ID: 4012390
Article link: https://www.wllwen.com/shoufeilunwen/rwkxbs/4012390.html