
Research and Implementation of Part-of-Speech Tagging for Automatic Evaluation and Correction of English Essays

Published: 2019-01-06 15:17
[Abstract]: The number of English learners in China is rising rapidly. With teacher resources limited and demand for instruction large, intelligent automated teaching assistance has attracted considerable attention. An intelligent English essay evaluation system automatically evaluates and corrects essays written by Chinese learners of English, which greatly eases the tension between the large number of learners and the shortage of teachers. Part-of-speech tagging of essays written by Chinese students is the foundation of such automatic evaluation. Many researchers have done useful work on English part-of-speech tagging, but studies of tagging English essays written by Chinese students are rare. Moreover, the vast majority of existing tagging methods depend on manual feature extraction. Because essays written by Chinese students may contain many unforeseen errors, and learners at different levels make very different errors, the features needed to tag such text are hard to identify. This thesis studies part-of-speech tagging of English essays written by Chinese students from the perspective of word vectors. It proposes a two-layer tagging method based on word vectors. Only a small number of features are extracted manually; most are learned automatically from the word vectors and the tagging probability vector produced by the first layer. The method also divides the tag set into two classes and tags sentences according to a two-layer structure. The thesis further proposes a dynamic feature-value update method, which updates feature values according to certain rules while the tagging model is trained. The tagging model is trained with this update method and then applied using the two-layer word-vector method. It reaches an accuracy of 95.63%, higher than that of existing word-vector-based taggers on English essays written by Chinese students.
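The abstract does not give the model's details, so the following is only a minimal sketch of the data flow it describes: a first layer turns a word vector into a tagging probability vector, and a second layer tags from the word vector concatenated with that probability vector. Everything concrete here is an assumption made for illustration, not the thesis's implementation: the coarse/fine split of the tag set, the tag names, the 2-dimensional word vectors, the hand-set weights `W1` and `W2`, and the softmax classifiers.

```python
import math

# Hypothetical tag inventory: a coarse first-layer set and a fine second-layer set.
COARSE_TAGS = ["NOUN", "VERB"]
FINE_TAGS = ["NN", "NNS", "VB", "VBD"]

def softmax(scores):
    """Convert raw scores into a probability vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, features):
    """One score per weight row: the dot product of the row with the features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def tag_word(word_vec, w1, w2):
    """Tag one word in two layers.

    Layer 1 maps the word vector to a probability vector over coarse tags.
    Layer 2 scores the fine tags from [word vector + layer-1 probabilities].
    """
    coarse_probs = softmax(linear(w1, word_vec))
    features = word_vec + coarse_probs          # list concatenation
    fine_probs = softmax(linear(w2, features))
    return (COARSE_TAGS[argmax(coarse_probs)],
            FINE_TAGS[argmax(fine_probs)],
            coarse_probs,
            fine_probs)

# Toy, hand-set weights (assumed; a real model would learn them from data).
# Word vector = [x0, x1]; layer-2 features = [x0, x1, p_noun, p_verb].
W1 = [[2.0, 0.0],     # NOUN
      [-2.0, 0.0]]    # VERB
W2 = [[0.0, -1.0, 3.0, 0.0],   # NN
      [0.0, 1.0, 3.0, 0.0],    # NNS
      [0.0, -1.0, 0.0, 3.0],   # VB
      [0.0, 1.0, 0.0, 3.0]]    # VBD

coarse, fine, _, _ = tag_word([1.0, 1.0], W1, W2)
print(coarse, fine)   # NOUN NNS
coarse, fine, _, _ = tag_word([-1.0, 1.0], W1, W2)
print(coarse, fine)   # VERB VBD
```

In this toy setup the second layer cannot separate NNS from VBD using the word vector alone, because both get the same contribution from `x1`; the first-layer probabilities are what break the tie. That is one plausible reason to feed the first layer's output forward, though the thesis may motivate the design differently.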
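[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1; TP18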
