A Corpus-Based Study of the Pinyin Spelling of Chinese Vocabulary

Published: 2019-02-22 07:11
[Abstract]: Hanyu Pinyin is the tool for spelling out and reading Chinese characters. It plays an irreplaceable role in traditional mother-tongue teaching, the newer field of teaching Chinese as a foreign language, dictionary publishing, the compilation and publication of pinyin textbooks, and Chinese information processing. As the international standing of Chinese rises, its role becomes ever more evident. Phonetic annotation of Chinese characters has a long history, but annotating them with Hanyu Pinyin began only in the last century. 《汉语拼音方案》 (Scheme for the Chinese Phonetic Alphabet), promulgated in 1958, provides the normative basis for pinyin spelling, and 《汉语拼音正词法基本规则》 (Basic Rules of Hanyu Pinyin Orthography), published in 1988, prescribes word-based writing. Like the two legs on which pinyin standardization stands, these two documents regulate and guide the phonetic annotation of Putonghua.

Both, however, are only basic schemes and rules, and many concrete problems still arise in actual spelling practice. Research on pinyin spelling methods remains scarce. These are the reasons for the present study.

The study is corpus-based. We first built a Chinese lexical corpus from the entries of six dictionaries: 《辞海》 (Cihai), 6th edition; 《现代汉语常用词表》; 《编年本新词语》 (2003~2009); 《中医词条》; 《现代汉语词典》, 5th edition; and 《新词语》. We then processed the corpus and annotated the entry information. On the basis of that annotation we extracted about 120,000 general-vocabulary entries and divided them by syllable count into five sub-lexicons: monosyllabic, disyllabic, trisyllabic, four-syllable, and five syllables or more. Within the corpus, each entry was annotated with pinyin according to 《汉语拼音正词法基本规则》 as newly promulgated in 2012, and tagged by structure type.

On the basis of this annotated corpus, the thesis explains, category by category, how entries of each syllable count are spelled. Because the spelling of monosyllabic and disyllabic entries is very simple, it is not analyzed here; only entries of three or more syllables are discussed. Three-syllable and four-syllable entries each have their own chapter, and entries of five or more syllables share one. Within each category the discussion is divided into two parts, words and phrases. Words are for the most part written as a single unit. Phrases are spelled on the basis of an analysis of their grammatical structure. Some phrases are more complex and have more than one layer of structure, so the analysis must be carried through every layer to guarantee correct spelling. We also draw on phonetic and pragmatic principles.

The results refine, put into practice, and apply 《汉语拼音正词法基本规则》, and are of considerable value for its implementation and promotion. They also offer a useful reference for phonetic annotation in dictionaries, for teaching Chinese as a foreign language, and for the publishing of pinyin textbooks.
[Degree-granting institution]: Ludong University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: H125.3
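
The corpus workflow outlined in the abstract (grouping entries by syllable count, then writing words as one unit and splitting phrases along their structure) might be sketched roughly as follows. This is only an illustration, not the thesis's actual tooling: the entry format, the field names `hanzi`, `syllables`, and `segments`, the bucket labels, and the sample entries are all assumptions made for this sketch. The segmentation of each phrase is supplied by hand here, standing in for the structural analysis the abstract describes. The apostrophe before a non-initial syllable beginning with a, o, or e follows the general pinyin convention.

```python
import unicodedata
from collections import defaultdict

# Hypothetical labels for the five sub-lexicons named in the abstract.
BUCKETS = {1: "monosyllabic", 2: "disyllabic", 3: "trisyllabic", 4: "four_syllable"}


def bucket_name(syllable_count):
    """Map a syllable count to one of the five sub-lexicons."""
    return BUCKETS.get(syllable_count, "five_or_more")


def build_lexicons(entries):
    """Group entry headwords by the number of pinyin syllables they contain."""
    lexicons = defaultdict(list)
    for entry in entries:
        lexicons[bucket_name(len(entry["syllables"]))].append(entry["hanzi"])
    return dict(lexicons)


def join_solid(syllables):
    """Write syllables as one unit, with an apostrophe before a non-initial
    syllable that begins with a, o, or e (tone marks are ignored for the check)."""
    if not syllables:
        return ""
    out = syllables[0]
    for syl in syllables[1:]:
        base = unicodedata.normalize("NFD", syl)[0].lower()
        out += ("'" if base in "aoe" else "") + syl
    return out


def spell(entry):
    """Join syllables inside each segment and separate segments with spaces.

    A word has a single segment; a phrase has one segment per constituent.
    The segment sizes are assumed to come from manual structural annotation.
    """
    parts, pos = [], 0
    for size in entry["segments"]:
        parts.append(join_solid(entry["syllables"][pos:pos + size]))
        pos += size
    return " ".join(parts)


# Illustrative entries only; they are not taken from the thesis corpus.
entries = [
    {"hanzi": "图书馆", "syllables": ["tú", "shū", "guǎn"], "segments": [3]},
    {"hanzi": "科学技术", "syllables": ["kē", "xué", "jì", "shù"], "segments": [2, 2]},
    {"hanzi": "方案", "syllables": ["fāng", "àn"], "segments": [2]},
]

for e in entries:
    print(e["hanzi"], spell(e))
print(build_lexicons(entries))
```

Keeping the segmentation as explicit data, rather than computing it, reflects the abstract's point that phrase spelling depends on a grammatical analysis carried out per entry.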

Article ID: 2427949

Article link: https://www.wllwen.com/wenyilunwen/yuyanxuelw/2427949.html

