A Corpus-Based Study on Recognizing the Boundaries and Functions of the "所" (suo) Construction in Modern Chinese
Published: 2019-03-09 18:56
[Abstract]: With information processing as its goal, this thesis takes the "所" (suo) construction in modern Chinese as its object of study and, on the basis of a survey of large-scale corpus data, investigates the automatic recognition of the construction's boundaries and functions. The thesis is divided into three parts.

Part One is the introduction. It defines the research object, explains the purpose and significance of the topic, reviews existing research on "所" and the "所" construction as well as on phrase boundary recognition, introduces the research approach and the theories and methods to be used, and finally describes the corpus sources.

Part Two is the main body, comprising Chapters 2 to 5.
Chapter 2 covers the classification and selection of the corpus data. All corpus text containing the character "所" is preprocessed with the State Language Commission's word segmentation and part-of-speech tagging software, then classified by the part of speech assigned to "所", and the valid data on which the study relies is selected.
Chapter 3 analyzes the boundaries of the "所" construction in modern Chinese. It examines the elements that precede "所" and those that follow it in the corpus, and from these derives the left and right boundary words (or related words) of the construction, laying the groundwork for boundary recognition in Chapter 5.
Chapter 4 analyzes the functions of the "所" construction in modern Chinese. It describes in detail the syntactic functions the construction performs in the corpus and summarizes the formal regularities behind them, laying the groundwork for function recognition in the next chapter.
Chapter 5 addresses the recognition of the boundaries and functions of the "所" construction. It sets out the specific recognition steps and implements them in a program.

Part Three is the conclusion, Chapter 6. It summarizes the findings of the thesis, analyzes its shortcomings and the problems that remain to be solved, and looks ahead to directions for follow-up research.
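The abstract does not give the thesis's actual rules or boundary-word lists, so the following is only a minimal Python sketch of what a rule-based pass over segmented, POS-tagged text could look like. Everything in it is an assumption made for illustration: the `word/tag` input format, the tag letters (`u` for auxiliary, `v` for verb, `n` for noun, `r` for pronoun, `w` for punctuation), the function names `parse_tagged` and `find_suo_span`, and the two toy boundary rules.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (word, POS tag)


def parse_tagged(line: str) -> List[Token]:
    """Split a 'word/tag word/tag ...' line into (word, tag) pairs."""
    tokens = []
    for item in line.split():
        word, sep, tag = item.rpartition("/")
        if not sep:  # token carries no tag
            word, tag = item, ""
        tokens.append((word, tag))
    return tokens


def find_suo_span(tokens: List[Token], suo_tag: str = "u") -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) token indices of the first "所" construction.

    Hypothetical rules, not the thesis's:
      - only a standalone token "所" tagged `suo_tag` counts (so "研究所/n" is skipped);
      - left boundary: include the immediately preceding noun or pronoun, if any;
      - right boundary: the first verb after "所", plus a directly following "的";
      - give up if punctuation is reached before any verb.
    """
    idx = next(
        (i for i, (w, t) in enumerate(tokens) if w == "所" and t == suo_tag),
        None,
    )
    if idx is None:
        return None

    start = idx
    if idx > 0 and tokens[idx - 1][1][:1] in {"n", "r"}:
        start = idx - 1

    for j in range(idx + 1, len(tokens)):
        word, tag = tokens[j]
        if tag == "w":  # punctuation closes the search
            break
        if tag.startswith("v"):
            end = j
            if j + 1 < len(tokens) and tokens[j + 1][0] == "的":
                end = j + 1
            return (start, end)
    return None


tokens = parse_tagged("我/r 所/u 了解/v 的/u 情况/n 。/w")
span = find_suo_span(tokens)
if span is not None:
    start, end = span
    print("".join(word for word, _ in tokens[start:end + 1]))  # 我所了解的
```

Filtering on the tag of "所" mirrors the classification step described for Chapter 2: a sentence such as `这/r 是/v 研究所/n 。/w` yields no span because "所" never appears as a separately tagged token. A real system would replace the two toy rules with boundary-word lists derived from corpus analysis.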
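[Degree-granting institution]: Shanghai Normal University (上海师范大学)
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: H14
Article ID: 2437759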
Article link: https://www.wllwen.com/wenyilunwen/hanyulw/2437759.html