[Abstract]: Language is a carrier of information and therefore has informational properties. With the development of information technology, applying the theories and methods of information theory to the study of language entropy has become an important task. At present, research on Chinese information entropy focuses mainly on Chinese information processing, and very few researchers use information theory to study the language itself (language ontology). Based on the informational properties of language, this paper systematically reviews the theories, viewpoints and methods of research on the information entropy of Chinese characters in informatics, linguistics, mathematics, pedagogy, computer science and other fields. Drawing on corpus linguistics, it sets out the concepts and calculation methods of "character entropy" and "word entropy" in written Chinese. Using the principles and algorithms of character entropy and word entropy, and supported by processed corpora (for example, corpora tagged for part of speech), it presents typical case studies on the grammaticalization of prepositions, the comparison of text styles, and the authorship of the Dream of Red Mansions, thereby offering an "information entropy" research paradigm for the study of Chinese ontology and its applications. From the perspective of information, the paper verifies the universality of the Zipf distribution, which offers a convincing explanation for the evolution of acronyms, abbreviations and lexical dichotomies. The full text is divided into five chapters: the first chapter is the introduction; the second chapter covers the entropy of Chinese characters and its application in the study of Chinese ontology; the third chapter covers the entropy of Chinese words and its application in the study of Chinese ontology; the fourth chapter covers the entropy of Chinese and Zipf's law; and the fifth chapter concludes.
The first chapter summarizes the feasibility, significance, history, current state and existing problems of research on language ontology using entropy theory, introduces the guiding theory and research methods of this study, and explains some problems encountered in the research process. The second chapter summarizes the achievements and conclusions of previous research on the entropy of Chinese characters, discusses the methods and history of measuring it, and compares character-frequency methods with character-entropy methods. Through quantitative analysis of character entropy in different types of Chinese sample corpora, the average character entropy of the corpora is obtained, and a method for applying character entropy to the study of language ontology is proposed, illustrated by an analysis of the styles of the martial arts novels of Gu Long and Jin Yong. The third chapter is the focus of this study. Words are the smallest units of language that can be used freely. Because written Chinese is recorded in Chinese characters, earlier research has tended to substitute findings on character entropy for the information entropy of Chinese as a whole. This chapter first distinguishes the entropy of Chinese characters from the entropy of the Chinese language and gives a measured value for word entropy. On this basis, it discusses the redundancy of Chinese and the application of word entropy in the study of Chinese ontology. The use of word entropy in grammaticalization research, comparison between different styles, diachronic comparison of texts, computational stylistics and so on is illustrated with a large body of corpus data. Chapter four introduces an important law of statistical distribution in language: Zipf's law.
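As a minimal sketch of the kind of calculation behind character entropy, word entropy and redundancy, the example below assumes the standard first-order Shannon formula, H = -Σ p·log2(p), applied to unit frequencies. The abstract does not state the exact formulas used in the thesis, so the function names, the toy sample and the definition of redundancy as 1 - H/H_max over the observed inventory are all illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(units):
    """First-order Shannon entropy, in bits per unit, of a sequence of units."""
    counts = Counter(units)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(units):
    """Assumed definition: 1 - H / H_max, with H_max = log2(distinct units seen)."""
    units = list(units)
    distinct = len(set(units))
    if distinct <= 1:
        # H_max is zero here, so the ratio is undefined; report 0.0 by convention.
        return 0.0
    return 1.0 - shannon_entropy(units) / math.log2(distinct)


# Hypothetical toy sample; the words are assumed to be already segmented.
words = ["我", "喜欢", "语言", "我", "喜欢", "信息"]
chars = [ch for w in words for ch in w]

char_entropy = shannon_entropy(chars)  # entropy over individual characters
word_entropy = shannon_entropy(words)  # entropy over segmented words
```

The same function serves both levels because only the unit of counting changes: characters for character entropy, segmented words for word entropy. Real measurements would need a large corpus and a fixed segmentation scheme, neither of which this toy sample provides.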
Using Zipf's law, Chapter four analyzes the relationship between Chinese character entropy and the entropy of Chinese and, drawing on text-entropy statistics from several Chinese corpora, demonstrates that the distribution of Chinese word entropy conforms to Zipf's law. It also finds that the entropy distributions of samples in different styles are highly consistent, which further strengthens the academic value of this study. In the last part, the author summarizes the thesis, points out the shortcomings of the research, and puts forward some tentative ideas for further research.
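One common way to check a token sample against Zipf's law is to fit a straight line to log frequency versus log rank and see whether the slope is close to -1. The sketch below does this with an ordinary least-squares fit. It illustrates the classical rank-frequency form of the law only, not the thesis's own procedure, and the function name and synthetic data are assumptions made for the example.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic sample whose frequencies are exactly 60 / rank (60, 30, 20, 15, 12),
# so the fitted slope should be very close to -1.
tokens = [rank for rank in range(1, 6) for _ in range(60 // rank)]
slope = zipf_slope(tokens)
```

On real corpora the fit is usually reported together with a goodness-of-fit measure, and the lowest-frequency items often deviate from the line, so a slope near -1 on its own is only indicative.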
