[Abstract]: Machine translation is the most effective means of cross-language information exchange. With the implementation of the national "Belt and Road" strategy, Chinese-Vietnamese machine translation is becoming increasingly important. China and Vietnam cooperate extensively in the metallurgical industry, which creates substantial demand for translating metallurgical texts, scientific literature, and industry information. Translating this information automatically is of great significance for promoting international cooperation and information exchange between the Chinese and Vietnamese metallurgical industries. However, research on Chinese-Vietnamese machine translation is relatively weak, especially for specific domains, and this seriously restricts cross-language information exchange for the industry. Chinese and Vietnamese differ greatly as languages, and industry texts also exhibit pronounced domain characteristics. Traditional translation methods cannot fully meet the needs of machine translation in the metallurgical domain, which faces three problems: the acquisition of bilingual terminology, the automatic annotation of bilingual word alignment, and translation adapted to both Chinese-Vietnamese language differences and domain characteristics. Combining the differences between Chinese and Vietnamese with the characteristics of the metallurgical domain, this paper studies the key technologies and methods of Chinese-Vietnamese machine translation in the metallurgical domain. It focuses on Chinese-Vietnamese bilingual terminology acquisition, Chinese-Vietnamese bilingual word alignment, tree-to-tree syntactic statistical machine translation, and syntactic statistical machine translation that incorporates domain characteristics.
Innovative achievements: (1) To address the scarcity of Chinese-Vietnamese corpora and the difficulty of obtaining bilingual terminology, a pivot-language-based method for automatically acquiring bilingual terminology in the metallurgical domain is proposed. Drawing on existing Chinese-English and English-Vietnamese parallel domain texts and scientific literature, a conditional random field model is first used to recognize terms in the source-language (Chinese) domain text. Then, following the idea of phrase-based statistical machine translation, a Chinese-English phrase probability table and an English-Vietnamese phrase probability table are constructed, and a Chinese-to-Vietnamese phrase probability table is derived from them by mapping through the pivot language. The Chinese-Vietnamese phrase table is combined with the recognized Chinese domain terms to build a Chinese-Vietnamese bilingual terminology library for the metallurgical domain. The proposed method is shown to achieve good term extraction results, and it effectively solves the problem of bilingual terminology extraction in the Chinese-Vietnamese metallurgical domain when bilingual aligned resources are scarce. (2) To address automatic annotation in Chinese-Vietnamese word alignment, a Chinese-Vietnamese word alignment method that combines language difference features with deep learning is proposed. To improve the performance and accuracy of bilingual word alignment learning, the method takes into account differences such as attributive postposition, adverbial postposition, and the position of language structures. It defines a position transformation function and a structural adjustment function, and integrates these functions as constraints into the loss function of a bidirectional RNN, so that differences in linguistic structure are incorporated into learning.
The bilingual word alignment results show that the proposed method is effective: language characteristics and bidirectional context information effectively improve word alignment. (3) Based on the differences between Chinese and Vietnamese, a Chinese-Vietnamese tree-to-tree statistical machine translation method that incorporates language difference features is proposed. This paper analyzes the differences between Chinese and Vietnamese, defines Chinese-Vietnamese language difference rules, and defines language features such as attributive postposition, time adverbial postposition, and place adverbial postposition. With the help of Chinese-Vietnamese bilingual word alignment, the language difference features are fused into the tree-to-tree translation rule extraction process when templates are extracted. During decoding, the difference rules are used to prune and optimize the candidate sentences and to obtain the optimal translation sequence, which improves the efficiency and accuracy of template extraction and decoding. Chinese-Vietnamese bilingual sentence translation experiments show that the proposed method achieves good results, and that using syntactic differences effectively improves translation performance and accuracy. (4) To improve the translation of domain texts, a Chinese-Vietnamese syntactic statistical machine translation method that incorporates domain characteristics is proposed, and the domain characteristics and their influence on machine translation are analyzed. Using domain terms and corpora, a bilingual term-topic distribution model, a paragraph-level domain topic coherence model, and a Freebase-based domain knowledge model are used to fuse these characteristics.
In the tree-to-tree translation model, the bilingual domain terminology database, the bilingual term-topic probability distribution, paragraph-level domain coherence, and domain knowledge relations are applied to the selection, combination, and pruning of candidate translations, so that domain characteristics are better exploited to improve the translation of domain texts. The results show that the proposed method performs well, and that domain topics, paragraph topic coherence, and domain knowledge have a significant effect on the translation of domain texts.
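
The pivot step in achievement (1) can be illustrated with a small sketch. It assumes the common triangulation formulation, in which the Chinese-to-Vietnamese probability is obtained by summing over English pivot phrases, p(vi | zh) = Σ_en p(en | zh) · p(vi | en); the abstract does not state the exact formula used, so this is an assumption. The function names `triangulate` and `build_term_library`, the pruning threshold, and the toy probabilities are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def triangulate(zh_en, en_vi, min_prob=1e-4):
    """Derive a Chinese-to-Vietnamese phrase table through an English pivot.

    zh_en maps a Chinese phrase to {English phrase: p(en | zh)}.
    en_vi maps an English phrase to {Vietnamese phrase: p(vi | en)}.
    Assumed formula: p(vi | zh) = sum over en of p(en | zh) * p(vi | en).
    """
    zh_vi = defaultdict(lambda: defaultdict(float))
    for zh, pivots in zh_en.items():
        for en, p_en_zh in pivots.items():
            # Pivot phrases missing from the English-Vietnamese table contribute nothing.
            for vi, p_vi_en in en_vi.get(en, {}).items():
                zh_vi[zh][vi] += p_en_zh * p_vi_en
    # Drop very unlikely pairs (min_prob is an illustrative threshold).
    return {
        zh: {vi: p for vi, p in cands.items() if p >= min_prob}
        for zh, cands in zh_vi.items()
    }


def build_term_library(zh_vi, zh_terms):
    """Keep the most probable Vietnamese candidate for each recognized Chinese term.

    zh_terms stands in for the output of the term recognizer (a CRF in the text).
    """
    library = {}
    for term in zh_terms:
        candidates = zh_vi.get(term)
        if candidates:
            library[term] = max(candidates, key=candidates.get)
    return library


# Toy tables with made-up probabilities, for illustration only.
zh_en = {"高炉": {"blast furnace": 0.9, "furnace": 0.1}}
en_vi = {
    "blast furnace": {"lò cao": 0.8, "lò luyện": 0.2},
    "furnace": {"lò": 0.7, "lò cao": 0.3},
}

zh_vi = triangulate(zh_en, en_vi)
library = build_term_library(zh_vi, zh_terms=["高炉"])
```

In this sketch the candidate "lò cao" collects probability mass from both pivot phrases, which is why summing over pivots can rank a translation higher than any single pivot path would.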


