Micro-Engine Pipeline Machine Translation System Architecture (微引擎流水线机器翻译系统结构)

  • Published 2017-08-03 (Henan)


Liu Qun
Institute of Computational Linguistics, Peking University
Institute of Computing Technology, Chinese Academy of Sciences
Liuqun@

Abstract: This article summarizes and analyzes the rule-based and statistical methods used in developing machine translation systems from three perspectives: knowledge representation, knowledge acquisition, and knowledge level. It then introduces our proposed micro-engine pipeline architecture for machine translation systems.

Keywords: machine translation, hybrid approach, multi-engine strategy, micro-engine, pipeline architecture

1. Combining rule-based and statistical methods in machine translation

Rule-based methods and statistical methods (or rationalism and empiricism) are usually described as the two mainstream approaches in natural language processing and machine translation research. We think this formulation is too general. Here we classify existing research methods from two aspects: knowledge representation and knowledge acquisition.

From the point of view of knowledge representation, existing methods can be classified into the following categories:

1. Rules: symbolic rules are a very intuitive form of knowledge representation for linguists. They are convenient to express, their granularity can be coarse or fine, and they are highly flexible. However, rules generally follow two-valued logic, in which a statement either holds or does not, and this makes the system less robust.

2. Data: in the various statistical models, knowledge is embodied in data. Data-type knowledge is not necessarily statistical knowledge, however; the membership degree of a fuzzy set, for example, is also data-type knowledge.

3. Rules + data: this is a hybrid form of knowledge representation, typified by the various probabilistic grammars, which attach probability or confidence information to each rule.

4. Corpus: in most cases, the knowledge c
