
An Efficient Constraint to Reduce the Redundancy of Rules in Hierarchical Phrase-Based Translation Systems

FANG Licheng, ZONG Chengqing
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
{lcfang, cqzong}@nlpr.ia.ac.cn

Abstract: The hierarchical phrase-based translation model is a statistical machine translation model that has become popular in recent years and yields high-quality translations. It builds the reordering power of synchronous context-free grammars on top of mature conventional phrase-based translation systems, giving a model with strengths in both reordering and capturing contextual information. However, its computational cost is far higher than that of conventional phrase-based systems, and the rules it uses are highly redundant. This paper analyzes rule redundancy in hierarchical phrase-based systems and proposes a rift-based constraint (a rift being a reordering split point) that focuses the learning of reordering rules on the segments of the training corpus where reordering actually happens. Experimental results show that the method greatly reduces the number of rules and the decoding time, and reduces the training time by an order of magnitude. The loss in translation quality is slight, and the advantage over conventional phrase-based systems is maintained.

Keywords: statistical machine translation, hierarchical phrases, synchronous context-free grammars, rift (reordering split point), reordering

1