- Posted 2020-06-01 from Shaanxi
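Sciencepaper Online (中国科技论文在线)
A Study of the Effect of Corpora on Chinese Coreference Resolution
GAO Junwei, KONG Fang, ZHU Qiaoming, LI Peifeng
(School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006)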
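Abstract: Coreference is a common phenomenon in natural language that plays a large role in simplifying language and reducing redundancy. Coreference resolution is the process of using a computer to identify these coreference phenomena. English coreference resolution has made great progress in recent years, but research on Chinese coreference resolution remains comparatively scarce. One reason is that research on Chinese natural language processing started later and has accumulated less relevant knowledge; another is that few Chinese corpora are available, the only known ones being ACE2005, OntoNotes, and a few others. To investigate how the corpus affects Chinese coreference resolution, this paper implements two Chinese noun phrase coreference resolution platforms, one based on machine learning and one based on unsupervised clustering, and uses them to examine the effect of the corpus on Chinese noun phrase coreference resolution from two perspectives: corpus quantity and corpus quality.
Keywords: coreference resolution; noun phrase; unsupervised; clustering; corpus
CLC number: TP391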
Research on the Effect of the Corpus on Chinese Noun Phrase Anaphora Resolution
GAO Junwei, KONG Fang, ZHU Qiaoming, LI Peifeng
(School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Abstract: Coreference is a common phenomenon in natural language that helps simplify expression and reduce redundancy. Coreference resolution is the process of identifying these phenomena by computer. A great deal of research has been done on this task for English, with considerable success in recent years. However, much less work has been done in this area for Chinese. One reason is that research on Chinese NLP started later than research on English; the other is the lack of public corpora for Chinese NLP research, the available ones being limited to ACE2005, OntoNotes, and a few others. To discuss the effect of the corpus on Chinese noun phrase anaphora resolution, we present one Chinese noun phrase coreference resolution system based on a machine learning approach and another based on an unsupervised clustering approach. We discuss the effect of the corpus on Chinese
noun phr