Research on Multi-Document Automatic Summarization Based on Probability Statistics and Graphs

Uploaded 2017-08-19 (Anhui)


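Computer Science (计算机科学), 2007, Vol. 34, No. 1

Research on Multiple Document Summarization Based on Probability Model and Graph

LIU Liang, CHEN Wei-Jun, WANG Jian-Min
(School of Software, Tsinghua University, Beijing 100084)

Abstract (translated from the Chinese): This paper introduces a new method for multi-document automatic summarization. The basic idea is to estimate the probability that each term appears in the summary and use it to select the core terms that best reflect the information content of the document set. The weight of each core term is then computed from that probability. Finally, the sentences of the whole document set, together with the overlap in their information content, are represented as an undirected weighted graph, from which the summary sentences are selected. The method is evaluated on the DUC 2003 and DUC 2004 datasets. The tests show that the summaries it produces are of higher quality than those of some other methods, such as LexRank.

Keywords: TF*IDF, Graph, LexRank, DUC

Abstract (original English): The paper presents a novel approach to automatically produce a summary from multiple documents. The first step of our approach is to select a set of central features which can represent the main theme of the documents. The selection decisions are made by considering the features' distributions in the future summary. Then the weight of each feature is computed, which measures its contribution to the possibility of a sentence being included in the summary. Lastly, an undirected graph is constructed to represent the sentence set and used for the summary sentence extraction. The proposed method was evaluated on the DUC datasets of 2003 and 2004. Experimental results show that our method offers promising performance compared to other summarizer systems such as LexRank.

Keywords: TF.IDF, Graph, LexRank, DUC

… identify the most important information in the original document set, and then use …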
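The abstract names three steps (core-term selection by probability, term weighting, and sentence selection on an undirected weighted graph) but gives no formulas. The Python sketch below only illustrates that overall shape. It is not the authors' method, and every concrete choice in it is an assumption: the probability of a term appearing in the summary is approximated by the fraction of documents containing the term, a term's weight is set equal to that probability, an edge weight is the total weight of the core terms two sentences share, and sentences are picked greedily with a redundancy threshold. The names `core_term_weights`, `build_graph`, `select_sentences`, `top_k`, `n`, and `max_overlap` are hypothetical.

```python
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}  # tiny illustrative list
PUNCT = ".,;:!?()\"'"

def tokenize(sentence):
    # Naive lowercase word split; a real system would use a proper tokenizer.
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w and w not in STOPWORDS]

def core_term_weights(docs, top_k=3):
    # ASSUMPTION: P(term appears in summary) ~ fraction of documents containing it,
    # and the term weight is that probability itself.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update({t for s in doc for t in tokenize(s)})
    probs = {t: c / len(docs) for t, c in doc_freq.items()}
    # Keep the top_k terms; ties are broken alphabetically for determinism.
    ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return dict(ranked)

def build_graph(sentences, weights):
    # Nodes are sentences. ASSUMPTION: the weight of an undirected edge is the
    # total weight of the core terms that both sentences contain.
    core = [{t for t in tokenize(s) if t in weights} for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        w = sum(weights[t] for t in core[i] & core[j])
        if w > 0:
            edges[(i, j)] = w
    return core, edges

def select_sentences(sentences, core, edges, weights, n=2, max_overlap=0.5):
    # ASSUMPTION: greedy selection by core-term score, skipping a sentence whose
    # overlap with an already chosen sentence exceeds max_overlap * its score.
    scores = [sum(weights[t] for t in c) for c in core]
    order = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = []
    for i in order:
        if len(chosen) == n:
            break
        if scores[i] == 0:
            continue
        if any(edges.get((min(i, j), max(i, j)), 0) > max_overlap * scores[i]
               for j in chosen):
            continue
        chosen.append(i)
    return [sentences[i] for i in sorted(chosen)]

# Toy document set: three short "documents" about the same event.
docs = [
    ["Flood hits city.", "Rescue teams arrive."],
    ["Flood damages city homes.", "Weather stays cold."],
    ["City flood recedes.", "Rescue continues."],
]
weights = core_term_weights(docs, top_k=3)
sentences = [s for doc in docs for s in doc]
core, edges = build_graph(sentences, weights)
summary = select_sentences(sentences, core, edges, weights, n=2)
print(summary)
```

On this toy input the three flood sentences overlap heavily, so only the first is kept and the second slot goes to a sentence about the rescue. That is the kind of redundancy control an overlap graph is meant to provide; the paper's actual probability estimate, weighting, and selection rule may differ.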
