A microblog search sort algorithm based on dynamic stepsize - Journal of Hubei University


Journal of Hubei University (Natural Science), Vol. 38, No. 3, May 2016
Article ID: 1000-2375(2016)03-0258-09

A microblog search sort algorithm based on dynamic stepsize

ZHANG Yanyan, YAO Yuan, ZHANG Na
(Institute of Computer Science and Engineering, Henan University of Urban Construction, Pingdingshan 467036, China)

Abstract: Microblog search mainly computes the relevance between documents and query terms: term weights are determined by statistical methods, and the relevance is then computed with the vector space model. However, term-based search has limited precision, and the information content it detects in an individual microblog is limited, so it is hard to reflect what the user's query is actually focused on. To address this problem, a microblog search ranking algorithm based on dynamic step size is proposed. The algorithm proceeds as follows. First, the existing features of microblogs are analyzed. Second, the information content of a microblog is computed with the information entropy method, and relevance is computed per part of speech rather than per term. Finally, a dynamic step size is introduced into the ListNet ranking algorithm and optimized with the Armijo-Goldstein criterion. Simulation experiments show that the proposed algorithm achieves better ranking results.

Keywords: microblog; search ranking; ListNet algorithm; Armijo-Goldstein criterion; feature value; dynamic step size

CLC number: TP391.6    Document code: A    DOI: 10.3969/j.issn.1000-2375.2016.03.016
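The last step of the abstract, a ListNet update whose step size is chosen dynamically, can be illustrated with a minimal sketch. It is not the authors' implementation: the linear scoring model, the top-one ListNet loss, the toy feature vectors, and all function and parameter names (`listnet_loss`, `armijo_step`, `alpha0`, `shrink`, `c`) are assumptions made for illustration. Only the Armijo sufficient-decrease half of the Armijo-Goldstein conditions is checked, via simple backtracking; the Goldstein lower bound on the step is omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score(w, x):
    """Linear scoring function (an assumption; the paper's model may differ)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def listnet_loss(w, features, labels):
    """Top-one ListNet loss: cross entropy between softmax(labels) and softmax(scores)."""
    p_true = softmax(labels)
    p_pred = softmax([score(w, x) for x in features])
    return -sum(pt * math.log(pp) for pt, pp in zip(p_true, p_pred))


def listnet_grad(w, features, labels):
    """Gradient of the top-one loss; d(loss)/d(score_j) = p_pred_j - p_true_j."""
    p_true = softmax(labels)
    p_pred = softmax([score(w, x) for x in features])
    grad = [0.0] * len(w)
    for x, pt, pp in zip(features, p_true, p_pred):
        for k, xk in enumerate(x):
            grad[k] += (pp - pt) * xk
    return grad


def armijo_step(w, grad, features, labels, alpha0=1.0, shrink=0.5, c=1e-4, max_tries=30):
    """Backtrack from alpha0 until the Armijo sufficient-decrease condition holds."""
    f0 = listnet_loss(w, features, labels)
    g2 = sum(g * g for g in grad)  # squared gradient norm
    alpha = alpha0
    for _ in range(max_tries):
        trial = [wi - alpha * gi for wi, gi in zip(w, grad)]
        if listnet_loss(trial, features, labels) <= f0 - c * alpha * g2:
            break
        alpha *= shrink
    return alpha


def train(features, labels, iters=100):
    """Gradient descent on one query's document list with a per-iteration step size."""
    w = [0.0] * len(features[0])
    for _ in range(iters):
        grad = listnet_grad(w, features, labels)
        alpha = armijo_step(w, grad, features, labels)
        w = [wi - alpha * gi for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Toy data: three microblogs for one query, two hypothetical features each.
    features = [[1.0, 0.2], [0.5, 0.1], [0.1, 0.9]]
    labels = [2.0, 1.0, 0.0]  # graded relevance, higher is more relevant
    w = train(features, labels)
    print("weights:", w)
    print("loss:", listnet_loss(w, features, labels))
```

The step size is recomputed at every iteration rather than fixed in advance, which is the sense of "dynamic step size" assumed in this sketch.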
