Analysis of Similarity Search on Heterogeneous Information Networks (PDF)

About 88,900 characters
About 61 pages
Published 2019-06-21 in Guangdong

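Abstract (translated from Chinese)

Heterogeneous information network analysis has become a popular and novel research direction in data mining in recent years and is attracting growing attention from researchers. Real-world systems from many different domains typically contain multiple types of objects, with different types of links between them, and can be modeled as heterogeneous information networks. A typical task in heterogeneous information network analysis is similarity search; studying it makes it possible to mine the rich semantics and hidden knowledge in a network effectively.

This thesis reviews traditional feature-based similarity measures, as well as current link-based similarity search algorithms for homogeneous and heterogeneous information networks and the problems they have. To address the tendency of PathSim's similarity search results to exhibit influence bias and domain bias, the thesis proposes the concept of an influence factor for objects in a heterogeneous information network, together with a method for computing it. On this basis it proposes APSim, a meta-path-based similarity search algorithm that incorporates influence factors. By taking into account the influence factors of the intermediate core-type objects along a meta-path, the algorithm makes meta-path-based similarity search in heterogeneous information networks more meaningful in practice.

Finally, a comparison of PathSim and APSim under different meta-paths shows that APSim is better able to avoid influence bias and domain bias in similarity search results, which supports the correctness and accuracy of the proposed APSim algorithm.

Keywords: heterogeneous information network; similarity search; influence factor; meta-path

Research on Similarity Search on Heterogeneous Information Networks

Abstract

In recent years, heterogeneous information network analysis has become a very popular and novel research direction in the field of data mining, attracting more and more attention from experts and scholars. In the real world, systems in different areas often contain multiple types of objects, with different types of links between them, and can be modeled as heterogeneous information networks. Similarity search is a typical task in heterogeneous information network analysis. By studying similarity search in heterogeneous information networks, we can effectively mine the rich semantics and hidden knowledge of networks. This paper analyzes the traditional feature-based similarity measurement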
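
The abstract names PathSim as its baseline but does not restate it. As a minimal sketch, the measure is usually given in the literature for a symmetric meta-path as s(x, y) = 2·M[x][y] / (M[x][x] + M[y][y]), where M is the commuting matrix that counts path instances between objects. The function names, the author-by-venue toy counts, and the choice of an author-venue-author style meta-path below are assumptions made for illustration only; they are not taken from the thesis, and APSim is not sketched because its formula does not appear in this excerpt.

```python
from typing import List


def commuting_matrix(counts: List[List[int]]) -> List[List[int]]:
    """Return M = W * W^T for a symmetric meta-path.

    counts[i][k] is the number of path instances from object i to
    intermediate object k (hypothetical toy data: author i -> venue k).
    """
    n = len(counts)
    return [
        [sum(a * b for a, b in zip(counts[i], counts[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m: List[List[int]], x: int, y: int) -> float:
    """PathSim score of objects x and y under commuting matrix m."""
    denom = m[x][x] + m[y][y]
    # An object with no path instances at all has no defined score; use 0.0.
    return 2.0 * m[x][y] / denom if denom else 0.0


# Hypothetical toy data: 3 authors (rows) by 2 venues (columns).
counts = [
    [2, 1],    # author 0: low publication volume
    [50, 20],  # author 1: high publication volume
    [2, 0],    # author 2: low publication volume
]
m = commuting_matrix(counts)

for other in range(len(counts)):
    print(f"PathSim(0, {other}) = {pathsim(m, 0, other):.4f}")
```

On this toy data author 0 scores 8/9 against author 2, who has a similar publication volume, and only 240/2905 against the far more prolific author 1, because the denominator normalizes by each object's own self-path count.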
