Using a Hierarchical Model to Implement Full-Text Retrieval in P2P Networks (.doc)

  • About 33,100 characters
  • About 13 pages
  • Published 2018-08-23, Hebei

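Using a Hierarchical Model to Implement Full-Text Retrieval in P2P Networks

Abstract: This paper studies the P2P search problem. An ideal P2P search algorithm should match the search quality of IR (Information Retrieval) algorithms while keeping search scalable. Existing search algorithms, however, do not satisfy both conditions at once. This paper therefore proposes a new algorithm, DHC, based on hierarchical clustering. Its main procedure is as follows: shared files are first converted into vector samples, and the samples are then added incrementally to a hierarchical tree, each placed at a suitable position according to certain requirements. Physically, the nodes of the hierarchical tree are distributed across servents, and the tree is adjusted through communication among servents. During a search, the node that issues the query first routes it to the root of the hierarchical tree. The search then proceeds downward level by level from the root, comparing the distance between the query and each child node and selecting the suitable branches to continue along. Leaf nodes of the tree represent samples; when the search reaches a leaf, the samples that satisfy the query are sent back to the originating node. Theoretical analysis and preliminary simulation experiments indicate that DHC achieves a high recall, and that its search depth and update cost are proportional to the logarithm of the total number of servents. DHC thus both reaches the search quality of IR algorithms and scales, which makes it an effective P2P search algorithm.

Keywords: P2P search; scalability; distributed; hierarchical clustering; content index

Using hierarchical model to harness full-text retrieval in peer-to-peer network

Abstract: An ideal content-based routing algorithm should not only provide the effectiveness of IR algorithms but also guarantee scalable routing. However, previous work has not achieved both aims. In this paper, we present a novel method named Distributed Hierarchical Clustering (DHC) to address this. First, files in vector format are placed at appropriate positions in a Hierarchical Clustering Tree (HC-Tree). In the physical network, HC-Tree nodes may be placed on different servents, and clustering is established through communication among servents. Working in a top-down fashion, a query is sent from the root to the relevant sub-nodes. When it reaches the leaf nodes, which are responsible for files, routing terminates. The physical addresses of the relevant files are returned to the original node. Results from theoretical analysis and simulations show that, while preserving a stable recall, DHC is incrementally scalable, with lookup costs scaling logarithmically with the number of servents. In conclusion, DHC is an efficient P2P routing algorithm.

Key words: peer-to-peer routing, scalability, distributed, hierarchical clustering, content-based

1 Introduction

Recently, peer-to-peer systems (P2P systems for short) have found increasingly wide application in file sharing and information search, Morpheus (Morpheus, 2003, Internet communication)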
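The incremental insertion and top-down search outlined in the abstract can be illustrated with a small in-process sketch. It is not the DHC algorithm itself: the abstract does not state the placement rule, the distance measure, or the branch-selection criterion, so the cosine distance, the running-mean centroids, and the `max_children`, `slack`, and `leaf_threshold` parameters below are all assumptions made for illustration. The sketch also keeps the whole tree in one process, whereas the paper distributes tree nodes across servents, and it makes no attempt to balance the tree.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Vector = Dict[str, float]  # sparse term -> weight (assumed representation)


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return 1.0 if na == 0.0 or nb == 0.0 else 1.0 - dot / (na * nb)


@dataclass
class Node:
    centroid: Vector
    count: int = 1
    address: Optional[str] = None  # set on leaves only (hypothetical file address)
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return self.address is not None

    def absorb(self, vec: Vector) -> None:
        """Fold one more sample into this node's running-mean centroid."""
        total = self.count + 1
        terms = set(self.centroid) | set(vec)
        self.centroid = {
            t: (self.centroid.get(t, 0.0) * self.count + vec.get(t, 0.0)) / total
            for t in terms
        }
        self.count = total


def _split(old_leaf: Node, new_leaf: Node) -> Node:
    """Replace a leaf with an internal node holding the old and new leaves."""
    parent = Node(dict(old_leaf.centroid), old_leaf.count, children=[old_leaf])
    parent.absorb(new_leaf.centroid)
    parent.children.append(new_leaf)
    return parent


def insert(root: Optional[Node], vec: Vector, address: str,
           max_children: int = 4) -> Node:
    """Add one sample by descending toward the nearest child (assumed rule)."""
    leaf = Node(dict(vec), address=address)
    if root is None:
        return leaf
    if root.is_leaf:
        return _split(root, leaf)
    node = root
    while True:
        node.absorb(vec)
        if len(node.children) < max_children:
            node.children.append(leaf)
            return root
        i = min(range(len(node.children)),
                key=lambda k: cosine_distance(vec, node.children[k].centroid))
        if node.children[i].is_leaf:
            node.children[i] = _split(node.children[i], leaf)
            return root
        node = node.children[i]


def search(root: Optional[Node], query: Vector,
           leaf_threshold: float = 0.5, slack: float = 0.2) -> List[str]:
    """Top-down search: follow children close to the best one, collect leaves."""
    hits: List[str] = []
    frontier = [root] if root is not None else []
    while frontier:
        node = frontier.pop()
        if node.is_leaf:
            if cosine_distance(query, node.centroid) <= leaf_threshold:
                hits.append(node.address)
            continue
        dists = [cosine_distance(query, c.centroid) for c in node.children]
        best = min(dists)
        frontier.extend(c for c, d in zip(node.children, dists) if d <= best + slack)
    return sorted(hits)


# Hypothetical shared files, given as (address, term vector) pairs.
docs = [
    ("peer1/a", {"p2p": 1.0, "search": 1.0}),
    ("peer2/b", {"p2p": 1.0, "routing": 1.0}),
    ("peer3/c", {"cooking": 1.0, "recipe": 1.0}),
    ("peer4/d", {"recipe": 1.0, "pasta": 1.0}),
]
root = None
for addr, vec in docs:
    root = insert(root, vec, addr, max_children=2)

print(search(root, {"recipe": 1.0}, leaf_threshold=0.4))
```

The `slack` parameter controls how many branches are followed at each level: a small value prunes aggressively and lowers the number of nodes contacted, at the risk of missing relevant leaves, while a large value approaches an exhaustive scan. How the actual DHC algorithm makes this trade-off is not specified in the text above.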
