Scalable Techniques for Clustering the Web

  • About 6,550 characters
  • About 37 pages
  • Published 2017-03-08 in Shanghai


Scalable Techniques for Clustering the Web
Taher H. Haveliwala, Aristides Gionis, Piotr Indyk
Stanford University
{taherh,gionis,indyk}@

Project Goals
  • Generate fine-grained clustering of web based on topic
  • Similarity search (“What’s Related?”)
  • Two major issues:
  • Develop appropriate notion of similarity
  • Scale up to millions of documents

Prior Work
  • Offline: detecting replicas [Broder-Glassman-Manasse-Zweig’97] [Shivakumar-G. Molina’98]
  • Online: finding/grouping related pages [Zamir-Etzioni’98] [Manjara]
  • Link based methods [Dean-Henzinger’99, Clever]

Prior Work: Online, Link
  • Online: cluster resu
