Tianwang (天网) Search Platform PARADISE.ppt

* Quantifying query-snippet relevance
  3 points: the query and the snippet are highly relevant; after reading the snippet the user decides to open the link, or has already found the sought content in the snippet itself
  2 points: the query and the snippet are relevant to some degree; after reading the snippet the user is inclined to open the link
  1 point: the query and the snippet are not very relevant; the user is not inclined to open the link
  0 points: the query and the snippet are not relevant; the user will not open the link
  Queries: 30
  Linked documents: 150
  Valid links: 149
  Experimental results and data are stored in: /src/paradise/reference/snippet/snippet_exp/
  This work's algorithm, result of formula (2): 0.973
  Baidu snippets, result of formula (2): 1.033
  Slightly better than Baidu, by 6%
* QA! On applications: 1) practical use reveals gaps in the theory; 2) solutions; 3) progress
* T. Meng and H. Yan, On the Peninsula Phenomenon in Web Graph and Its Implications on Web Search, Computer Networks, vol. 51(1), pp. 177-189, 2007.
  Jonathan J.H. Zhu, Tao Meng, Zhengmao Xie, Geng Li, Xiaoming Li, A Teapot Graph and Its Hierarchical Structure of the Chinese Web, Proceedings of the World Wide Web Conference 2008 (WWW08) (poster paper).
  Nan Di, Conglei Yao, Mengcheng Duan, Jonathan J.H. Zhu and Xiaoming Li, Representing a Web Page as Sets of Named Entities of Multiple Types -- A Model and Some Preliminary Applications, Proceedings of the World Wide Web Conference 2008 (WWW08) (poster paper).
  Conglei Yao, Yongjian Yu, Sicong Shou, Xiaoming Li, Towards a global schema for web entities, in Proceeding of the 17th international conference on World Wide Web, 2008, ACM: Beijing, China.
  Lianen Huang, Jonathan J.H. Zhu, and Xiaoming Li, "Building a Story Tracer out of a Web Archive," JCDL 2008 (poster), Pittsburgh, PA, June 16-20, 2008.
  Lianen Huang, Jonathan J.H. Zhu, Xiaoming Li, HisTrace: Building a Search Engine of Historical Events, Proceedings of the World Wide Web Conference 2008 (WWW08) (poster paper).
  Lianen Huang, Lei Wang, Xiaoming Li, Achieving both High Precision and High Recall in Near-duplicate Detection, Proceedings of the 17th ACM International Conference on Information and Knowledge Management (CIKM08) (full paper).
* Origin of the Tianwang search platform
  Design and implementation of PARADISE
  Research problems of interest
  Shakespeare's collected works definitely aren't large enough for demonstrating many of the points in this course
  Table 4.1 Typical system parameters in 2007. The seek time is the time needed to position the disk head in a new position. The transfer time per byte is the rate of transfer from disk to
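The 6% figure in the snippet evaluation can be checked from the two formula (2) results quoted there, 0.973 and 1.033. The excerpt does not show formula (2) itself, nor does it say which of the two values the 6% is measured against, so the sketch below makes no assumption on that point and simply computes the gap relative to each value. The function name `relative_gap` is hypothetical and used only for illustration.

```python
def relative_gap(ours, baseline):
    """Return the absolute gap between two scores, relative to each score.

    The first element is the gap as a fraction of `baseline`,
    the second is the gap as a fraction of `ours`.
    """
    gap = abs(baseline - ours)
    return gap / baseline, gap / ours


# Values quoted in the slide notes; formula (2) itself is not shown there.
OURS = 0.973   # this work's algorithm
BAIDU = 1.033  # Baidu snippets

vs_baidu, vs_ours = relative_gap(OURS, BAIDU)
print(f"gap relative to the Baidu value:     {vs_baidu:.1%}")
print(f"gap relative to this work's value:   {vs_ours:.1%}")
```

Either way the gap rounds to 6%, which matches the figure quoted in the notes.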
