What WBIA Cares About… - Institute of Network and Information Systems, Peking University.ppt


Directory-based classification is a basic method of organizing information: file organization (hard disks); libraries, subject systems, and book classification numbers; restricting results to a category when browsing or searching.

Most commonly used technique for predicting a specific outcome such as response / no-response, high / medium / low-value customer, likely to buy / not buy.

Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments and life sciences discovery.

Termination conditions: several possibilities, e.g., a fixed number of iterations; document partition unchanged; centroid positions don't change.

IR implementation techniques:

Ian H. Witten, Alistair Moffat, and Timothy C. Bell. 1999. Managing Gigabytes: Compressing and Indexing Documents and Images. San Francisco, CA: Morgan Kaufmann. [MG]
Pierre Baldi, Paolo Frasconi, and Padhraic Smyth. 2003. Modeling the Internet and the Web: Probabilistic Methods and Algorithms. Wiley. [MIW]
Soumen Chakrabarti. 2003. Mining the Web: Discovering Knowledge from Hypertext Data. Amsterdam: Morgan Kaufmann. [MW]
Ricardo Baeza-Yates and Berthier Ribeiro-Neto. 1999. Modern Information Retrieval. Addison-Wesley. [MIR]
Ian H. Witten and Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Elsevier. [DM]
Li Xiaoming, Yan Hongfei, and Wang Jimin. 2005. Search Engines: Principles, Technology, and Systems (搜索引擎原理、技术与系统). Beijing: Science Press. [SE]
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press. [IIR]

One Web, One Dream

Homework: How many web pages does a website have?
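
The termination conditions listed in the notes above (a fixed number of iterations, an unchanged document partition, centroids that no longer move) can be illustrated with a minimal Python sketch. It assumes a k-means-style centroid clustering loop, which the notes do not name explicitly; the function names `assign`, `update`, and `cluster`, the tolerance `tol`, and the sample points are all hypothetical and chosen only for illustration.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members.

    A centroid with no members keeps its previous position.
    """
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            mean = tuple(sum(coord) / len(members) for coord in zip(*members))
            new_centroids.append(mean)
        else:
            new_centroids.append(old)
    return new_centroids


def cluster(points, centroids, max_iters=100, tol=1e-9):
    """Iterate assign/update until one of three termination conditions holds.

    Returns (centroids, labels, reason), where reason names the condition
    that stopped the loop.
    """
    labels = None
    for _ in range(max_iters):
        new_labels = assign(points, centroids)
        # Condition: the partition of documents did not change.
        if new_labels == labels:
            return centroids, labels, "partition unchanged"
        labels = new_labels

        new_centroids = update(points, labels, centroids)
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        # Condition: centroid positions (effectively) did not move.
        if shift <= tol:
            return centroids, labels, "centroids stable"
    # Condition: the fixed iteration budget was used up.
    return centroids, labels, "iteration limit reached"


if __name__ == "__main__":
    docs = [(0, 0), (0, 1), (10, 10), (10, 11)]  # toy 2-D "document" vectors
    print(cluster(docs, [(0, 0), (10, 10)]))
```

In practice the three checks are often combined as shown, so the loop stops at whichever condition is met first; the iteration cap acts as a safeguard when the other two are slow to trigger.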
