DSCI 5240 Graduate Presentation
Xxxxxx
Research paper: Web Mining Research: A Survey
SIGKDD Explorations, June 2000, Volume 2, Issue 1
Authors: R. Kosala and H. Blockeel
Outline
Introduction
Web Mining
Web Content Mining
Web Structure Mining
Web Usage Mining
Conclusion
Introduction
The World Wide Web is a popular and interactive medium for disseminating information.
Information users may encounter four problems:
1. Finding relevant information
a. Low precision b. Low recall
2. Creating new knowledge out of the information available on the web
A data-triggered process
3. Personalizing the information
People differ in the content and presentation they prefer
4. Learning about consumers or individual users
Mass customizing the information, or even personalizing it
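The low-precision and low-recall problems listed in the Introduction can be made concrete with a small calculation. The following is a minimal sketch, not taken from the paper: the function name `precision_recall` and the page ids are hypothetical, and it uses the standard set-based definitions (precision is the share of retrieved pages that are relevant, recall is the share of relevant pages that were retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved and relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 pages returned, 2 of them relevant, 5 relevant pages exist.
retrieved = ["p1", "p2", "p3", "p4"]
relevant = ["p2", "p4", "p5", "p6", "p7"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.5 0.4
```

Here half of the returned pages are irrelevant (low precision) and three of the five relevant pages are never found (low recall).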
Web Mining
Definition: Web mining refers to the overall process of discovering potentially useful and previously unknown information or knowledge from web data.
Four subtasks:
Resource finding: retrieving the intended web documents
Information selection and pre-processing: selecting and pre-processing specific information from the retrieved documents
Generalization: discovering general patterns
Analysis: validation and/or interpretation of the mined patterns
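The four subtasks can be read as stages of a pipeline. The sketch below is only an illustration under strong assumptions: the "web" is a hypothetical in-memory dictionary `TOY_WEB` instead of real crawling, the "pattern" is simply a term that occurs in several documents, and all function names are invented for this example.

```python
import re
from collections import Counter

# Hypothetical in-memory "web": URL -> raw HTML (a stand-in for real crawling).
TOY_WEB = {
    "http://example.org/a": "<html><body><h1>Web mining</h1><p>Mining web usage logs.</p></body></html>",
    "http://example.org/b": "<html><body><p>Cooking pasta at home.</p></body></html>",
    "http://example.org/c": "<html><body><p>Web structure mining uses links.</p></body></html>",
}


def find_resources(web, keyword):
    """Resource finding: retrieve the documents that mention the keyword."""
    return {url: html for url, html in web.items() if keyword.lower() in html.lower()}


def preprocess(html):
    """Information selection and pre-processing: strip tags, lowercase, tokenize."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"[a-z]+", text.lower())


def generalize(token_lists):
    """Generalization: count in how many documents each term occurs."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(set(tokens))
    return counts


def analyze(counts, min_docs):
    """Analysis: keep only the patterns supported by at least min_docs documents."""
    return sorted(term for term, n in counts.items() if n >= min_docs)


docs = find_resources(TOY_WEB, "mining")
tokens = [preprocess(html) for html in docs.values()]
patterns = analyze(generalize(tokens), min_docs=2)
print(patterns)  # ['mining', 'web']
```

Each stage consumes only the output of the previous one, which mirrors the order in which the subtasks are listed.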
Web Mining
Web Mining and Information Retrieval
Definition: IR is the automatic retrieval of all relevant documents while retrieving as few non-relevant documents as possible.
Goal: indexing and searching for useful documents
Web Mining and Information Extraction
Goal: IE aims to transform a collection of documents into information that is more readily digested and analyzed.
Compare IR and IE
a. Aims
b. Fields
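The difference in aims between IR and IE can be sketched on a toy collection: IR selects and ranks whole documents for a query, while IE turns the text of a document into structured fields. This is a minimal illustration, not the method of the paper. The documents, the `retrieve` and `extract` helpers, and the regular expression are all hypothetical, and real systems use far more robust ranking and extraction techniques.

```python
import re

# Hypothetical document collection.
DOCS = {
    "d1": "Seminar on web mining by A. Smith on 2000-06-01 in Room 12.",
    "d2": "Recipe: tomato soup with basil.",
    "d3": "Talk about data mining by B. Jones on 2000-07-15 in Room 3.",
}


def retrieve(docs, query):
    """IR: return ids of documents sharing terms with the query, best match first."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    scores = {}
    for doc_id, text in docs.items():
        overlap = len(query_terms & set(re.findall(r"\w+", text.lower())))
        if overlap:
            scores[doc_id] = overlap
    return sorted(scores, key=lambda d: (-scores[d], d))


# IE: a hand-written pattern for one assumed announcement format.
ANNOUNCEMENT = re.compile(
    r"by (?P<speaker>[A-Z]\. \w+) on (?P<date>\d{4}-\d{2}-\d{2}) in (?P<room>Room \d+)"
)


def extract(text):
    """IE: turn free text into a structured record, or None if nothing matches."""
    match = ANNOUNCEMENT.search(text)
    return match.groupdict() if match else None


hits = retrieve(DOCS, "web mining")
records = [extract(DOCS[doc_id]) for doc_id in hits]
print(hits)        # ['d1', 'd3']
print(records[0])  # {'speaker': 'A. Smith', 'date': '2000-06-01', 'room': 'Room 12'}
```

The IR step returns documents for a person to read. The IE step returns records that can be loaded into a table and analyzed further.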
Web Mining
Web Mining and the Agent Paradigm
Web mining is often viewed from, or implemented within, an agent paradigm.
User interface agents
Distributed agents
Mobile agents
Two approaches used to develop intelligent agents:
Content-based approach
Collaborative approach
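The two approaches can be contrasted with a small recommendation sketch. It assumes the common reading of the terms: a content-based agent matches item content against what the user already liked, and a collaborative agent relies on users with similar interests. The data, the Jaccard similarity measure, and all names below are hypothetical choices made for illustration, not details from the paper.

```python
# Hypothetical data: keywords per page, and the pages each user liked.
ITEM_TERMS = {
    "page_a": {"web", "mining", "survey"},
    "page_b": {"web", "usage", "logs"},
    "page_c": {"cooking", "pasta"},
    "page_d": {"mining", "structure", "links"},
}
LIKES = {
    "alice": {"page_a"},
    "bob": {"page_a", "page_d"},
    "carol": {"page_c"},
}


def jaccard(a, b):
    """Set similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_based(user, likes, item_terms):
    """Recommend unseen items whose keywords overlap the user's liked items."""
    profile = set().union(*(item_terms[i] for i in likes[user]))
    scored = [
        (jaccard(profile, terms), item)
        for item, terms in item_terms.items()
        if item not in likes[user]
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [item for score, item in ranked if score > 0]


def collaborative(user, likes):
    """Recommend items liked by users whose liked sets resemble this user's."""
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        similarity = jaccard(likes[user], items)
        if similarity == 0:
            continue
        for item in items - likes[user]:
            scores[item] = scores.get(item, 0.0) + similarity
    return sorted(scores, key=lambda i: (-scores[i], i))


print(content_based("alice", LIKES, ITEM_TERMS))  # ['page_b', 'page_d']
print(collaborative("alice", LIKES))              # ['page_d']
```

The content-based variant never looks at other users, and the collaborative variant never looks at page content, which is the essential difference between the two approaches.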
Definition: discover