Web Key Resource Page Judgment Based on Improved Decision Tree Algorithm - Tsinghua University Intelligent Technology and.pdf


1000-9825/2005/16(11)1958 ©2005 Journal of Software Vol.16, No.11

Web Key Resource Page Judgment Based on Improved Decision Tree Algorithm

LIU Yi-Qun+, ZHANG Min, MA Shao-Ping
(State Key Laboratory of Intelligent Technology and Systems (Tsinghua University), Beijing 100084, China)
+ Corresponding author: Phn: +86-10, E-mail: liuyiqun03@

Received 2004-07-26; Accepted 2005-06-02

Liu YQ, Zhang M, Ma SP. Web key resource page judgment based on improved decision tree algorithm. Journal of Software, 2005,16(11):1958-1966. DOI: 10.1360/jos161958

Abstract: Key resource pages are among the most important search targets for Web search users. Decision tree learning is one of the most widely used and practical methods for inductive inference in machine learning. Because Web pages are difficult to sample uniformly, there are not enough negative instances to train a key resource decision tree. To solve this problem, the original algorithm is partly modified to learn from global information instead of individual instance information. Using the same evaluation method as TREC (Text Retrieval Conference) 2003, large-scale retrieval experiments based on the improved decision tree algorithm achieve more than 40% improvement over those based on the original algorithm. The result not only offers an effective way to select Web key resource pages, but also shows a possible way to improve the performance of decision tree learning.

Key words: Web information retrieval; key resource page; machine learning; decision tree
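For readers unfamiliar with decision tree learning, the sketch below shows the textbook information-gain criterion that ID3-style learners use to choose a split. It is only an illustration of the unmodified, instance-based approach, not the improved algorithm of the paper, whose details the abstract does not give. The feature names (`many_in_links`, `short_url`), the labels, and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the instances on one feature."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy instances: two key resource pages and two other pages.
rows = [
    {"many_in_links": True, "short_url": True},
    {"many_in_links": True, "short_url": False},
    {"many_in_links": False, "short_url": True},
    {"many_in_links": False, "short_url": False},
]
labels = ["key", "key", "other", "other"]

gain_links = information_gain(rows, labels, "many_in_links")
gain_url = information_gain(rows, labels, "short_url")
```

In this toy set, `many_in_links` separates the two classes perfectly while `short_url` carries no information, so a learner would split on the former. The criterion depends on counting both positive and negative instances, which is why a shortage of negative instances, as the abstract describes, makes training difficult.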
