Decision Tree C4.5

Abstract

Data mining (DM) is an active research field that draws on statistics, artificial intelligence, databases, and other disciplines. It extracts interesting, latent, and usable knowledge from data and presents it in a form that users can understand. Classification is an important branch of data mining: it finds a model that describes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.

The earliest decision tree algorithm is CLS, proposed by Hunt et al. in 1966. The most influential decision tree algorithms are ID3, proposed by Quinlan in 1986, and C4.5, which he proposed in 1993. ID3 can handle only discrete-valued attributes. It splits the training samples on the attribute with the greatest information gain, so that the entropy of the system is minimized after branching, which improves the speed and accuracy of the algorithm. The major drawback of ID3 is that using information gain as the criterion for selecting the branching attribute biases it toward attributes with many values, and in some cases such attributes provide little useful information. C4.5 is an improved version of ID3 that can handle continuous-valued attributes as well as discrete ones. It uses the information gain ratio as the criterion for selecting the branching attribute, which compensates for this shortcoming of ID3.

This paper studies classification techniques based on decision trees. It applies the C4.5 algorithm to a data set to classify the data and generate a decision tree: the data are first preprocessed, an induction algorithm is then used to generate readable rules and a decision tree, and the resulting tree is used to analyze new data.

Keywords: data mining; classification; decision tree; C4.5

Contents

Chapter 1 Introduction
1.1 Research Background and Significance
1.2 Research Status at Home and Abroad
1.2.1 Research Status Abroad
1.2.2 Research Status in China
Chapter 2
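The contrast the abstract draws between ID3's information gain and C4.5's gain ratio can be illustrated with a short sketch. The code below is not taken from the paper: the function names and the toy data (`labels`, `record_id`, `windy`) are hypothetical, and it covers only discrete attributes, not C4.5's handling of continuous attributes or its pruning. It assumes the usual definitions: information gain is the class entropy minus the weighted entropy of the subsets produced by a split, and gain ratio is that gain divided by the entropy of the attribute's own value distribution (the split information).

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def partition(values, labels):
    """Group class labels by the attribute value of each sample."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """ID3-style criterion: reduction in class entropy after the split."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g)
                    for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5-style criterion: information gain divided by split information."""
    split_info = entropy(values)  # entropy of the attribute values themselves
    if split_info == 0:           # attribute takes a single value: useless split
        return 0.0
    return information_gain(values, labels) / split_info


# Hypothetical toy data: 8 samples, 4 per class.
labels = ["yes"] * 4 + ["no"] * 4
record_id = [1, 2, 3, 4, 5, 6, 7, 8]                # unique per sample
windy = ["f", "f", "f", "f", "f", "t", "t", "t"]    # two values

for name, attr in [("record_id", record_id), ("windy", windy)]:
    print(name,
          round(information_gain(attr, labels), 3),
          round(gain_ratio(attr, labels), 3))
```

On this toy data, information gain ranks `record_id` first (about 1.0 against 0.55 for `windy`) even though a unique identifier cannot generalize to new samples, while gain ratio reverses the ranking (about 0.33 against 0.57), because the split information of an eight-valued attribute is large.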
