Decision Tree C4.5 Thesis Abstract

Published 2017-07-05 in Hubei
Abstract

Data mining (DM) is an active research field that draws on statistics, artificial intelligence, databases, and related disciplines. It extracts interesting, latent, and usable knowledge from data and presents it in a form users can understand. Classification is an important branch of data mining: it finds a model that describes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.

The earliest decision tree algorithm is CLS, proposed by Hunt et al. in 1966. The most influential decision tree algorithms are ID3, proposed by Quinlan in 1986, and its 1993 successor C4.5. ID3 handles only discrete-valued attributes. At each node it splits the training samples on the attribute with the greatest information gain, with the aim of minimizing the entropy of the system after branching, thereby improving the speed and accuracy of the algorithm. The main drawback of ID3 is that information gain, as the criterion for choosing the branching attribute, is biased toward attributes with many values, and in some cases such attributes provide little useful information. C4.5 is an improvement on ID3 that handles continuous-valued attributes as well as discrete ones. It uses the information gain ratio as the criterion for choosing the branching attribute, which compensates for this weakness of ID3.

This thesis studies classification techniques based on decision trees. It classifies a data set and generates a decision tree with the C4.5 algorithm: the data are first preprocessed, an induction algorithm is then used to generate readable rules and a decision tree, and the resulting tree is finally used to analyze new data.

Keywords: data mining; classification; decision tree; C4.5

Contents
Chapter 1 Introduction
1.1 Research background and significance
1.2 Research status in China and abroad
1.2.1 Research status abroad
1.2.2 Research status in China
Chapter 2
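
The contrast the abstract draws between ID3's information gain and C4.5's information gain ratio can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the toy `rows`/`labels` data, and the attribute names `id` and `windy` are all assumptions made for illustration, and the sketch covers only discrete attributes (no continuous splits, missing values, or pruning).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def partition(rows, labels, attr):
    """Group the class labels by the value each row takes for `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups


def information_gain(rows, labels, attr):
    """ID3 criterion: entropy before the split minus weighted entropy after."""
    total = len(labels)
    groups = partition(rows, labels, attr).values()
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


def gain_ratio(rows, labels, attr):
    """C4.5 criterion: information gain divided by split information."""
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:  # attribute has a single value, so it cannot split
        return 0.0
    return information_gain(rows, labels, attr) / split_info


# Hypothetical toy data: `id` is unique per row, `windy` has two values.
rows = [{"id": i, "windy": "no" if i < 4 else "yes"} for i in range(8)]
labels = ["play", "play", "play", "play", "stay", "stay", "stay", "play"]

for attr in ("id", "windy"):
    print(attr, information_gain(rows, labels, attr), gain_ratio(rows, labels, attr))
```

On this toy data, information gain ranks the uninformative many-valued `id` attribute above `windy`, while the gain ratio reverses the ranking, which is the bias and the correction the abstract describes.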
