- About 15,900 characters
- About 52 pages
- Published 2018-02-26 in Jiangsu
[Computer Science] CHAP9_ADVANCED_CLUSTER_ANALYSIS
(C) Vipin Kumar, Parallel Issues in Data Mining, VECPAR 2002

Data Mining Cluster Analysis: Advanced Concepts and Algorithms

Hierarchical Clustering: Revisited
- Creates nested clusters
- Agglomerative clustering algorithms vary in how the proximity of two clusters is computed
- MIN (single link): susceptible to noise/outliers
- MAX/GROUP AVERAGE: may not work well with non-globular clusters
- CURE algorithm tries to handle both problems
- Often starts with a proximity matrix
- A type of graph-based algorithm

CURE: Another Hierarchical Approach
- Uses a number of points to represent a cluster
- Representative points are found by selecting a constant number of points from a cluster and then "shrinking" them toward the center of the cluster
- Cluster similarity is the similarity of the closest pair of representative points from different clusters

CURE
- Shrinking representative points toward the center helps avoid problems with noise and outliers
- CURE is better able to handle clusters of arbitrary shapes and sizes

Experimental Results: CURE [two figure slides; figures not preserved]

CURE Cannot Handle Differing Densities [figure slide; figure not preserved]

Graph-Based Clustering
- Graph-based clustering uses the proximity graph
- Start with the proximity matrix
- Consider each point as a node in a graph
- Each edge between two nodes has a weight, which is the proximity between the two points
- Initially the proximity graph is fully connected
- MIN (single-link) and MAX (complete-link) can be viewed as starting with this graph
- In the simplest case, clusters are connected components in the graph

Graph-Based Clustering: Sparsification
- The amount of data that needs to be processed is drastically reduced
- Sparsification can eliminate more than 99% of the entries in a proximity matrix
- The amount of time required to cluster the data is drastically reduced
- The size of the problems that can be handled is increased

Graph-Based Clustering: Sparsification … Clustering
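
The graph-based view above can be illustrated with a small Python sketch. The excerpt does not say how the proximity graph is sparsified, so this example assumes one common choice: keep only the edges from each point to its k nearest neighbors, then read clusters off as connected components (the "simplest case" mentioned above). The function names `knn_sparsify` and `connected_components`, the Euclidean distance, and the toy points are illustrative assumptions, not taken from the slides.

```python
import math
from collections import deque


def knn_sparsify(points, k):
    """Assumed sparsification rule: keep each point's k nearest-neighbor edges.

    Returns an undirected weighted graph as {node: {neighbor: distance}}.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # One row of the (implicit) proximity matrix: distances from point i.
        row = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in row[:k]:
            graph[i][j] = d
            graph[j][i] = d  # store the edge in both directions
    return graph


def connected_components(graph):
    """Simplest graph-based clustering: each connected component is a cluster."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:  # breadth-first search from the unvisited start node
            u = queue.popleft()
            component.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = knn_sparsify(points, k=2)
clusters = connected_components(graph)
edges_kept = sum(len(neighbors) for neighbors in graph.values()) // 2
print(clusters, edges_kept)
```

On this toy input the fully connected graph would have 15 edges; with k = 2 only the edges inside each group survive, so the two groups fall out as separate components. On real data the result depends heavily on k, and a similarity threshold would be another plausible sparsification rule.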