Research on a Dynamic Clustering Method Combining KD-Tree and k-means - Application Research of Computers.doc

Published 2017-04-18, Tianjin


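Application Research of Computers (计算机应用研究)

Funding: Science and Technology Research Project of the Heilongjiang Provincial Department of Education (No.
About the authors: Wan Jing (b. 1972), female, professor, master's supervisor, Ph.D.; main research interests: database theory and applications. Zhang Yi (b. 1989), male, master's student; main research interest: spatial data mining. He Yunbin (b. 1972), male, professor, graduate supervisor; main research interests: database theory and applications, spatio-temporal databases, embedded systems. Li Song (b. 1977), male, associate professor, Ph.D.; main research interests: spatial database theory and applications (837734463@).

Research on a Dynamic Clustering Method Based on KD-Tree and K-means

Wan Jing, Zhang Yi, He Yunbin, Li Song
(School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080)

Abstract: The traditional K-means clustering algorithm is sensitive to the initial center points and easily becomes trapped in local optima. To address this, the paper first proposes a KD-tree-based method for selecting the initial cluster centers. The method builds a KD-tree to partition the dataset into rectangular units, computes the center and the density of each rectangular unit, sorts the units by density in descending order, and selects the centers of the first k units as the initial cluster centers, which effectively overcomes the traditional algorithm's sensitivity to the initial centers. In addition, because the traditional K-means clustering algorithm cannot effectively handle the clustering of dynamic data, the paper further proposes the KDTK-means clustering algorithm. The algorithm builds a new KD-tree from the k cluster centers selected by the KD-tree-based optimization together with the incremental data, and uses a nearest-neighbor search strategy to assign the incremental data to the corresponding clusters, completing the clustering. Experimental results show that, compared with the traditional K-means clustering algorithm, the proposed KD-tree-based selection method effectively chooses representative initial centers, and the proposed KDTK-means clustering algorithm handles incremental data clustering quickly and efficiently.

Keywords: K-means clustering; KD-tree; incremental clustering; initial cluster centers
CLC number: TP1

Dynamic clustering algorithm based on KD-tree and K-means method

Wan Jing, Zhang Yi, He Yunbin, Li Song
(School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China)

Abstract: The traditional K-means algorithm is sensitive to the initial centers and easily becomes trapped in local optima. To overcome these disadvantages, this paper proposes a new method based on the KD-tree. The new method first divides the data into a series of rectangular units using a KD-tree, sorts the rectangular units by density, and then chooses the k data objects with the highest density as the initial clustering centers. The experimental results show that the proposed method depends only weakly on the initial data and yields better clustering quality. Meanwhile, since the traditional K-means algorithm cannot effectively handle dynamic clustering, an improved algorithm called KDTK-means is proposed. The KDTK-means algorithm builds a new KD-tree from the incremental data and the optimized k initial centers, and then assigns each incremental data object to the corresponding cluster using a nearest-neighbor search strategy. The experiment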
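
The two steps described in the abstract (density-ranked selection of initial centers from KD-tree cells, then nearest-neighbor assignment of incremental points) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' KDTK-means implementation: all function names are hypothetical, density is assumed to be the point count divided by the cell volume, a cell's center is assumed to be the mean of its points, and for brevity the search tree is built over the k centers only, with the incremental points used as queries.

```python
import math

def split_cells(points, lo, hi, leaf_size, depth=0):
    """Median-split the points, cycling through axes; return leaf cells
    as (points, lower_corner, upper_corner)."""
    if len(points) <= leaf_size:
        return [(points, lo, hi)]
    axis = depth % len(lo)
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    cut = pts[mid][axis]
    left_hi = hi[:axis] + (cut,) + hi[axis + 1:]
    right_lo = lo[:axis] + (cut,) + lo[axis + 1:]
    return (split_cells(pts[:mid], lo, left_hi, leaf_size, depth + 1)
            + split_cells(pts[mid:], right_lo, hi, leaf_size, depth + 1))

def initial_centers(points, k, leaf_size=4):
    """Return the centers of the k densest cells (fewer if there are fewer cells).
    Assumption: density = count / volume, center = mean of the cell's points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    scored = []
    for cell, c_lo, c_hi in split_cells(points, lo, hi, leaf_size):
        # Guard against zero-width cells so the division is always defined.
        volume = math.prod(max(h - l, 1e-9) for l, h in zip(c_lo, c_hi))
        center = tuple(sum(p[d] for p in cell) / len(cell) for d in dims)
        scored.append((len(cell) / volume, center))
    scored.sort(reverse=True)  # densest cell first
    return [center for _, center in scored[:k]]

def build_kdtree(items, depth=0):
    """Build a KD-tree over (point, label) pairs as nested tuples."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Return (squared_distance, label) of the stored point closest to query."""
    if node is None:
        return best
    (point, label), axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(point, query))
    if best is None or d2 < best[0]:
        best = (d2, label)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    if diff * diff < best[0]:  # the far side can still hold a closer point
        best = nearest(far, query, best)
    return best

def assign_increment(centers, new_points):
    """Assign each incremental point the index of its nearest center."""
    tree = build_kdtree([(c, i) for i, c in enumerate(centers)])
    return [nearest(tree, p)[1] for p in new_points]
```

In a full pipeline, the centers returned by `initial_centers` would seed an ordinary K-means run, and `assign_increment` would then place newly arriving points without re-clustering the whole dataset. Note that picking the k densest cells does not by itself keep two chosen centers from falling in the same natural cluster; the abstract does not say how the paper handles that case.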