A Clustering Algorithm Based on Trust Values.doc

Posted 2016-03-12 from Anhui

A Clustering Algorithm Based on Trust Values

JIANG Qing-feng (蒋庆丰) 1, LI Zi (李梓) 2, LI Jian-li (李健利) 3

1. Department of Computer Science and Technology, Daqing Normal University, Daqing 163712, Heilongjiang, China
2. Department of Computer Science and Technology, Daqing Normal University, Daqing 163712, Heilongjiang, China
3. Department of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China
qingfeng_jiang@163.com

Abstract: To address the shortcomings of the K-Means algorithm, a clustering algorithm based on trust values, called TrustCluster, is proposed. The algorithm does not require the number of clusters to be specified in advance, its clustering results do not depend on the choice of initial values, and it can find non-spherical clusters and identify outliers effectively. TrustCluster was verified on real and artificial data and compared with the K-Means and Voting-K-Means algorithms; the results show that TrustCluster is feasible and effective.

Keywords: trust value; clustering; K-Means algorithm; Voting-K-Means algorithm

1 Introduction

Clustering is the process of partitioning data into classes so that data within the same class are highly similar while data in different classes are highly dissimilar. Clustering is an important technique in data analysis and has been widely applied in network intrusion detection, data mining, image processing, pattern recognition, and other fields. The many clustering algorithms proposed to date can be divided into partitioning methods, represented by the k-means algorithm (K-Means) [1]; hierarchical methods, represented by AGNES [2] and DIANA [2]; density-based methods, represented by DBSCAN [3] and OPTICS [4]; grid-based methods, represented by STING [5]; and model-based methods, represented by SOM [6].

K-Means is a classic clustering algorithm, with the advantages of being easy to describe, simple to implement, and fast. However, K-Means has the following drawbacks [7]: (1) the number of clusters K must be given in advance; (2) the algorithm depends heavily on the choice of initial values; (3) the algorithm easily falls into a local optimum; (4) it can only find spherical clusters; (5) it is very sensitive to outliers and noise points. To address these drawbacks, this paper proposes TrustCluster, a clustering algorithm based on trust values. TrustCluster does not require the number of clusters to be given in advance, produces stable clustering results that do not depend on the choice of initial values, can find non-spherical clusters, and can effectively identify outliers and noise points.

The Voting-K-Means algorithm [8], for repeated runs of the K-Means algorithm under different numbers of clusters,