How slow is the k-means method?

David Arthur, Sergei Vassilvitskii
Stanford University

The k-means Problem
  - Given an integer k and n data points in R^d
  - Partition points into k clusters
  - Choose k centers and partition points according to closest center
  - Try to minimize φ = ∑ ||x − c(x)||^2

Lloyd’s Algorithm (1982)
  - Simply called the “k-means method”
  - Choose k starting centers
      - Uniformly at random usually
  - Repeat until stable:
      - Assign each point to the closest center
      - Set each center to be the center of mass of the points assigned to it

Example [figure]

About k-means
  - It always terminates
      - Each step decreases φ
      - At most k^n configurations
  - It can stop with arbitrarily bad clusterings

About k-means
  - Widely used because it is fast
      - Usually far fewer than n iterations
  - How do you formalize this?
      - Just look at worst-case performance?

k-means (Worst case # iterations)
  - Counting number of configurations:
      - Already showed: O(k^n)
      - Inaba et al. (SOCG 94): O(n^(kd))
  - One dimension:
      - Dasgupta (COLT 03): Ω(n)
      - Har-Peled, Sadri (SODA 05): O(nΔ^2)
      - Δ = ratio of largest distance to smallest

Our Main Result
  - Worst case = 2^Ω(√n)
  - k-means is superpolynomial!

Proof: High Level
  - Start with configuration M with n points, which requires T iterations
  - Add O(1) clusters, O(k) points
  - These reset initial configuration M:
      - M stabilizes to M’
      - New clusters, points reset M’ to M
      - M now has to stabilize to M’ again
  - Now requires at least 2T iterations

Proof: High Level
  - Repeat reset construction m times:
      - O(m^2) points
      - O(m) clusters
      - 2^m iterations

Main Construction (Overview) [figure]
Main Construction (Zoomed in) [figure]
Main Construction (t=0) [figure]
Main Construction (t=0…T) [figure]
Main Construction (t=T+1): Reassigning points to clusters [figure]
Main Construction (t=T+1): Recomputing centers [figure]
Main Construction (t=T+2): Reassigning points to clusters [figure]
Main Const
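
The two-step loop of Lloyd’s algorithm described on the slides (assign each point to the closest center, then move each center to the center of mass of its points) can be sketched in a few lines of Python. This is a minimal illustration and not the authors’ code: the names `lloyd`, `potential`, `closest_center` and `squared_distance` are hypothetical, starting centers are assumed to be k distinct data points drawn uniformly at random, and an empty cluster is assumed to keep its previous center.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def closest_center(point, centers):
    """Index of the center nearest to point (ties go to the lowest index)."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def potential(points, centers):
    """The potential phi: sum over points of squared distance to the closest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def lloyd(points, k, seed=0, max_iterations=1000):
    """Run Lloyd's algorithm; return (centers, assignment, history of phi)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen uniformly at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    assignment = None
    history = [potential(points, centers)]
    for _ in range(max_iterations):
        # Step 1: assign each point to the closest center.
        new_assignment = [closest_center(p, centers) for p in points]
        if new_assignment == assignment:
            # Stable: no point changed cluster, so the centers will not move.
            return centers, assignment, history
        assignment = new_assignment
        # Step 2: set each center to the center of mass of its points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # Assumption: an empty cluster keeps its old center.
                dim = len(members[0])
                centers[j] = tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(dim)
                )
        history.append(potential(points, centers))
    return centers, assignment, history
```

Calling `lloyd(points, k)` on a list of coordinate tuples returns the final centers, the cluster index of each point, and the value of φ after each round. The recorded values never increase, which is the termination argument from the slides; the length of `history` is the iteration count whose worst case the talk bounds from below.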
