How slow is the k-means method?
David Arthur, Sergei Vassilvitskii
Stanford University

The k-means Problem
- Given an integer k and n data points in R^d
- Partition points into k clusters
  - Choose k centers and partition points according to closest center
- Try to minimize φ = ∑ ||x - c(x)||²

Lloyd's Algorithm (1982)
- Simply called the "k-means method"
- Choose k starting centers
  - Usually uniformly at random
- Repeat until stable:
  - Assign each point to the closest center
  - Set each center to be the center of mass of the points assigned to it

Example

About k-means
- It always terminates
  - Each step decreases φ
  - At most k^n configurations
- It can stop with arbitrarily bad clusterings

About k-means
- Widely used because it is fast
  - Usually far fewer than n iterations
- How do you formalize this?
  - Just look at worst-case performance?

k-means (Worst case # iterations)
- Counting number of configurations:
  - Already showed: O(k^n)
  - Inaba et al. (SOCG 94): O(n^(kd))
- One dimension:
  - Dasgupta (COLT 03): Ω(n)
  - Har-Peled, Sadri (SODA 05): O(nΔ²)
    - Δ = ratio of largest distance to smallest

Our Main Result
- Worst case = 2^Ω(√n)
- k-means is superpolynomial!

Proof: High Level
- Start with configuration M with n points, which requires T iterations
- Add O(1) clusters, O(k) points
  - These reset initial configuration M
- M stabilizes to M'
- New clusters, points reset M' to M
- M now has to stabilize to M' again
- Now requires at least 2T iterations

Proof: High Level
- Repeat reset construction m times:
  - O(m²) points
  - O(m) clusters
  - 2^m iterations

Main Construction (figure slides)
- Overview
- Zoomed in
- t=0
- t=0…T
- t=T+1: Reassigning points to clusters
- t=T+1: Recomputing centers
- t=T+2: Reassigning points to clusters

Main Const
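The two steps of Lloyd's algorithm described in the slides (assign each point to its closest center, then move each center to the center of mass of its points) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `sq_dist`, `potential`, and `lloyd`, the `seed` and `max_iter` parameters, and the choice to leave an empty cluster's center where it is are all assumptions made for the example. The function also records φ after every iteration, so the "each step decreases φ" claim can be checked on a small input.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def potential(points, centers):
    """phi = sum over points of the squared distance to the closest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def lloyd(points, k, seed=0, max_iter=1000):
    """Run Lloyd's algorithm; return (centers, assignment, phi history)."""
    rng = random.Random(seed)
    # Starting centers: k distinct input points chosen uniformly at random.
    centers = rng.sample(points, k)
    assignment = None
    history = []
    for _ in range(max_iter):
        # Step 1: assign each point to its closest center
        # (ties go to the lowest center index).
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # stable: no point changed cluster
        assignment = new_assignment
        # Step 2: move each center to the center of mass of its points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                dim = len(members[0])
                centers[j] = tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(dim)
                )
        history.append(potential(points, centers))
    return centers, assignment, history
```

For example, `lloyd([(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)], k=2)` returns the final centers, the cluster index of each point, and a list of φ values that never increases from one iteration to the next. The number of loop iterations before the `break` is the quantity the worst-case bounds in the slides are about.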