

Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning

Xiaojin Zhu ZHUXJ@CS.CMU.EDU
John Lafferty LAFFERTY@CS.CMU.EDU
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213 USA

Abstract

Graph-based methods for semi-supervised learning have recently been shown to be promising for combining labeled and unlabeled data in classification problems. However, inference for graph-based methods often does not scale well to very large data sets, since it requires inversion of a large matrix or solution of a large linear program. Moreover, such approaches are inherently transductive, giving predictions for only those points in the unlabeled set, and not for an arbitrary test point. In this paper a new approach is presented that preserves the strengths of graph-based semi-supervised learning while overcoming the limitations of scalability and non-inductive inference, through a combination of generative mixture models and discriminative regularization using the graph Laplacian. Experimental results show that this approach preserves the accuracy of purely graph-based transductive methods when the data has "manifold structure," and at the same time achieves inductive learning with significantly reduced computational cost.

1. Introduction

The availability of large data collections, with only limited human annotation, has turned the attention of a growing community of machine learning researchers to the problem of semi-supervised learning. The broad research agenda of semi-supervised learning is to develop methods that can leverage a large amount of unlabeled data to build more accurate classification algorithms than can be achieved using purely supervised learning. An attractive new family of semi-supervised methods is based on the use of a graphical representation of the unlabeled data; examples of this

Appearing in Proceedings of the 22nd International Conference on Machine