CS229 Lecture Notes
Andrew Ng

Part V
Support Vector Machines
This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe are indeed the best) "off-the-shelf" supervised learning algorithms. To tell the SVM story, we'll need to first talk about margins and the idea of separating data with a large "gap." Next, we'll talk about the optimal margin classifier, which will lead us into a digression on Lagrange duality. We'll also see kernels, which give a way to apply SVMs efficiently in very high dimensional (such as infinite-dimensional) feature spaces, and finally, we'll close off the story with the SMO algorithm, which gives an efficient implementation of SVMs.
1 Margins: Intuition
We'll start our story on SVMs by talking about margins. This section will give the intuitions about margins and about the "confidence" of our predictions; these ideas will be made formal in Section 3.
Consider logistic regression, where the probability p(y = 1 | x; θ) is modeled by h_θ(x) = g(θᵀx). We would then predict "1" on an input x if and only if h_θ(x) ≥ 0.5, or equivalently, if and only if θᵀx ≥ 0 (since the sigmoid g satisfies g(z) ≥ 0.5 exactly when z ≥ 0). Consider a positive training example (y = 1). The larger θᵀx is, the larger also is h_θ(x) = p(y = 1 | x; θ), and thus also the higher our degree of "confidence" that the label is 1. Thus, informally we can think of our prediction as being a very confident one that y = 1 if θᵀx ≫ 0. Similarly, we think of logistic regression as making a very confident prediction of y = 0 if θᵀx ≪ 0. Given a training set, again informally it seems that we'd have found a good fit to the training data if we can find θ so that θᵀx⁽ⁱ⁾ ≫ 0 whenever y⁽ⁱ⁾ = 1, and θᵀx⁽ⁱ⁾ ≪ 0 whenever y⁽ⁱ⁾ = 0, since this would reflect a very confident (and correct) set of classifications for all the training examples.
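
To make this intuition concrete, here is a minimal sketch in Python with NumPy; the values of θ and x are made up purely for illustration. It shows how the magnitude of θᵀx translates into the confidence h_θ(x):

```python
import numpy as np

def sigmoid(z):
    """The logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and inputs, chosen only for illustration;
# x[0] = 1 plays the role of the intercept term.
theta = np.array([-1.0, 2.0, 0.5])
x_far = np.array([1.0, 4.0, 2.0])    # theta^T x = 8.0, far from the boundary
x_near = np.array([1.0, 0.6, 0.4])   # theta^T x = 0.4, close to the boundary

for x in (x_far, x_near):
    z = theta @ x                    # theta^T x
    h = sigmoid(z)                   # h_theta(x) = p(y = 1 | x; theta)
    prediction = 1 if z >= 0 else 0  # predict "1" iff theta^T x >= 0
    print(f"theta^T x = {z:+.2f} -> h_theta(x) = {h:.4f}, predict {prediction}")
```

Both inputs are classified as "1", but the point with θᵀx = 8.0 is classified far more confidently (h_θ(x) ≈ 0.9997) than the one with θᵀx = 0.4 (h_θ(x) ≈ 0.60), which sits close to the decision boundary θᵀx = 0.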