CS229 Lecture Notes
Andrew Ng

Part V
Support Vector Machines
This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe are indeed the best) "off-the-shelf" supervised learning algorithms. To tell the SVM story, we'll need to first talk about margins and the idea of separating data with a large "gap." Next, we'll talk about the optimal margin classifier, which will lead us into a digression on Lagrange duality. We'll also see kernels, which give a way to apply SVMs efficiently in very high dimensional (such as infinite-dimensional) feature spaces, and finally, we'll close off the story with the SMO algorithm, which gives an efficient implementation of SVMs.
1 Margins: Intuition
We'll start our story on SVMs by talking about margins. This section will give the intuitions about margins and about the "confidence" of our predictions; these ideas will be made formal in Section 3.
Consider logistic regression, where the probability p(y = 1 | x; θ) is modeled by h_θ(x) = g(θᵀx). We would then predict "1" on an input x if and only if h_θ(x) ≥ 0.5, or equivalently, if and only if θᵀx ≥ 0 (since the sigmoid g satisfies g(z) ≥ 0.5 exactly when z ≥ 0). Consider a positive training example (y = 1). The larger θᵀx is, the larger also is h_θ(x) = p(y = 1 | x; θ), and thus also the higher our degree of "confidence" that the label is 1. Thus, informally we can think of our prediction as being a very confident one that y = 1 if θᵀx ≫ 0. Similarly, we think of logistic regression as making a very confident prediction of y = 0 if θᵀx ≪ 0. Given a training set, again informally it seems that we'd have found a good fit to the training data if we can find θ so that θᵀx⁽ⁱ⁾ ≫ 0 whenever y⁽ⁱ⁾ = 1, and θᵀx⁽ⁱ⁾ ≪ 0 whenever y⁽ⁱ⁾ = 0, since this would reflect a very confident (and correct) set of classifications for all the training examples.
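
To make this intuition concrete, here is a minimal sketch in Python with NumPy; the values of θ and x are made up purely for illustration. It shows how the magnitude of θᵀx translates into the confidence h_θ(x):

```python
import numpy as np

def sigmoid(z):
    """The logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and inputs, chosen only for illustration;
# x[0] = 1 plays the role of the intercept term.
theta = np.array([-1.0, 2.0, 0.5])
x_far = np.array([1.0, 4.0, 2.0])    # theta^T x = 8.0, far from the boundary
x_near = np.array([1.0, 0.6, 0.4])   # theta^T x = 0.4, close to the boundary

for x in (x_far, x_near):
    z = theta @ x                    # theta^T x
    h = sigmoid(z)                   # h_theta(x) = p(y = 1 | x; theta)
    prediction = 1 if z >= 0 else 0  # predict "1" iff theta^T x >= 0
    print(f"theta^T x = {z:+.2f} -> h_theta(x) = {h:.4f}, predict {prediction}")
```

Both inputs are classified as "1", but the point with θᵀx = 8.0 is classified far more confidently (h_θ(x) ≈ 0.9997) than the one with θᵀx = 0.4 (h_θ(x) ≈ 0.60), which sits close to the decision boundary θᵀx = 0.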